Open lagbolt opened 2 years ago
Technical info for implementers/others:
The following is a description of the format of the files mentioned above:
Each line is terminated by a single LF character ("\n") and preceded by a JSON string composed only of non-LF characters. The LF is the line delimiter, and the file ends after the last line, so the file ends with a delimiter character.
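To make the description concrete, here is a minimal Python sketch of a parser for text in this shape. The function name `parse_json_lines` is hypothetical, and rejecting input that lacks the final LF is my own assumption about how strict a reader should be, not something the files or any spec require.

```python
import json


def parse_json_lines(text):
    """Parse LF-delimited JSON text into a list of Python values."""
    # Assumption: a well-formed file ends with the LF delimiter.
    if text and not text.endswith("\n"):
        raise ValueError("expected the text to end with an LF delimiter")
    # Split on LF only. str.splitlines() would also split on other
    # line-boundary characters (such as U+2028), which may legally
    # appear unescaped inside a JSON string.
    lines = text.split("\n")
    # The final LF leaves one empty trailing element; drop it.
    lines.pop()
    return [json.loads(line) for line in lines]
```

For example, `parse_json_lines('{"a": 1}\n{"b": 2}\n')` yields a list with two dictionaries, one per line.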
Because there are quite a few files at the URL shared in the original message, I'll include a copy of one (of the "smaller more manageable files") here for reference:
from: https://id.loc.gov/download/vocabulary/iso639-1.madsrdf.jsonld.gz
retrieved at: 2022-11-29T08:46:46.853Z
size: ~30.5 KB
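For anyone who wants to inspect such a download without unpacking it first, a rough Python sketch of streaming the records from the gzipped file follows. The name `iter_records` and the local path passed to it are hypothetical, and I am assuming the decompressed content is UTF-8 text.

```python
import gzip
import json


def iter_records(path):
    """Yield one parsed JSON value per line of a gzipped, LF-delimited file."""
    # newline="\n" makes the text layer treat only LF as a line ending.
    with gzip.open(path, "rt", encoding="utf-8", newline="\n") as fh:
        for line in fh:
            line = line.rstrip("\n")
            # Skip empty lines defensively (an assumption, not observed).
            if line:
                yield json.loads(line)
```

Because this is a generator, the larger files can be processed one record at a time, for example with `for record in iter_records("iso639-1.madsrdf.jsonld.gz"): ...`, assuming the file has been saved under that name in the working directory.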
1 JSON Lines is synonymous with the NDJSON spec (and perhaps other names). JSON Lines and NDJSON might even be merging in the future — see these issues:
The Library of Congress publishes a LOT of data (e.g., names and subject headings) in a format of one JSON object per line.
It would be great if you could handle this format.
You can download sample files from https://id.loc.gov/download/ -- the smaller more manageable files are lower down the page.