Open pkiraly opened 1 year ago
I'd prefer to transform the UNIMARC schema to Avram and use Avram for both PICA and UNIMARC. I'll start with a transformation.
@nichtich Please do not start that yet. 1) I would like to ask a student to do this as part of a thesis. 2) I am in discussion with a UNIMARC expert, because it seems that this machine-readable version contains information only about subfields, not about fields and indicators, so the process also requires manual work, i.e. reading UNIMARC's PDF documentation. At the moment all UNIMARC-related tickets are in a preparation state, not yet ready for coding work.
Ok, I also found out that the machine-readable documentation is incomplete. Here is a jq script to extract Avram-compatible records, but post-processing is still required to merge fields, indicator codes and subfield schedules. The student can reuse, compare or ignore this piece of code.
for n in 0XX 1XX 2XX 3XX 41X 42X 43X 44X 45X 46X 47X 48X 5XX 60X 61X 62X 66X 67X 7XX 801 702 830 850 856 886; do
  curl -s "http://iflastandards.info/ns/unimarc/unimarcb/elements/$n.jsonld" | jq -f jsonld2avram.jq
done
# jsonld2avram.jq
def remove_nulls: del(..|nulls);
def parse_id: .["@id"]|split("/")[-1];

.["@graph"]
| map(
    select(.status.label=="Published") | # only published elements
    select(parse_id|.!="")               # omit element sets
  )
| .[]
| parse_id as $id
| $id[1:4] as $tag   # field
| $id[4:5] as $ind1  # indicator1
| $id[5:6] as $ind2  # indicator2
| $id[6:7] as $code  # subfield code
| {
    $tag,
    indicator1: (if $ind1!="_" then { codes: {($ind1):""} } else null end),
    indicator2: (if $ind2!="_" then { codes: {($ind2):""} } else null end),
    subfields: {
      ($code): ({
        $code,
        label: .label.en,
        description: .description.en[0],
        url: .url
        # .note is ignored as it has no counterpart in Avram
      } | remove_nulls)
    }
  } | remove_nulls
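The post-processing mentioned above could look roughly like the following. This is only a sketch in Python, not part of the jq script: `read_records` and `merge_records` are hypothetical helper names, and the top-level `fields` key is my assumption about how the merged result should be laid out for Avram. It takes the stream of one-subfield records that jq prints and folds them into one entry per tag, collecting indicator codes and subfields along the way.

```python
import json


def read_records(text):
    """Parse a stream of concatenated JSON objects, as printed by jq."""
    decoder = json.JSONDecoder()
    text = text.strip()
    records, pos = [], 0
    while pos < len(text):
        record, pos = decoder.raw_decode(text, pos)
        records.append(record)
        # raw_decode does not skip whitespace between documents
        while pos < len(text) and text[pos].isspace():
            pos += 1
    return records


def merge_records(records):
    """Merge one-subfield records into one field definition per tag."""
    fields = {}
    for rec in records:
        field = fields.setdefault(rec["tag"], {"tag": rec["tag"]})
        for key in ("indicator1", "indicator2"):
            codes = (rec.get(key) or {}).get("codes", {})
            if codes:
                field.setdefault(key, {"codes": {}})["codes"].update(codes)
        # a later record with the same subfield code overwrites an earlier one
        field.setdefault("subfields", {}).update(rec.get("subfields", {}))
    # assumption: the merged result is wrapped in a top-level "fields" object
    return {"fields": fields}
```

Saving the loop's output to a file and passing its content through `merge_records(read_records(text))` would give one object per tag. Field labels, repeatability and indicator labels are still missing at that point and would have to come from the PDF documentation.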
Great, thanks a lot!
The component reads the machine-readable UNIMARC schema and creates a schema object, similar to the schema reader that reads the Avram representation of PICA.
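To illustrate what such a reader might do, here is a minimal sketch in Python. It is not the project's actual implementation: the class names `Schema`, `FieldDefinition` and `SubfieldDefinition` and the function `read_schema` are hypothetical, and the input is assumed to be an Avram-style JSON document with a top-level `fields` object.

```python
import json
from dataclasses import dataclass, field


@dataclass
class SubfieldDefinition:
    code: str
    label: str = ""


@dataclass
class FieldDefinition:
    tag: str
    label: str = ""
    subfields: dict = field(default_factory=dict)


@dataclass
class Schema:
    fields: dict = field(default_factory=dict)


def read_schema(text):
    """Build a Schema object from an Avram-style JSON string."""
    raw = json.loads(text)
    schema = Schema()
    for key, spec in raw.get("fields", {}).items():
        definition = FieldDefinition(
            tag=spec.get("tag", key),
            label=spec.get("label", ""),
        )
        for code, sub in spec.get("subfields", {}).items():
            definition.subfields[code] = SubfieldDefinition(
                code=code, label=sub.get("label", "")
            )
        schema.fields[key] = definition
    return schema
```

A caller could then look up a definition with, for example, `read_schema(text).fields["200"].subfields["a"].label`.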
Parent: #305