pkiraly / qa-catalogue

QA catalogue – a metadata quality assessment tool for library catalogue records (MARC, PICA)
GNU General Public License v3.0

PICA: Extend classification/subject headings schemes from config file #190

Open nichtich opened 1 year ago

nichtich commented 1 year ago

The list of classification/subject headings schemes seems to be hard-coded in several places:

As discussed in #154 there are more of these fields.

Maybe it would be better not to hard-code each field but to get the list from the Avram schema?

The current list of subject indexing fields can be retrieved from the PICA+ K10plus Avram schema via:

curl 'https://format.k10plus.de/avram.pl?profile=k10plus-title' | jq -r '.fields[]|select(.tag|match("04[45]|041A"))|[.tag,.occurrence,.label]|@tsv'
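For anyone who prefers to do the same filtering outside the shell, here is a minimal Python sketch of what the jq expression does. It runs on a small hand-made fragment rather than the real schema: the entries in `SAMPLE_SCHEMA` and their labels are placeholders, and the function names are hypothetical. The only assumptions taken from the command above are that `fields` is a collection of objects with `tag`, optional `occurrence`, and `label`.

```python
import re

# Hypothetical fragment shaped like the "fields" part of an Avram schema.
# Tags and labels are placeholders, not copied from the K10plus schema.
SAMPLE_SCHEMA = {
    "fields": {
        "021A": {"tag": "021A", "label": "placeholder label 1"},
        "041A": {"tag": "041A", "label": "placeholder label 2"},
        "041A/01": {"tag": "041A", "occurrence": "01", "label": "placeholder label 3"},
        "045E": {"tag": "045E", "label": "placeholder label 4"},
    }
}

# Same regular expression as in the jq call; like jq's match(), it is unanchored.
TAG_PATTERN = re.compile(r"04[45]|041A")


def subject_fields(schema, pattern=TAG_PATTERN):
    """Return (tag, occurrence, label) tuples for fields whose tag matches."""
    rows = []
    for field in schema["fields"].values():
        if pattern.search(field["tag"]):
            # A missing occurrence or label becomes an empty string, as @tsv does for null.
            rows.append((
                field["tag"],
                field.get("occurrence") or "",
                field.get("label") or "",
            ))
    return rows


def to_tsv(rows):
    """Join the tuples into tab-separated lines."""
    return "\n".join("\t".join(row) for row in rows)


if __name__ == "__main__":
    print(to_tsv(subject_fields(SAMPLE_SCHEMA)))
```

To run it against the real schema, the downloaded JSON could be loaded with `json.load` and passed to `subject_fields` in place of `SAMPLE_SCHEMA`.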

The following fields are also subject indexing but only on level 1, so not needed so far:

nichtich commented 1 year ago

For current development, hard-coding the fields is ok.

pkiraly commented 10 months ago

@nichtich I am working on this issue and I found that we have a vocabularies.json file with a similar purpose (https://github.com/pkiraly/qa-catalogue/blob/0162996dea4662c02589c83195ce0a61162ba1cc/src/main/resources/pica/vocabularies.json). There are two main differences between the JSON and TSV files:

The result depends on which file we use as a starting point. If we choose the JSON, it should be extended with the fields that are available in the TSV.
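A rough sketch of what "extending the JSON with the TSV fields" could look like, in Python. This does not reflect the actual structure of vocabularies.json: the entry keys `tag`, `occurrence`, and `label` and the function name `merge_fields` are assumptions made for illustration only. The TSV columns follow the tag, occurrence, label order of the jq output above.

```python
def merge_fields(json_entries, tsv_text):
    """Return json_entries plus one new entry per TSV row not already present.

    Entries are matched on (tag, occurrence). Existing JSON entries win,
    so nothing already in the JSON is overwritten. The key names used
    here are hypothetical.
    """
    known = {(e["tag"], e.get("occurrence", "")) for e in json_entries}
    merged = list(json_entries)
    for line in tsv_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Pad short rows so that missing columns become empty strings.
        tag, occurrence, label = (line.split("\t") + ["", ""])[:3]
        if (tag, occurrence) in known:
            continue
        entry = {"tag": tag, "label": label}
        if occurrence:
            entry["occurrence"] = occurrence
        merged.append(entry)
        known.add((tag, occurrence))
    return merged
```

Starting from the TSV instead would simply reverse the roles: the TSV rows would form the base list and the JSON would contribute whatever extra information it holds.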

What do you think?