PICA: Extend classification/subject headings schemes from config file

nichtich commented 1 year ago

The list of classification/subject headings schemes seems to be hard-coded in several places:

As discussed in #154, there are more of these fields:

041A keywords
044. all subject indexing fields starting with 044
045. all subject indexing fields starting with 045

Maybe it would be better not to hard-code each field but to get the list from the Avram schema?

The current list of subject indexing fields can be retrieved from the K10plus PICA+ Avram schema via:

curl 'https://format.k10plus.de/avram.pl?profile=k10plus-title' | jq -r '.fields[]|select(.tag|match("04[45]|041A"))|[.tag,.occurrence,.label]|@tsv'
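
For anyone who wants the same selection outside the shell, here is a minimal Python sketch of the filter. It assumes the schema has already been downloaded and parsed into a dict whose `fields` entry maps field identifiers to objects with `tag`, `occurrence`, and `label`, as the jq expression above implies. The function name `subject_fields` and the sample schema are made up for illustration, and the labels marked as placeholders are not taken from the real K10plus schema. Unlike the jq filter, the pattern here is anchored to the whole tag.

```python
import re

# Anchored variant of the jq pattern: any tag 044x or 045x, or exactly 041A.
SUBJECT_TAG = re.compile(r"^(04[45].|041A)$")


def subject_fields(schema):
    """Return sorted (tag, occurrence, label) tuples for subject indexing fields."""
    rows = []
    for field in schema.get("fields", {}).values():
        tag = field.get("tag", "")
        if SUBJECT_TAG.match(tag):
            # Missing or null occurrence becomes an empty string, like @tsv does.
            rows.append((tag, field.get("occurrence") or "", field.get("label", "")))
    return sorted(rows)


# Hypothetical miniature schema; "placeholder label" values are NOT real labels.
sample_schema = {
    "fields": {
        "041A": {"tag": "041A", "label": "keywords"},
        "044A": {"tag": "044A", "label": "placeholder label"},
        "045B/00": {"tag": "045B", "occurrence": "00", "label": "placeholder label"},
        "021A": {"tag": "021A", "label": "placeholder label (not subject indexing)"},
        "144Z": {"tag": "144Z", "label": "local library keywords"},
    }
}

if __name__ == "__main__":
    for row in subject_fields(sample_schema):
        print("\t".join(row))
```

Because the pattern only accepts tags starting with `04`, level 1 fields such as 144Z are skipped automatically.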

The following fields are also subject indexing fields, but only on level 1, so they are not needed so far:

144Z local library keywords
145S local library classification
145Z local library classification

nichtich commented 1 year ago

For current development, hard-coding the fields is ok.

pkiraly commented 10 months ago

@nichtich I am working on this issue and found that we already have a vocabularies.json file with a similar purpose (https://github.com/pkiraly/qa-catalogue/blob/0162996dea4662c02589c83195ce0a61162ba1cc/src/main/resources/pica/vocabularies.json). There are two main differences between the JSON and TSV files:

the JSON file provides a number of properties for each field, while the TSV provides only the title
the JSON file has 15 fields, while the TSV has 40.

The result depends on which file we use as a starting point. If we choose the JSON, it should be extended with the fields that are only available in the TSV.

What do you think?
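
To make the "extend the JSON with the TSV-only fields" option concrete, here is a rough Python sketch of such a merge. The record layout is an assumption: it treats the JSON as a list of objects with `tag`, `occurrence`, and `label` keys, which may not match the actual structure of vocabularies.json. The function name `extend_vocabularies` and all sample values are hypothetical. The TSV columns follow the tag, occurrence, label order produced by the jq command above.

```python
import csv
import io
import json


def extend_vocabularies(vocabularies, tsv_text):
    """Return the vocabulary list plus entries for fields found only in the TSV.

    Existing JSON entries are kept unchanged, so their extra properties survive.
    """
    known = {(v["tag"], v.get("occurrence") or "") for v in vocabularies}
    merged = list(vocabularies)
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 3:
            continue  # skip blank or malformed lines
        tag, occurrence, label = row[:3]
        if (tag, occurrence) not in known:
            # TSV-only field: only the label is known, other properties stay absent.
            merged.append({"tag": tag, "occurrence": occurrence, "label": label})
            known.add((tag, occurrence))
    return merged


# Hypothetical sample data, not taken from the real files.
sample_json = [
    {"tag": "045A", "occurrence": "", "label": "placeholder", "extra": "placeholder property"}
]
sample_tsv = "045A\t\tplaceholder\n044K\t00-09\tplaceholder\n"

if __name__ == "__main__":
    print(json.dumps(extend_vocabularies(sample_json, sample_tsv), indent=2))
```

Entries added this way carry only the label, so the richer properties would still need to be filled in by hand for the new fields.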

pkiraly / qa-catalogue

PICA: Extend classification/subject headings schemes from config file #190