pkiraly / qa-catalogue

QA catalogue – a metadata quality assessment tool for library catalogue records (MARC, PICA)
GNU General Public License v3.0

Unimarc schema parsing #384

Closed gegic closed 7 months ago

gegic commented 7 months ago

I cherry-picked only the changes related to UNIMARC schema parsing and created a separate branch, on which this pull request is based. The pull request therefore contains only the schema-reading changes: there is no validation, completeness, or TT completeness in here.

The schema was parsed, for the most part automatically, from the following address: https://archive.ifla.org/VI/8/unimarc-concise-bibliographic-format-2008.pdf. In addition, numerous manual modifications were made, mostly in accordance with the newly released PDF of the 2023 UNIMARC Manual, available at https://repository.ifla.org/handle/123456789/2880.

Certain code lists were taken from the corresponding appendices of the 2008 version of the UNIMARC Manual.

Because only the parsing-related changes were cherry-picked, there are also some changes that might seem unnecessary for now. The following is the explanation for those:

Furthermore, please note that the file main/resource/avram-unimarc.json contains some // remark keys, which are meant to denote comments within the avram-unimarc file. I have identified 25 situations in which I made such remarks, mostly about inconsistencies in the Manual. There are potentially many more such inconsistencies.
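
For reviewers who want to list or ignore those remarks, here is a minimal Python sketch of how comment-style keys could be collected from, or stripped out of, a parsed JSON document. It is not part of this pull request. It assumes that a remark is any object key starting with `//`; the helper names `collect_remarks` and `strip_remarks`, the sample structure, and the remark text are all hypothetical and are not taken from avram-unimarc.json.

```python
import json


def collect_remarks(node, path=()):
    """Return (path, text) pairs for every key that starts with '//'.

    `path` is the tuple of keys/indices leading to the object that
    holds the remark. Treating '//'-prefixed keys as comments is an
    assumption made for this sketch.
    """
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key.startswith("//"):
                found.append((path, value))
            else:
                found.extend(collect_remarks(value, path + (key,)))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            found.extend(collect_remarks(item, path + (index,)))
    return found


def strip_remarks(node):
    """Return a copy of the parsed JSON without '//'-prefixed keys."""
    if isinstance(node, dict):
        return {k: strip_remarks(v) for k, v in node.items()
                if not k.startswith("//")}
    if isinstance(node, list):
        return [strip_remarks(item) for item in node]
    return node


# Hypothetical sample; the real file's layout may differ.
sample = json.loads("""
{
  "fields": {
    "100": {
      "label": "General processing data",
      "// remark": "hypothetical note about an inconsistency in the Manual"
    }
  }
}
""")

remarks = collect_remarks(sample)
cleaned = strip_remarks(sample)
```

Here `remarks` holds one entry per comment together with its location, which makes it straightforward to review all of them in one pass, while `cleaned` is the same structure with the comment keys dropped.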