pkiraly / qa-catalogue

QA catalogue – a metadata quality assessment tool for library catalogue records (MARC, PICA)
GNU General Public License v3.0

Adding XML serialization for PICA #232

Open pkiraly opened 1 year ago

pkiraly commented 1 year ago

Todo

nichtich commented 1 year ago

To reduce code complexity I'd prefer to not implement this, unless there is a third-party PICA-library to be used with a few lines of code only.

pkiraly commented 1 year ago

It was implemented because the PICA records of the Koninklijke Bibliotheek (the Dutch national library) are available in XML format (via an OAI-PMH service). The implementation was not very complicated, and it fits in with the other serialization handlers. The basic idea is that these readers only transform fields, subfields, etc. into a unified Java representation, and all the analysis happens on that. So they are thin classes.
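To illustrate the "thin reader" idea, here is a minimal sketch, not the actual QA catalogue code. It assumes the input follows the common PICA XML layout (`datafield` elements with `tag` and optional `occurrence` attributes, containing `subfield` elements with a `code` attribute). The class names `PicaXmlSketch`, `Field` and `Subfield` are hypothetical stand-ins for the unified representation.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class PicaXmlSketch {

  /** Hypothetical unified subfield: a code and its value. */
  static final class Subfield {
    final String code;
    final String value;
    Subfield(String code, String value) {
      this.code = code;
      this.value = value;
    }
  }

  /** Hypothetical unified field: tag, occurrence ("" if absent), subfields. */
  static final class Field {
    final String tag;
    final String occurrence;
    final List<Subfield> subfields = new ArrayList<>();
    Field(String tag, String occurrence) {
      this.tag = tag;
      this.occurrence = occurrence;
    }
  }

  /** Reads one PICA XML document and maps it to the unified structure. */
  static List<Field> read(String xml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      Document doc = factory.newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

      List<Field> fields = new ArrayList<>();
      // "*" matches any namespace, so the sketch does not depend on the exact URI
      NodeList datafields = doc.getElementsByTagNameNS("*", "datafield");
      for (int i = 0; i < datafields.getLength(); i++) {
        Element df = (Element) datafields.item(i);
        Field field = new Field(df.getAttribute("tag"), df.getAttribute("occurrence"));
        NodeList subfields = df.getElementsByTagNameNS("*", "subfield");
        for (int j = 0; j < subfields.getLength(); j++) {
          Element sf = (Element) subfields.item(j);
          field.subfields.add(new Subfield(sf.getAttribute("code"), sf.getTextContent()));
        }
        fields.add(field);
      }
      return fields;
    } catch (Exception e) {
      throw new IllegalArgumentException("Could not parse PICA XML", e);
    }
  }

  public static void main(String[] args) {
    // Made-up sample record, for illustration only
    String xml = "<record xmlns=\"info:srw/schema/5/picaXML-v1.0\">"
        + "<datafield tag=\"021A\"><subfield code=\"a\">A title</subfield></datafield>"
        + "</record>";
    for (Field field : read(xml)) {
      for (Subfield subfield : field.subfields) {
        System.out.println(field.tag + " $" + subfield.code + " " + subfield.value);
      }
    }
  }
}
```

The reader does no validation or interpretation; anything semantic would operate on the returned `Field` list, whichever serialization it came from.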

BTW: In the long run I would like to move these reader classes to the marc4j library.

nichtich commented 1 year ago

Do you plan to move the PICA readers to marc4j or to a specific PICA library (e.g. "pica4j")? This could bundle most (but not all) of package de.gwdg.metadataqa.marc.utils.pica, plus class PicaPathSelector and possibly parts of de.gwdg.metadataqa.marc.cli.utils.ignorablerecords. By the way, there is also https://github.com/metafacture/metafacture-core/tree/master/metafacture-biblio/src/main/java/org/metafacture/biblio/pica; it might be worth considering joining forces.

Combining both MARC and PICA functionality, however, may also make sense, so we could merge some Pica and Marc classes.

pkiraly commented 1 year ago

marc4j does not have any semantic layer, so I plan to move only those parts which handle purely the basic structure, such as the readers. The PicaPathSelector and MarcSpec implementations are also missing from marc4j, so first we have to discuss with the contributors of that tool whether they would like to have them.

Some classes are already merged and there is inheritance between them, but more is needed.

Thanks for the reference to Metafacture, I was not aware of their PICA efforts. Do you have a connection to the creators?