Open nichtich opened 9 years ago
I understand the underscore '_' convention as naming the fields that come purely from the protocol, so that they don't clash with the data payload.
OAI, SRU, Atom, RSS, etc. are other protocols, each with its own fields and its own semantics for them.
It's probably too late for consistent naming, and each format has its peculiarities, but some common fields exist (date, identifier, categories/sets, content). Mapping could also be done with fixes, so the source does not need to be modified. OAI and SRU, however, are very similar, as the internal record format can be chosen.
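The underscore convention described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not Catmandu code; the function name `split_protocol_fields` and the sample values are hypothetical.

```python
def split_protocol_fields(record):
    """Separate underscore-prefixed protocol fields from the data payload."""
    protocol = {k: v for k, v in record.items() if k.startswith("_")}
    payload = {k: v for k, v in record.items() if not k.startswith("_")}
    return protocol, payload


# Hypothetical record: two protocol fields and one payload field.
record = {
    "_id": "oai:example.org:1",
    "_datestamp": "2015-01-01",
    "title": ["An example title"],
}

protocol, payload = split_protocol_fields(record)
```

Because the prefix alone marks a field as protocol-level, a payload field that happens to be called `id` or `datestamp` would not collide with `_id` or `_datestamp`.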
Right now at CPAN there are Catmandu-OAI and Catmandu-SRU.
I'd like to make both compatible so we only need one module for each record format, for instance in the Catmandu::XML::Parser:: namespace.
In Catmandu-OAI, processing of records is done by a handler that receives an XML::LibXML::Element or an XML::LibXML::Document and returns a hash reference. In Catmandu-SRU, the handler/parser receives a record with serialized XML in the field recordData (duplicated parsing!). To unify them, there must be one preferred variant, and I favour the way of Catmandu-OAI.
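The difference between the two handler styles can be sketched as follows. This is an illustration in Python using the standard-library `xml.etree.ElementTree`, not the actual Perl handlers; the names `dc_handler` and `handle_serialized` and the sample XML are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"


def dc_handler(element):
    """OAI style: receive an already-parsed element, return a plain dict."""
    record = {}
    for child in element.iter():
        if child.tag.startswith(DC):
            # Strip the namespace and collect repeated fields in a list.
            record.setdefault(child.tag[len(DC):], []).append(child.text)
    return record


def handle_serialized(record, handler):
    """SRU style: recordData holds serialized XML, so it is parsed a second time."""
    element = ET.fromstring(record["recordData"])
    return handler(element)


# Hypothetical sample record.
xml = (
    '<dc xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title>Example</dc:title><dc:creator>Doe, J.</dc:creator></dc>"
)

parsed = dc_handler(ET.fromstring(xml))
reparsed = handle_serialized({"recordData": xml}, dc_handler)
```

Both paths yield the same hash, but the second one serializes and re-parses XML that the importer had already parsed once, which is the duplicated work mentioned above. A handler that takes the parsed element can serve both importers.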
P.S.: See https://github.com/LibreCat/Catmandu-XML/commit/de087451eed942c28ca5cd76b4bb700321526ef8 for a first draft that refactors code from Catmandu::Importer::SRU and Catmandu::Importer::OAI into Catmandu::XML::Parser.
Catmandu-OAI records have the fields _id, _identifier, _datestamp, _status, _setSpec, _about and _metadata. Catmandu-SRU records have recordSchema, recordPackaging, recordPosition, and recordData. _metadata corresponds with recordData, and recordSchema could also be added in Catmandu::OAI. A fix/handler/option to get _metadata in both importers? How do other importers name similar fields?
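One way to picture the "option to get _metadata in both importers" is a small accessor that looks in either field. This is a hypothetical Python sketch, not an existing Catmandu fix or option; only the correspondence between _metadata and recordData is taken from the discussion, and the sample records are invented.

```python
def get_metadata(record):
    """Return the metadata payload from an OAI-style or SRU-style record."""
    for field in ("_metadata", "recordData"):
        if field in record:
            return record[field]
    return None


# Hypothetical records, assuming the payload has already been turned into a hash.
oai_record = {
    "_id": "oai:example.org:1",
    "_metadata": {"title": ["Example"]},
}
sru_record = {
    "recordPosition": 1,
    "recordData": {"title": ["Example"]},
}

oai_metadata = get_metadata(oai_record)
sru_metadata = get_metadata(sru_record)
```

Code downstream of the importers would then not need to know which protocol a record came from.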