LibreCat / Catmandu-SRU

Catmandu module for working with SRU data.
https://metacpan.org/release/Catmandu-SRU

Don't do ForceArray #3

Closed nichtich closed 10 years ago

nichtich commented 10 years ago

ForceArray => [ 'record' ] is meant to modify the SRU envelope, but it also modifies every occurrence of tags named "record" in the payload.
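
A minimal sketch of the effect, assuming XML::LibXML::Simple is installed. The response string is a hypothetical, namespace-free stand-in for a real SRU response, and KeyAttr is emptied only to keep the resulting structure easy to read:

```perl
use strict;
use warnings;
use XML::LibXML::Simple qw(XMLin);

# Hypothetical, simplified SRU-like response: the outer <record> belongs to
# the envelope, the inner <record> belongs to the payload in <recordData>.
my $xml = '<searchRetrieveResponse><records><record><recordData>'
        . '<collection><record><title>Example</title></record></collection>'
        . '</recordData></record></records></searchRetrieveResponse>';

# ForceArray is given element names only, so it applies at every depth.
my $data = XMLin($xml, ForceArray => ['record'], KeyAttr => []);

my $envelope = $data->{records}{record};
my $payload  = $envelope->[0]{recordData}{collection}{record};

print "envelope record is a ", ref($envelope), " reference\n";
print "payload record is a ",  ref($payload),  " reference\n";
```

Both values should come back as array references, although only the envelope-level "record" was meant to be forced into an array.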

phochste commented 10 years ago

I’ll look at it. Would also be nice to be able to get Catmandu processable MARC out of the SRU (this was a request from Johan some weeks ago).

From: Jakob Voss <notifications@github.com>
Reply-To: LibreCat/Catmandu-SRU <reply@reply.github.com>
Date: Thursday 20 February 2014 08:46
To: LibreCat/Catmandu-SRU <Catmandu-SRU@noreply.github.com>
Subject: [Catmandu-SRU] Don't do ForceArray (#3)

ForceArray => [ 'record' ] (https://github.com/LibreCat/Catmandu-SRU/blob/master/lib/Catmandu/Importer/SRU.pm#L60) is meant to modify the SRU envelope, but it also modifies every occurrence of tags named "record" in the payload.


phochste commented 10 years ago

In general, XML::LibXML::Simple is too simplistic to parse the response from an SRU request. It would be better to:
1) Return the XML content of recordData as-is
2) Provide separate pluggable ways to decode the XML into Perl
3) Preferably have a MARC decoder that produces a Catmandu MARC model

nichtich commented 10 years ago

recordData may contain XML or a string. In practice the XML consists of a single root element, but I am not sure whether this is required.

A workaround would be to serialize the XML back with XML::Struct::writeXML for further processing. A better solution would be to use XML::LibXML::Reader or XML::LibXML::SAX directly and pass the record content to another module (MARCXML parser, XML::Simple, XML::Struct etc.).
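
A rough sketch of that idea, under several assumptions: it uses the XML::LibXML DOM API with XPath rather than the Reader or SAX interfaces named above, the response is a hypothetical trimmed-down example, and the %decoder registry and decode_marcxml_title are made-up names for illustration, not part of Catmandu-SRU:

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical SRU response, trimmed to the parts used here.
my $xml = <<'XML';
<srw:searchRetrieveResponse xmlns:srw="http://www.loc.gov/zing/srw/">
  <srw:records>
    <srw:record>
      <srw:recordSchema>info:srw/schema/1/marcxml-v1.1</srw:recordSchema>
      <srw:recordPacking>xml</srw:recordPacking>
      <srw:recordData>
        <record xmlns="http://www.loc.gov/MARC21/slim">
          <datafield tag="245" ind1=" " ind2=" ">
            <subfield code="a">Example title</subfield>
          </datafield>
        </record>
      </srw:recordData>
    </srw:record>
  </srw:records>
</srw:searchRetrieveResponse>
XML

# Illustrative decoder: pulls 245$a out of a MARCXML string.
# A real decoder would build a full record structure instead.
sub decode_marcxml_title {
    my ($raw) = @_;
    my $rec = XML::LibXML->load_xml(string => $raw);
    my $xpc = XML::LibXML::XPathContext->new($rec);
    $xpc->registerNs(marc => 'http://www.loc.gov/MARC21/slim');
    my $title = $xpc->findvalue(
        '//marc:datafield[@tag="245"]/marc:subfield[@code="a"]');
    return { title => $title };
}

# Hypothetical registry of pluggable decoders, keyed by record schema.
my %decoder = (
    'info:srw/schema/1/marcxml-v1.1' => \&decode_marcxml_title,
);

my $doc = XML::LibXML->load_xml(string => $xml);
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(srw => 'http://www.loc.gov/zing/srw/');

my @records;
# Namespace-aware selection: only envelope records match srw:record,
# the MARC <record> inside recordData is left untouched.
for my $rec ($xpc->findnodes('//srw:record')) {
    my $schema   = $xpc->findvalue('srw:recordSchema', $rec);
    my ($data)   = $xpc->findnodes('srw:recordData', $rec);
    my @elements = $xpc->findnodes('*', $data);

    # Keep the payload as-is: serialized XML, or plain text if there
    # are no child elements.
    my $raw = @elements
        ? join('', map { $_->toString } @elements)
        : $data->textContent;

    # Unknown schemas fall through undecoded.
    my $decode = $decoder{$schema} || sub { $_[0] };
    push @records, { schema => $schema, data => $decode->($raw) };
}

print $records[0]{data}{title}, "\n";
```

Selecting the envelope by namespace instead of by bare element name avoids touching payload elements that happen to share the name "record".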

phochste commented 10 years ago

I'm working on the better solution now.

phochste commented 10 years ago

Could you check commit '98233b7'? It contains code to automatically parse MARC and DC, and provides options to plug in your own SRU response parser as needed.

You should be able to do something like:

catmandu convert SRU --base http://www.unicat.be/sru --query dna --recordSchema marcxml --fix "marc_map('245','title');"