LibreCat / Catmandu-MARC

Catmandu modules for working with MARC data
https://metacpan.org/release/Catmandu-MARC

Parsing of MARC XML generates a warning #22

Closed phochste closed 8 years ago

phochste commented 8 years ago

Parsing XML files with MARC::File::XML (1.0.3) generates a warning

Use of uninitialized value in concatenation (.) or string at /opt/lludss-import/local/lib/perl5/MARC/File/XML.pm line 397, <GEN0> chunk 3.

To reproduce, run:

$ catmandu convert MARC --type XML < marc.xml

where marc.xml looks like:

<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
<record>
<leader>     cam a22     3u 4500</leader>
<controlfield tag="001">530000001</controlfield>
</record>
</collection>
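
For reference, here is a minimal sketch (not Catmandu's or MARC::File::XML's implementation) that parses the sample above with Python's standard library only. It shows that the file is well-formed and which fields a converter should find in it. The function names `record_to_rows` and `parse_marcxml` are hypothetical. The row layout `[tag, ind1, ind2, code, value, ...]` mirrors the JSON output quoted later in this thread, and the handling of `datafield` elements is an assumption, since the sample contains none.

```python
import json
import xml.etree.ElementTree as ET

# Namespace of MARC21 slim XML, as declared in the sample file.
MARC_NS = "{http://www.loc.gov/MARC21/slim}"

SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
<record>
<leader>     cam a22     3u 4500</leader>
<controlfield tag="001">530000001</controlfield>
</record>
</collection>
"""


def record_to_rows(record):
    """Turn one <record> element into a dict with '_id' and 'record' rows."""
    rows = []
    record_id = None
    for child in record:
        name = child.tag.replace(MARC_NS, "")
        if name == "leader":
            rows.append(["LDR", None, None, "_", child.text or ""])
        elif name == "controlfield":
            tag = child.get("tag")
            rows.append([tag, None, None, "_", child.text or ""])
            if tag == "001":
                record_id = child.text
        elif name == "datafield":
            # Assumed layout: tag, both indicators, then code/value pairs.
            row = [child.get("tag"), child.get("ind1", " "), child.get("ind2", " ")]
            for sub in child.findall(MARC_NS + "subfield"):
                row += [sub.get("code"), sub.text or ""]
            rows.append(row)
    return {"_id": record_id, "record": rows}


def parse_marcxml(xml_text):
    """Parse a MARC XML collection string into a list of record dicts."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [record_to_rows(r) for r in root.iter(MARC_NS + "record")]


if __name__ == "__main__":
    for rec in parse_marcxml(SAMPLE):
        print(json.dumps(rec))
```

The sample parses without complaint here, which suggests (but does not prove) that the warning comes from the parser module rather than from the input file.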
phochste commented 8 years ago

Reported to MARC::File::XML as bug

jorol commented 8 years ago

I can't reproduce the problem:

$ catmandu convert MARC --type XML to JSON < marc.xml
{"record":[["LDR",null,null,"_","     cam a22     3u 4500"],["001",null,null,"_","530000001"]],"_id":"530000001"}

Which versions of the modules (Catmandu, Catmandu::MARC) are installed on your system?

phochste commented 8 years ago

perl 5.22.0
catmandu (Catmandu::CLI) version 0.9505 (/opt/lludss-import/local/bin/catmandu)
Catmandu::MARC 0.213

MARC::Record | 2.0.6 | Perl extension for handling MARC records
MARC::Charset | 1.35 | convert MARC-8 encoded strings to UTF-8
MARC::File::MARCMaker | 0.05 | Work with MARCMaker/MARCBreaker records.
MARC::File::MiJ | 0.04 | Read newline-delimited marc-in-json files
MARC::File::XML | 1.0.3 | Work with MARC data encoded as XML
MARC::Parser::RAW | 0.03 | Parser for ISO 2709 encoded MARC records
MARC::Record::MiJ | 0.04 | Convert MARC::Record to/from marc-in-json structure

jorol commented 8 years ago

I have the same configuration, except:

perl 5.18.2 (Strawberry Perl, Windows 7 32-Bit)
Catmandu::MARC 0.214

phochste commented 8 years ago

I get the same warning with Catmandu::MARC 0.214.

jorol commented 8 years ago

OK, I tested again on a Linux machine and can reproduce it there:

Debian 3.2.65-1+deb7u2 x86_64 GNU/Linux
perl v5.20.2 built for x86_64-linux

jorol commented 8 years ago

My bugfix was merged (https://github.com/perl4lib/marc-perl/pull/3), but there is no new CPAN release yet. As an alternative, I implemented a new (faster) parser, https://github.com/jorol/MARC-Parser-XML/, but I'm not sure whether that would be a good long-term solution.

njahn82 commented 8 years ago

I have the same problem. I tried to convert the INSPIRE HEP MARC XML dump into JSON on the command line to avoid XML event parsing in R.

$ catmandu convert MARC --type XML to JSON < test_hep.xml > test.json
Use of uninitialized value in concatenation (.) or string at /usr/share/perl5/MARC/File/XML.pm line 397, <GEN0> chunk 17.

Any other suggestion to get this job done on the CLI?

njahn82 commented 8 years ago

Oops, I should have checked test.json first. It looks workable, thank you!
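
A rough way to confirm that the output is usable despite the warning is to load test.json and count records that lack an identifier or fields. This is only a sketch under assumptions: the helper names `load_records` and `summarize` are hypothetical, and it assumes the exporter wrote either one JSON object per line or a single JSON array, with the `_id` and `record` keys seen in the output quoted earlier in this thread.

```python
import json


def load_records(path):
    """Load records from a JSON array file or a line-delimited JSON file."""
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    stripped = text.lstrip()
    if stripped.startswith("["):
        # Whole file is one JSON array of records.
        return json.loads(stripped)
    # Otherwise assume one JSON object per non-empty line.
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def summarize(records):
    """Count records and flag those without an '_id' or without fields."""
    return {
        "total": len(records),
        "missing_id": sum(1 for r in records if not r.get("_id")),
        "empty_record": sum(1 for r in records if not r.get("record")),
    }


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        print(summarize(load_records(sys.argv[1])))
```

If `missing_id` and `empty_record` are both zero and `total` matches the number of records in the dump, the warning has most likely not cost any data.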

phochste commented 8 years ago

:)