Open gypsyjoe opened 11 years ago
Ugh! The XML in my text looks like crap on this page. If whoever is available to help with this will email me (jhjusti@sandia.gov), I'll email you the XML causing the blowup. (I've been burning weeks on this problem and it continues to stymie me.)
Actually, if you can see the saved text by selecting to edit this issue, it seems to have saved the XML text. But if it will help, I'm happy to email it to whoever attempts to analyze it.
Joe Justice Sandia National Laboratories Albuquerque, New Mexico
Can you submit this as an issue under marc4j, with the binary and XML versions?
Yes. I guess I was in the wrong place. Sorry. ☺
-joe
From: Simon Spero [mailto:notifications@github.com] Sent: Thursday, February 28, 2013 11:55 AM To: solrmarc/stanford-solr-marc Cc: Justice II, Joe H. Subject: [EXTERNAL] Re: [stanford-solr-marc] MarcXmlParser XMLReader parse error when converting from MARCXML back to MARC21 (#1)
Can you submit this as an issue under marc4j, with the binary and xml versions
Reply to this email directly or view it on GitHub: https://github.com/solrmarc/stanford-solr-marc/issues/1#issuecomment-14250219
Also, if you have a stack trace, that would be good too, and so would sample code.
Simon
I will send you the process I’m working through because I have several steps that are going on getting me to this point. I can include the original binary MARC21 from which this MARCXML is coming, but, as I cannot convert the MARCXML I sent you, I cannot send you any binary MARC of that step. I’ll do my best to describe what’s going on in my item. But I am able to convert some records that are not included here. I’ll include those files, too, and describe them.
I should have it ready soon. Thanks.
-joe
How do I attach the files to the issue? It’s issue #26. Here’s the zip of the files I wanted to attach. But I can’t figure out how to do it on the site.
-joe
Looking at the MARCXML record above, the field you add:
<datafield tag="088">
<subfield code="a">OSTI_ID=1095410</subfield>
</datafield>
is missing the MARC indicators (the ind1 and ind2 attributes).
If you change the added datafield to be:
<datafield ind1=" " ind2=" " tag="088">
<subfield code="a">OSTI_ID=1095410</subfield>
</datafield>
it should parse correctly and produce a valid MARC-8 encoded binary MARC record after conversion.
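If the field is being added through DOM code rather than by hand, the same fix can be sketched as below. This is only an illustration using the JDK's DOM classes; the class and method names (`AddDatafieldSketch`, `addDatafield`, `parse`) are hypothetical and not part of marc4j, and the namespace URI in `main` is the standard MARCXML "slim" namespace, which may or may not be declared in the file in question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class AddDatafieldSketch {

    // Appends <datafield tag=... ind1=" " ind2=" "> with a single $a subfield
    // to the given <record> element, reusing the record's namespace (if any).
    static Element addDatafield(Element record, String tag, String value) {
        Document doc = record.getOwnerDocument();
        String ns = record.getNamespaceURI(); // null when the record has no namespace
        Element field = doc.createElementNS(ns, "datafield");
        field.setAttribute("tag", tag);
        field.setAttribute("ind1", " "); // a blank indicator, not an absent one
        field.setAttribute("ind2", " ");
        Element sub = doc.createElementNS(ns, "subfield");
        sub.setAttribute("code", "a");
        sub.setTextContent(value);
        field.appendChild(sub);
        record.appendChild(field);
        return field;
    }

    // Parses a small XML string into a DOM; unchecked wrapper for brevity.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            return f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<record xmlns=\"http://www.loc.gov/MARC21/slim\"/>");
        Element field = addDatafield(doc.getDocumentElement(), "088", "OSTI_ID=1095410");
        System.out.println("ind1 present: " + field.hasAttribute("ind1"));
        System.out.println("ind2 present: " + field.hasAttribute("ind2"));
    }
}
```

The point of the sketch is that both indicator attributes are always set, even when their value is a single blank.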
Still ought to be handled more gracefully than an NPE.
I was about to split the Reader and Writer Tests on a per class basis, so this is a good excuse.
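In the meantime, a caller-side check could flag the offending fields before the XML ever reaches the parser. The following is a rough sketch using only JDK classes; `IndicatorCheckSketch` and `tagsMissingIndicators` are made-up names for illustration, not anything in marc4j.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class IndicatorCheckSketch {

    // Returns the tag of every <datafield> that lacks an ind1 or ind2 attribute.
    static List<String> tagsMissingIndicators(String marcXml) {
        Document doc;
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true); // so getLocalName() works with or without a prefix
            doc = f.newDocumentBuilder().parse(new InputSource(new StringReader(marcXml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
        List<String> bad = new ArrayList<>();
        NodeList all = doc.getElementsByTagName("*"); // every element, in document order
        for (int i = 0; i < all.getLength(); i++) {
            Element el = (Element) all.item(i);
            if (!"datafield".equals(el.getLocalName())) {
                continue;
            }
            if (!el.hasAttribute("ind1") || !el.hasAttribute("ind2")) {
                bad.add(el.getAttribute("tag"));
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        String xml = "<record>"
            + "<datafield tag=\"088\"><subfield code=\"a\">OSTI_ID=1095410</subfield></datafield>"
            + "<datafield tag=\"245\" ind1=\"1\" ind2=\"0\"><subfield code=\"a\">Title</subfield></datafield>"
            + "</record>";
        System.out.println("Datafields missing indicators: " + tagsMissingIndicators(xml));
    }
}
```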
Simon
On Sat, Mar 2, 2013 at 3:42 PM, haschart notifications@github.com wrote:
Looking at the MARCXML record above, the field you add:
<datafield tag="088"> <subfield code="a">OSTI_ID=1095410</subfield> </datafield>
is missing the MARC indicators (the ind1 and ind2 attributes).
If you change the added datafield to be:
<datafield ind1=" " ind2=" " tag="088"> <subfield code="a">OSTI_ID=1095410</subfield> </datafield>
it should parse correctly and produce a valid MARC-8 encoded binary MARC record after conversion.
Cool! Let me know if I may be of help or if you have any questions. I'm sure I could forward the DOM code showing how I'm doing things there.
Honestly, I've been banging at this since before Code4Lib and it has been through all sorts of rewrites and attempts to comb out the problem. My latest thought is to pull in the marc4j project code into my servlet code so I can step through the marc4j processes and examine them more completely. But I wasn't able to finish this set up on Friday.
Good luck. I'm burning a candle for us. :-)
-joe
I am attempting to use marc4j to convert a MARCXML file back to MARC21 binary, which I had previously converted from MARC21 to MARCXML using marc4j. I made one update to some of the records in the MARCXML to add a single tag element for MARC tag 088 with a value of "OSTI-ID=#######" where the #'s are individual numeric digits. After making this update and then attempting to convert back to MARC21, I get a snag in the SAXParser that throws a NullPointerException. It breaks on a particular record.
I've attempted to fix this by pulling the individual records out into a DOM, getting each node, pulling the string out of the node, converting the string to a byte array input stream, and passing that single record to the MarcXmlReader object as an InputStream. But I get the following error for this record's metadata.
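For reference, the per-record step described above can be sketched roughly like this. It uses only JDK classes; `RecordStreamSketch`, `toUtf8Bytes`, `toInputStream`, and `parse` are hypothetical names, and the sample XML is invented. The resulting stream is what would then be handed to marc4j's `MarcXmlReader(InputStream)` constructor, which is left out here so the sketch runs without the library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RecordStreamSketch {

    // Serializes one DOM node to UTF-8 bytes, so the bytes and the
    // encoding named in the XML declaration agree with each other.
    static byte[] toUtf8Bytes(Node node) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            t.transform(new DOMSource(node), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Wraps the serialized record in an InputStream for a stream-based reader.
    static InputStream toInputStream(Node node) {
        return new ByteArrayInputStream(toUtf8Bytes(node));
    }

    // Parses an XML string into a DOM; unchecked wrapper for brevity.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            return f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<collection>"
            + "<record><datafield tag=\"088\" ind1=\" \" ind2=\" \">"
            + "<subfield code=\"a\">OSTI_ID=1095410</subfield></datafield></record>"
            + "</collection>");
        NodeList records = doc.getElementsByTagName("record");
        for (int i = 0; i < records.getLength(); i++) {
            InputStream in = toInputStream(records.item(i));
            // A MarcXmlReader would be constructed from 'in' at this point.
            System.out.println("record " + i + ": " + in.available() + " bytes");
        }
    }
}
```

Serializing through a `Transformer` with an explicit encoding avoids the platform-default charset that a bare `String.getBytes()` would use.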
Exception getting thrown: MarcXmlParser run() MarcException: Unable to parse input
XML Record causing the blow up:
MarcXmlParser run() MarcException: Unable to parse input
I would greatly appreciate it if someone could help me figure out why this record XML is flipping out the MarcXmlParser.parse function. It seems to be blowing up when the SAXParserFactory XMLReader attempts to parse the record. I'm even passing the node string through a normalizer like this to make sure it's valid ASCII text.
szxmlnode = Normalizer.normalize(szxmlnode, Normalizer.Form.NFD).replaceAll("[^\\p{ASCII}]", "");
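To show what that normalizer line does to the text, here is a small standalone sketch (the class and method names are made up for illustration). Letters with a canonical decomposition keep their base letter and lose the accent, while characters with no decomposition are deleted outright. It only changes characters, so it has no effect on whether required attributes are present in the XML.

```java
import java.text.Normalizer;

public class AsciiFoldSketch {

    // Decomposes accented characters (NFD), then drops everything outside ASCII.
    static String asciiFold(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD)
                .replaceAll("[^\\p{ASCII}]", "");
    }

    public static void main(String[] args) {
        // e-acute decomposes to 'e' plus a combining accent; only the accent is removed.
        System.out.println(asciiFold("Caf\u00e9"));   // prints "Cafe"
        // O-with-stroke has no canonical decomposition, so the whole letter is removed.
        System.out.println(asciiFold("\u00d8re"));    // prints "re"
    }
}
```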
Joe Justice Sandia National Laboratories Albuquerque, New Mexico