edsu / pymarc

process MARC records from Python
http://python.org/pypi/pymarc

Document use of MARCXML with an example #134

Closed nichtich closed 5 years ago

nichtich commented 5 years ago

We are still trying to work out how to parse and serialize MARCXML. I stumbled upon #73 (make it simpler), but one or two examples in the documentation may be enough to start with:

P.S: See also this help request.

edsu commented 5 years ago

Sure, I'd be happy to merge those docs, although the referenced ticket, which I've intentionally left open, makes it sound like there could be a preferable API for reading/writing XML?

edsu commented 5 years ago

@nichtich do you know if lxml.etree.iterparse reads the entire document into memory, and then iterates over that?
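For context, a minimal sketch of what incremental parsing usually looks like. It uses the standard library's xml.etree.ElementTree.iterparse as a stand-in for lxml.etree.iterparse, on the assumption that the two behave alike here. The parser consumes the source piece by piece and emits an event as each element closes, but it still builds the tree as it goes, so memory only stays flat if finished elements are cleared. The sample data and the helper name iter_record_ids are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"

# Tiny hand-written MARCXML collection, used only for this illustration.
SAMPLE = (
    '<collection xmlns="http://www.loc.gov/MARC21/slim">'
    '<record><leader>00000nam a2200000 a 4500</leader>'
    '<controlfield tag="001">rec1</controlfield></record>'
    '<record><leader>00000nam a2200000 a 4500</leader>'
    '<controlfield tag="001">rec2</controlfield></record>'
    '</collection>'
).encode("utf-8")


def iter_record_ids(source):
    """Yield the 001 control number of each record as it finishes parsing."""
    record_tag = "{%s}record" % MARC_NS
    field_tag = "{%s}controlfield" % MARC_NS
    # "end" events fire once an element and all its children have been read.
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != record_tag:
            continue
        for field in elem.iter(field_tag):
            if field.get("tag") == "001":
                yield field.text
        # Drop the record's children and text so finished records
        # do not keep their contents in memory.
        elem.clear()


ids = list(iter_record_ids(io.BytesIO(SAMPLE)))
print(ids)
```

Without the elem.clear() call the complete tree would be held in memory by the time the loop ends, even though the input is read incrementally.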

edsu commented 5 years ago

@nichtich Will this new documentation help clarify the current state of affairs with XML and JSON? It can always get updated if there does turn out to be progress on #73. If you have any suggestions for improving it please send them along.

nichtich commented 5 years ago

@edsu Thanks for the new documentation, that's very helpful; I did not know about map_xml. Writing XML is possible nevertheless. The following may not be beautiful Python, but it should work:

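from pymarc import marcxml

# assumes `file` is a file object opened for writing text (UTF-8)
# and `records` is an iterable of pymarc Record objects

# write XML header to file object
file.write('<?xml version="1.0" encoding="UTF-8"?>\n')
file.write('<collection xmlns="http://www.loc.gov/MARC21/slim">\n')

# write records (record_to_xml returns bytes, hence the decode)
for record in records:
    file.write(marcxml.record_to_xml(record).decode("utf-8"))
    file.write('\n')

# write XML footer
file.write('</collection>')
file.close()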
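
The wrapper pattern can also be checked without pymarc. The sketch below is only an illustration: write_collection is a hypothetical helper, and the byte strings stand in for what marcxml.record_to_xml(record) would return, assuming that output carries no namespace declaration of its own. It shows that the records pick up the MARC21 namespace from the enclosing collection element and that the result parses as well-formed XML.

```python
import io
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"


def write_collection(out, record_fragments):
    """Write serialized <record> fragments (bytes) inside one <collection>."""
    out.write('<?xml version="1.0" encoding="UTF-8"?>\n')
    out.write('<collection xmlns="%s">\n' % MARC_NS)
    for fragment in record_fragments:
        out.write(fragment.decode("utf-8"))
        out.write("\n")
    out.write("</collection>")


# Hypothetical stand-ins for the bytes returned by marcxml.record_to_xml(record).
fragments = [
    b'<record><leader>00000nam a2200000 a 4500</leader>'
    b'<controlfield tag="001">rec1</controlfield></record>',
    b'<record><leader>00000nam a2200000 a 4500</leader>'
    b'<controlfield tag="001">rec2</controlfield></record>',
]

buffer = io.StringIO()
write_collection(buffer, fragments)

# Parse the result back to confirm it is well-formed and namespaced.
root = ET.fromstring(buffer.getvalue().encode("utf-8"))
record_count = len(root.findall("{%s}record" % MARC_NS))
print(root.tag, record_count)
```

Because the default namespace is declared once on the collection element, the un-namespaced record fragments inherit it when the file is parsed.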

edsu commented 5 years ago

That would be better expressed in a separate pull request. If the documentation suffices for now, I'm going to close this.