Open markmacgillivray opened 12 years ago
To be done either as a parser for inclusion in the repo, or as an external parser that runs remotely and sends an import to bibserver. Either way, it will be a good example of particular functionality.
I've a barebones Perl-based parser up as a gist:
https://gist.github.com/1836836
It should accept stdin. The JSON seems valid but does not upload to bibsoup; I'm getting the error "'unicode' object has no attribute 'get'". I'm not familiar with the JSON module, but am wondering if I need to be more explicit about headers...
Ed, the first record in your JSON output is not a dictionary, but a string.
The BibServer importer was failing here: https://github.com/okfn/bibserver/blob/ecc08d230027a0a3fc2c788f9730bcf9825b92b5/bibserver/importer.py#L163 Trying to assign stuff to a unicode string.
We are improving the parser/importer to give better feedback on these kinds of errors. Ideally it should have just failed on that record, given feedback, and continued. Looking into how to do this in a structured manner.
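A minimal sketch of the "fail on that record, give feedback, continue" behaviour described above. This is not the actual importer code; the function name `split_records` and the message wording are made up for illustration. It only assumes that each record in the uploaded JSON should be a dictionary.

```python
import json


def split_records(records):
    """Separate usable records (dicts) from malformed ones without aborting.

    Hypothetical helper, not part of bibserver.
    """
    good, errors = [], []
    for index, record in enumerate(records):
        if isinstance(record, dict):
            good.append(record)
        else:
            # Collect feedback for the uploader instead of raising.
            errors.append(
                "record %d skipped: expected an object, got %s"
                % (index, type(record).__name__)
            )
    return good, errors


# Example input shaped like the failing upload: the first record is a string.
sample = json.loads('["", {"title": "A"}, {"title": "B"}]')
good, errors = split_records(sample)
for message in errors:
    print(message)
print("%d records accepted" % len(good))
```

Collecting the messages in a list, rather than raising on the first bad record, lets the importer report every problem in one pass and still import the valid records.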
Thanks. I'll take a look at the blank first line.
Caused by a bad declaration, now fixed.
A few more tweaks; manual upload of the output seems fine, all 953 records imported.
Can we add a -bibserver command line switch that outputs: {"display_name": "MARC", "format": "marc", "contact": "Edmund Chamberlain emc59@cam.ac.uk", "bibserver_plugin": true}
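For illustration, here is roughly what that switch amounts to, sketched in Python rather than in the parser's own Perl. The metadata values are copied from the request above; the function name `handle_args` is hypothetical, and how bibserver actually invokes plugins is not shown here.

```python
import json
import sys

# Metadata copied from the request in this thread.
PLUGIN_INFO = {
    "display_name": "MARC",
    "format": "marc",
    "contact": "Edmund Chamberlain emc59@cam.ac.uk",
    "bibserver_plugin": True,
}


def handle_args(argv):
    """Return the text to print for the given arguments, or None.

    Hypothetical helper: None means "no switch given, go on and parse stdin".
    """
    if "-bibserver" in argv:
        return json.dumps(PLUGIN_INFO)
    return None


# Simulate calling the parser with the switch.
output = handle_args(["-bibserver"])
if output is not None:
    sys.stdout.write(output + "\n")
```

Note that `json.dumps` writes Python's `True` as JSON `true`, matching the requested output.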
The latest version can be found at: https://github.com/okfn/bibserver/blob/master/parserscrapers_plugins/marc2BibJson.pl
This is done, along with a few other tweaks.
What is left to be done to get MARC parser working? @epoz can you let @edchamberlain know what is required? Then we can get the MARC parser available too.
We need to install the Perl MARC modules on the bibsoup server. I mailed Nils about that asking permission, but need to ping him again as I did not receive a reply. On my local machine the MARC parser works.
I added the Perl requirement to the ticket re. moving to a different server and got no complaints, so we can install on there. The new server, by the way, is s063. Let me know if you can't log in to it.
Additional tweaks made to parser code. Should be fairly complete. Currently testing on Harvard data.
MARC parsing will give access to large amounts of library data