Open karthik opened 11 years ago
I've written a working Python client for this web service (https://gist.github.com/4351598). It takes some time to get a response from the website, about 30-60 seconds, so I'm not sure how to integrate it into markx.
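One possible way to live with the 30-60 second wait is to run the request in a background thread, so the caller gets a handle back immediately and collects the XML later. The following is only a minimal sketch under assumptions: `slow_convert` is a hypothetical stand-in that sleeps instead of contacting the web service, and `convert_in_background` is an illustrative name, not something taken from the gist or from markx.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_convert(pdf_bytes, delay=0.1):
    """Hypothetical stand-in for the real client call.

    It sleeps briefly to mimic the service latency and returns a dummy
    XML string; the real client would upload the PDF instead.
    """
    time.sleep(delay)
    return "<article bytes='%d'/>" % len(pdf_bytes)


def convert_in_background(pdf_bytes, executor, convert=slow_convert):
    """Submit the conversion to the executor and return a Future at once."""
    return executor.submit(convert, pdf_bytes)


with ThreadPoolExecutor(max_workers=1) as executor:
    future = convert_in_background(b"%PDF-1.4 demo", executor)
    # The caller is free to do other work here while the request is pending.
    # A generous timeout leaves room for a 30-60 second response.
    xml = future.result(timeout=120)
    print(xml)
```

In a real integration, `convert` would be replaced by the actual client function, and the application could poll `future.done()` rather than block on `future.result()`.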
The right way would be to convert it to (X)HTML (or DocBook/OpenDocument XML) via Pandoc and then apply a stylesheet to get the desired XML. Converting from PDF will definitely lose information, especially on two-column layouts, even if the application from http://www.scfbm.org/content/7/1/7 is used.
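As a rough illustration of that two-step route, the sketch below builds a `pandoc` command that writes XHTML and an `xsltproc` command that applies a stylesheet to it. Several things here are assumptions: the source is Markdown, the file names are placeholders, the `html4` writer name (XHTML 1.0 output) presumes pandoc 2 or later, `xsltproc` is just one possible XSLT processor, and the stylesheet that produces the target XML would still have to be written.

```python
import shutil
import subprocess


def pandoc_cmd(source, xhtml_out):
    """Command line that asks pandoc for a standalone XHTML document."""
    return ["pandoc", "--standalone", "--from", "markdown",
            "--to", "html4", "--output", xhtml_out, source]


def xslt_cmd(stylesheet, xhtml_in, xml_out):
    """Command line that applies an XSLT stylesheet with xsltproc."""
    return ["xsltproc", "--output", xml_out, stylesheet, xhtml_in]


def convert(source, stylesheet, xml_out, xhtml_tmp="paper.xhtml"):
    """Run both steps in order; raise if a tool is missing or a step fails."""
    steps = (pandoc_cmd(source, xhtml_tmp),
             xslt_cmd(stylesheet, xhtml_tmp, xml_out))
    for cmd in steps:
        if shutil.which(cmd[0]) is None:
            raise RuntimeError("required tool not found: " + cmd[0])
        subprocess.run(cmd, check=True)


# Show the commands without running them (placeholder file names).
print(" ".join(pandoc_cmd("paper.md", "paper.xhtml")))
print(" ".join(xslt_cmd("to-target.xsl", "paper.xhtml", "paper.xml")))
```

Calling `convert("paper.md", "to-target.xsl", "paper.xml")` would run both tools, provided they are installed and the stylesheet exists.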
One great way to enhance scholarly writing would be to convert this to semantic markup. This tool, http://pdfx.cs.man.ac.uk/, might be super handy for us because we could first export to PDF, then programmatically convert to XML. I'll leave it here as a placeholder.