wpoa / open-access-media-importer

A tool for harvesting media files from Open Access articles for upload into Wikimedia Commons
http://commons.wikimedia.org/wiki/User:Open_Access_Media_Importer_Bot

Prepare slides for JATS-Con talk #98

Closed: Daniel-Mietchen closed this issue 10 years ago

Daniel-Mietchen commented 10 years ago

Let's collect information directly related to OAMI for the talk here, so that it can easily be included in http://chrismaloney.org/notes/OAMI%20JatsCon%20slides,%202013 .

Daniel-Mietchen commented 10 years ago

Slides are being collected at https://en.wikipedia.org/wiki/User:Daniel_Mietchen/Talks/JATS-Con_Impromptu_2013 .

Daniel-Mietchen commented 10 years ago

More XML inconsistencies, observed by others: https://twitter.com/alexsdutton/status/448749970322358274 .

alexdutton commented 10 years ago

The XSL we used to parse the JATS XML into something easier to work with for pulling out citations is on GitHub. In particular, lines 530 ff. are the bit that deals with implicit reference ranges. I should warn you that I wrote that XSL in my early days ;-)

Klortho commented 10 years ago

Quoting from @alexsdutton's tweet:

either 1–3 (good), or 13 (bad)

Note that the attribute name should be rid, not ref. Interestingly, the PMC tagging guidelines specify what you're calling the "bad" style.

We're aware there's a lot of variation in the way ranges of references are tagged in the article body, and it is a huge problem for us as well. Not sure what can be done about it, though.
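To make the problem concrete, here is a minimal Python sketch of what a consumer has to do to recover an implicit reference range. It is not the XSL mentioned above and not anything PMC prescribes. It assumes one particular tagging variant: two sibling xref elements with ref-type="bibr", separated only by a dash, whose rid values share a prefix and end in an integer (for example B1 and B3). The function name expand_reference_ranges, the helper split_rid, and the sample paragraph are all made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Dash characters assumed to mark a range between two citations
# (hyphen, en dash, em dash). Real articles may use others.
RANGE_DASHES = {"-", "\u2013", "\u2014"}


def split_rid(rid):
    """Split an rid such as 'B12' into ('B', 12); None if no numeric suffix."""
    match = re.fullmatch(r"(.*?)(\d+)", rid)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def expand_reference_ranges(paragraph_xml):
    """Return cited rids in document order, filling in implicit ranges."""
    root = ET.fromstring(paragraph_xml)
    cited = []

    def add(rid):
        if rid not in cited:
            cited.append(rid)

    for parent in root.iter():
        children = list(parent)
        for i, child in enumerate(children):
            if child.tag != "xref" or child.get("ref-type") != "bibr":
                continue
            rid = child.get("rid")
            if rid is None:
                continue
            add(rid)
            # A range is implied when only a dash separates this xref
            # from the xref that immediately follows it.
            tail = (child.tail or "").strip()
            nxt = children[i + 1] if i + 1 < len(children) else None
            if tail in RANGE_DASHES and nxt is not None and nxt.tag == "xref":
                start = split_rid(rid)
                end = split_rid(nxt.get("rid", ""))
                if start and end and start[0] == end[0] and start[1] < end[1]:
                    # Zero-padded ids (e.g. 'B01') are not preserved here.
                    for number in range(start[1] + 1, end[1]):
                        add(f"{start[0]}{number}")
    return cited


# Hypothetical paragraph: a range 1-3 plus a single citation 7.
sample = (
    '<p>As shown before <xref ref-type="bibr" rid="B1">1</xref>\u2013'
    '<xref ref-type="bibr" rid="B3">3</xref>, and later '
    '<xref ref-type="bibr" rid="B7">7</xref>.</p>'
)
print(expand_reference_ranges(sample))  # ['B1', 'B2', 'B3', 'B7']
```

Every assumption in this sketch (dash character, sibling position, numeric rid suffix) is a place where real articles differ, which is why a single heuristic like this does not generalize across publishers.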

Daniel-Mietchen commented 10 years ago

The slides are up at https://en.wikipedia.org/wiki/User:Daniel_Mietchen/Talks/JATS-Con_2014/Inconsistent_XML_as_a_Barrier_to_Reuse_of_Open_Access_Content and the video recording sits at http://videocast.nih.gov/summary.asp?Live=13961&start=15090&bhcp=1 . The talk was featured prominently in Jeff Beck's lightning talk on Day 2: http://videocast.nih.gov/summary.asp?Live=13963&start=11980&bhcp=1 .