Open axfelix opened 6 years ago
Am guessing it's an edge case around
https://github.com/MartinPaulEve/meTypeset/blob/master/bin/captionclassifier.py#L193
but not too sure what's happening here...
Thanks for this, Alex -- and for the minimal test case.
I'll take a look at the weekend!
M
On 16/01/18 20:42, axfelix wrote:
Getting invalid JATS, with plaintext that should be wrapped in a caption element, as the value of graphic, as below:
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="media/image1.jpeg" position="float" orientation="portrait" xlink:type="simple"/>Fig. 3. The structure of a multidimensional control system for ceramsite burning: EM – an electromechanical part; <graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="media/image2.wmf" position="float" orientation="portrait" xlink:type="simple"/>– a vector specifying exposure; D – a temperature sensor
From this doc: 1339-5501-1-LE.docx https://github.com/MartinPaulEve/meTypeset/files/1636721/1339-5501-1-LE.docx
-- Professor Martin Paul Eve Chair of Literature, Technology and Publishing Birkbeck, University of London
Hi Alex,
OK, so I've done some investigation of the problem here and have got this far:
<fig position="float" orientation="portrait"><graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="media/image1.jpeg" id="IDd73b995a-a3f3-4940-9d03-e8db274d85f9" position="float" orientation="portrait" xlink:type="simple"><label>Fig</label><caption><p>3 The structure of a multidimensional control system for ceramsite burning: EM – an electromechanical part;</p></caption></graphic><graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="media/image2.png" position="float" orientation="portrait" xlink:type="simple"/>– a vector specifying exposure; D – a temperature sensor</fig>
The problem here is that the caption contains an image. So, unfortunately, the caption is split into two tail blocks across two different elements.
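To make the split concrete, here is a minimal sketch using the standard library's `xml.etree.ElementTree` (an assumption made so the example is self-contained; it is not a claim about how meTypeset parses the document). The sample string is a trimmed stand-in for the output above, with the xlink attributes dropped and plain hyphens in place of the dashes, and `caption_fragments` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for the reported output (attributes trimmed for brevity).
SAMPLE = (
    '<fig>'
    '<graphic href="media/image1.jpeg"/>Fig. 3. The structure: EM - a part; '
    '<graphic href="media/image2.wmf"/>- a vector; D - a sensor'
    '</fig>'
)


def caption_fragments(fig):
    """Return the tail text that follows each <graphic> child of fig.

    Hypothetical helper for illustration only.
    """
    return [graphic.tail or "" for graphic in fig.findall("graphic")]


fig = ET.fromstring(SAMPLE)
fragments = caption_fragments(fig)

# The caption is not stored on <fig> itself: each piece hangs off the
# graphic that precedes it, so it has to be reassembled from the tails.
full_caption = "".join(fragments)
```

Neither fragment is a complete caption on its own, so any logic that only inspects the tail of the first graphic sees just the first half of the sentence.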
I'm not really sure that we can fix this; are images even allowed in image captions?
Any thoughts welcome.
Oh boy. It looks like there are technically valid ways to include rich media in captions (either through inline-graphic or alternatives), but ... it's not clear that's the intended behaviour in this or in any other case we'll see.
I'd be tempted to just insert </fig><fig> in the middle any time we see </graphic><graphic>, to be honest...
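A rough sketch of that splitting idea, again assuming the standard library's `xml.etree.ElementTree` rather than whatever meTypeset uses internally. `split_fig_at_graphics` is a hypothetical name, the sample is a trimmed stand-in for the real output, and putting the caption before the graphic is my reading of the usual JATS element order rather than something tested against the schema.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the problematic <fig> (xlink attributes omitted).
SAMPLE = (
    '<fig position="float">'
    '<graphic href="media/image1.jpeg"/>Fig. 3. The structure: EM - a part; '
    '<graphic href="media/image2.wmf"/>- a vector; D - a sensor'
    '</fig>'
)


def split_fig_at_graphics(fig):
    """Return one new <fig> per <graphic>, using each graphic's tail as caption.

    Hypothetical helper, not part of meTypeset.
    """
    figs = []
    for graphic in fig.findall("graphic"):
        new_fig = ET.Element("fig", dict(fig.attrib))
        tail = (graphic.tail or "").strip()
        if tail:
            # Wrap the loose text in <caption><p>...</p></caption>.
            caption = ET.SubElement(new_fig, "caption")
            ET.SubElement(caption, "p").text = tail
        # Copy the graphic without its tail so no plaintext is left dangling.
        ET.SubElement(new_fig, "graphic", dict(graphic.attrib))
        figs.append(new_fig)
    return figs


figs = split_fig_at_graphics(ET.fromstring(SAMPLE))
for new_fig in figs:
    print(ET.tostring(new_fig, encoding="unicode"))
```

The trade-off is visible in the result: no plaintext is left outside a caption, but the single caption from the source document ends up divided between two figures, which may not be what the author intended.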
another example, should be slightly less problematic to fix
(not sure why we're seeing more of these lately)
Getting invalid JATS, with plaintext that should be wrapped in a caption element, as the value of graphic, as below:
<fig position="float" orientation="portrait"><graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="media/image1.jpeg" position="float" orientation="portrait" xlink:type="simple"/>Fig. 3. The structure of a multidimensional control system for ceramsite burning: EM – an electromechanical part; <graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="media/image2.wmf" position="float" orientation="portrait" xlink:type="simple"/>– a vector specifying exposure; D – a temperature sensor</fig>
From this doc: 1339-5501-1-LE.docx