Open Melissa37 opened 5 years ago
@Maelplaine I cannot assign this ticket to anyone - it should be assigned to Fred and me.
Linked to #200
Thanks!
PMC requires .tiff format files: "Uncompressed high-resolution TIFF or EPS files are required for all images."
Examples of crosslink tagging in the system do not include a file type suffix:
eg:
<graphic id="gra2" xlink:href="pnas.1207965110fig01"/>
HighWire: HWX Specification (tiff format): Color images – 300 dpi, Gray scale images – 600 dpi, Line art – 1200 dpi
Examples of crosslink tagging in the system do not include a file type suffix:
eg:
<graphic id="gra2" xlink:href="pnas.1207965110fig01"/>
Could just be that the filename doesn't have the extension? Reading https://www.ncbi.nlm.nih.gov/pmc/pmcdoc/tagging-guidelines/article/tags.html#el-graphic it says "The name of the file"; on other elements it's "Include the full filename, including file extension, in the @xlink:href value".
Except:
<related-article>
[...] When specifying a DOI, tag the DOI value in @xlink:href and specify @ext-link-type="doi".
So that's where that came from. Which is crazy, as it's violating the XLink spec?
I wonder whether it's because a publisher can keep one XML source of truth if they don't add the file extension to figures. We have to send .tiffs to everyone, or eps...but internal systems then convert them to .jpegs to display on the web.
I wonder why we cannot allow non-file extension figure file references in Libero when every other system we've worked with has not had a problem with this? Also, publishers will need to output as .tiff probably from production so you won't get input with .jpeg extensions as that's not what Libero is going to get...
You cannot download XML from the PMC site directly, but it would be interesting if someone has access to the corpus via the API to see whether PMC add file extensions to the XML they "process".
I would suggest that if you add .jpeg to the XML, this needs to be throwaway XML only for the purpose of the site, which is not then delivered anywhere else or used as the archive version. The "source of truth" needs to be exactly what the publisher sent, because the Libero display/internal needs do not match all requirements in a full workflow.
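To make the "throwaway XML" idea concrete, here is a minimal sketch in Python. It assumes the derived copy is built in memory from the publisher's XML and that only `<graphic>` elements need the suffix. The function name `with_display_extension` and the `KNOWN_EXTENSIONS` list are hypothetical and not part of Libero. A fixed list of known extensions is used because hrefs such as pnas.1207965110fig01 already contain a dot, so a plain "has a dot" check would wrongly treat them as having an extension.

```python
import copy
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
HREF = f"{{{XLINK_NS}}}href"
ET.register_namespace("xlink", XLINK_NS)

# Assumption: the image extensions we expect to see; adjust as needed.
KNOWN_EXTENSIONS = (".tif", ".tiff", ".eps", ".jpg", ".jpeg", ".png", ".svg", ".gif")


def with_display_extension(source_root, extension=".jpeg"):
    """Return a derived copy of the tree with `extension` appended to
    extensionless graphic hrefs. The source tree is left untouched."""
    derived = copy.deepcopy(source_root)
    for graphic in derived.iter("graphic"):
        href = graphic.get(HREF)
        if href and not href.lower().endswith(KNOWN_EXTENSIONS):
            graphic.set(HREF, href + extension)
    return derived


# Publisher XML as received (the "source of truth").
source = ET.fromstring(
    '<fig xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<graphic id="gra2" xlink:href="pnas.1207965110fig01"/>'
    "</fig>"
)

# Display-only copy for the site; never archived or delivered downstream.
derived = with_display_extension(source)
```

Only `derived` would be handed to the site; `source` stays byte-for-byte what the publisher sent and remains the version that is archived or delivered elsewhere.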
Hindawi XML might clarify this a little: JCNC_6826984_Final_1
It looks like they include various formats of the same image (eps, jpg, svg) in the folder, and refer to it without filename extension (e.g. xlink:href="6826984.fig.002"), so the actual file being used is deliberately ambiguous, and the extension is probably picked out depending on how the XML is transformed - presumably in the Hindawi case the eps is used for the PDF, the jpg for online, and SVG is included for archival purposes.
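The extension-picking behaviour guessed at above could look roughly like the following sketch. It is not Hindawi's or Libero's actual logic. The `PREFERRED` mapping (which formats each output prefers, and in what order) and the function name `resolve_graphic` are assumptions for illustration only.

```python
# Assumption: preferred extensions per output target, in priority order.
PREFERRED = {
    "pdf": (".eps", ".tif", ".tiff"),
    "web": (".jpg", ".jpeg", ".png"),
    "archive": (".svg", ".eps"),
}


def resolve_graphic(href, available_files, target):
    """Map an extensionless href to a concrete file for the given target.

    Returns the first candidate present in `available_files`, or None
    if no file with a preferred extension exists.
    """
    for ext in PREFERRED[target]:
        candidate = href + ext
        if candidate in available_files:
            return candidate
    return None


# Files delivered alongside the XML, as in the Hindawi package.
files = {"6826984.fig.002.eps", "6826984.fig.002.jpg", "6826984.fig.002.svg"}

web_file = resolve_graphic("6826984.fig.002", files, "web")
pdf_file = resolve_graphic("6826984.fig.002", files, "pdf")
```

With this approach the XML itself never needs a file extension; the consumer decides which rendition to use at transform time.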
Problem / Motivation
Libero Publisher requires xref links to figure files within the XML coming from a publisher to contain .jpeg file extensions in order to load the content to the site.
Proposed solution
Cannot be decided until more investigation happens
Tasks
Production to investigate
Clarification needed and assumptions
Technical notes
User interface / Wireframes
@BlueReZZ @thewilkybarkid @GiancarloFusiello @FAtherden-eLife