stencila / encoda

↔️ A format converter for Stencila documents
https://stencila.github.io/encoda/
Apache License 2.0
35 stars 9 forks source link

ImageObject: contentUrl and `.media` path #966

Open rgieseke opened 3 years ago

rgieseke commented 3 years ago

After upgrading to 0.117.4 the contentUrl paths of ImageObjects don't contain the sibling path anymore.

They are still copied to the sibling folder (https://github.com/stencila/encoda/blob/0a0ed52d04fa62b26eb2362098bb6d6f37f1872e/src/index.ts#L350). Is this change on purpose (one could always map this i guess)?

rgieseke commented 3 years ago

I was wrong, the file is not copied anymore.

E.g. a JATS XML file

<?xml version="1.0"?>
<article>
<front>
<article-meta>
<contrib-group />
</article-meta>
</front>
<body>
<fig id="fig1">
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="test.jpg"
id="g1" mimetype="image/jpeg" />
</fig>
</body>
<back>
</back>
</article>

Running

node dist convert test/test.xml --from jats test/out/test.json

yields the following structure: test ├── out │ └── test.json ├── test.jpg └── test.xml

With 0.115.4 running

node dist/cli convert test/test.xml --from jats test/out/test.json

gave

test ├── out │   ├── test.json │   └── test.json.media │   └── test.jpg ├── test.jpg └── test.xml

alex-ketch commented 3 years ago

Thanks once again for the bug report and the investigative notes @rgieseke, very much appreciated 💖

After looking into this myself, it looks like Encoda's convert function isn't being called when invoked via the CLI, instead it only defers conversion to Jesta. Jesta on the other doesn't know how to handle conversion :(

Our unit tests didn't catch the regression because they mostly bypass this codepath, and instead test the individual codec's convert function.

Unfortunately, I think the best bet for you right now is to downgrade to v0.116.1 while we fix the issues (and add tests to make sure this doesn't happen in the future).

Meanwhile let me know if I can help with anything else!

rgieseke commented 3 years ago

Thanks for the explanation! Not sure whether there are other places where this might be tested but in the JATS tests contentUrl was removed here: https://github.com/stencila/encoda/commit/227e1006c0232217f5e71bcc4f5229a3baef84a6#diff-536d58ef076a53c2201d64029419727b9318c2408b7d78dbbf4168bcbb98b3f1

Not sure how exactly the tmp paths there were handled, maybe it's not a problem anymore and are relative now?