data URIs are not included in the results

GoogleCodeExporter commented 9 years ago

What steps will reproduce the problem?
1. Run HTML-to-EPUB3 with "Accessible Publishing - Best Practice Guidelines for Publishers" as input (http://www.editeur.org/files/Collaborations/Accessibility/WIPO_v3.html)

What is the expected output? What do you see instead?
This image appears in the input document:
<img id="editeur_logo" style="width: 13em; margin-top: 3.9em" alt="logo of EDItEUR" src="data:image/gif;base64,R0...ADs="/>
In the output document, this is the result:
<span id="editeur_logo" style="width: 13em; margin-top: 3.9em">logo of EDItEUR</span>

Since it was replaced with a span element, I assume this is an issue with 
html-to-epub3.

We may also want to implement some utility steps for data URIs. For instance:
 * px:html-extract-data-uris (inputs: in-memory.in and fileset.in; outputs: in-memory.out and fileset.out): all data URIs in the HTML fileset would be moved to in-memory binary documents and the references to them updated accordingly (this would modify both HTML and CSS files)
 * px:html-inline-data-uris: the reverse of px:html-extract-data-uris; might be useful for scripts where HTML is the output?
 * px:file-data-uri-as-document (options: data-uri; outputs: result): would create an XML representation of the file in the data URI
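
As a rough illustration of what px:html-extract-data-uris might do, here is a minimal sketch in Python rather than XProc. The function names parse_data_uri and extract_data_uris and the generated file names are hypothetical, and only src attributes are handled (CSS url(...) references are left out); an actual step would work on the fileset and in-memory documents instead of a string.

```python
import base64
import re
from urllib.parse import unquote_to_bytes

# Hypothetical helpers for illustration only; not part of DAISY Pipeline.
DATA_URI = re.compile(r"^data:(?P<meta>[^,]*),(?P<payload>.*)$", re.DOTALL)
SRC_DATA_URI = re.compile(r'(\bsrc=")(data:[^"]*)(")')


def parse_data_uri(uri):
    """Split a data URI (RFC 2397) into (media type, decoded bytes)."""
    match = DATA_URI.match(uri)
    if match is None:
        raise ValueError("not a data URI")
    meta = match.group("meta").split(";")
    media_type = meta[0] or "text/plain"  # default when no type is given
    payload = match.group("payload")
    if "base64" in meta[1:]:
        data = base64.b64decode(payload)
    else:
        data = unquote_to_bytes(payload)  # percent-encoded payload
    return media_type, data


def extract_data_uris(html, prefix="image"):
    """Replace src="data:..." values with generated file names.

    Returns the rewritten HTML and a dict mapping each generated
    file name to the decoded bytes of the corresponding data URI.
    """
    files = {}

    def replace(match):
        media_type, data = parse_data_uri(match.group(2))
        # Derive an extension from the subtype, e.g. image/svg+xml -> svg.
        ext = media_type.split("/")[-1].split("+")[0]
        name = f"{prefix}-{len(files) + 1}.{ext}"
        files[name] = data
        return f"{match.group(1)}{name}{match.group(3)}"

    return SRC_DATA_URI.sub(replace, html), files
```

The returned dict plays the role of the in-memory binary documents, and the rewritten string plays the role of the updated HTML document.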

Original issue reported on code.google.com by josteinaj@gmail.com on 17 Jul 2013 at 11:47

GoogleCodeExporter commented 9 years ago

See issue 348:
https://code.google.com/p/daisy-pipeline/issues/detail?id=348

This is now supported.

Can you cross-check with the latest sources and mark the issue as invalid if 
you're happy?

Original comment by rdeltour@gmail.com on 17 Jul 2013 at 12:03

Changed state: Reviewed

GoogleCodeExporter commented 9 years ago

It does not work with the latest sources. The error is:

2013-07-17 14:09:37,730 [INFO ] com.xmlcalabash.runtime.XAtomicStep - 
bundle://24.0:1/xml/xproc/html-to-epub3.convert.xpl:49:21:[WARNING] The type of 
image 'data:image/jpg;base64,/9j/4AAQS(...)' is not a core EPUB media type. 
Replacing by alternative text.

Are we sure that all EPUB readers support data URIs or should there be a 
compatibility mode which stores all of them as separate files?

Original comment by josteinaj@gmail.com on 17 Jul 2013 at 12:10

GoogleCodeExporter commented 9 years ago

Mmm. I just tried from oXygen and it worked. I'll try from the dev-launcher to 
confirm.

Not all readers support it, but it's valid EPUB. For now I suggest leaving it 
as-is, since this is legal. It would be a good idea to add a "compatibility 
layer", or maybe integrate it into the long-planned "EPUB Fixer" script.

Original comment by rdeltour@gmail.com on 17 Jul 2013 at 12:22

GoogleCodeExporter commented 9 years ago

Ok, sure.

Yeah I used the dev-launcher and the webui to test it.

Do we have a list somewhere of things to add to the EPUB Fixer script?

Original comment by josteinaj@gmail.com on 17 Jul 2013 at 12:26

GoogleCodeExporter commented 9 years ago

Mmm, I tried with the dev-launcher and it didn't work at first; then I re-ran 
mvn install on the html-to-epub3 script and it worked.
You may need to clean the Felix cache.

I don't think we have a list for the EPUB Fixer.

Original comment by rdeltour@gmail.com on 17 Jul 2013 at 12:31

GoogleCodeExporter commented 9 years ago

Added html-to-epub3 module to the scripts pom.xml.

Script works fine now.

If we need data URI utility steps in the future we can open separate issues.

Thanks for the help.

Original comment by josteinaj@gmail.com on 17 Jul 2013 at 1:55

Changed state: Invalid
