sarabander / sicp

HTML5/EPUB3 version of SICP
http://sarabander.github.io/sicp/

SVG images don't render if ePub converted to Mobi #11

Closed k4rtik closed 2 years ago

k4rtik commented 9 years ago

I am trying to read the book using my Kindle Paperwhite (which doesn't support ePub), but the converted book just shows SVG in place of each image in the book.

I have tried experimenting a lot with different conversion tools -- Calibre, kindlegen, Kindle Previewer -- but to no avail.

With kindlegen I figured out that this is because the <object> tag is not supported by it:

Warning(inputpreprocessor):W29007: Rejected unknown tag: <object data="html/fig/coverpage.std.svg" type="image/svg+xml">
      in file: /var/folders/j2/3hbhcfdd2d9b2dmvcyq4g9040000gn/T/mobi-6Ttufx/index.xhtml     line: 0000060

From Amazon Kindle Publishing Guidelines:

3.6.11 Image Guideline #11: Use Supported SVG Tags and Elements

A publisher can reference the SVG files from within an HTML file using inline <svg>, <img>, <embed>, or <object> tags. Please refer to the SVG specification http://www.w3.org/TR/SVG/ for details about SVG.

Example

<html>
<body>
<svg xmlns="http://www.w3.org/2000/svg"><!-- Inline SVG --></svg>
<img src="svgfile1.svg"/>
<embed src="svgfile2.svg"/>
<object src="svgfile3.svg"/>
</body>
</html>

According to the last example, the problem might simply be that the book uses the data attribute of <object> instead of src.

I am no expert but I am experimenting using Sigil to convert the tags to one of these. If I succeed, I will provide a patch.

sarabander commented 9 years ago

Good to hear about your conversion experiments. I quickly tried src in place of data -- the image disappeared in Chrome and Firefox. I vaguely remember that the <object data=...> syntax was the only one that worked when I tried to make the image dimensions proportional to the body text's font size.

The solution could be a batch-substitution of all the data= attributes to src= before Kindle conversion.
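
The batch substitution suggested here could look roughly like the Python sketch below. It assumes the <object> tags resemble the one in the kindlegen warning above, and that it is run on a throwaway copy of the html/ directory. The names data_to_src and convert_tree are hypothetical, and whether the converters actually accept src on <object> is not confirmed by this thread.

```python
import re
from pathlib import Path

# Matches the data attribute inside an opening <object ...> tag only;
# other elements and attributes such as data-foo are left alone.
OBJECT_DATA = re.compile(r'(<object\b[^>]*?\s)data(\s*=)', re.IGNORECASE)

def data_to_src(xhtml: str) -> str:
    """Rename data= to src= in <object> tags (hypothetical helper)."""
    return OBJECT_DATA.sub(r'\1src\2', xhtml)

def convert_tree(root):
    """Rewrite every .xhtml file under root in place; return the changed paths."""
    changed = []
    for path in sorted(Path(root).rglob("*.xhtml")):
        old = path.read_text(encoding="utf-8")
        new = data_to_src(old)
        if new != old:
            path.write_text(new, encoding="utf-8")
            changed.append(path)
    return changed
```

Calling convert_tree on a copy of the html/ directory would leave the originals, which browsers need with data=, untouched.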

k4rtik commented 9 years ago

Yeah, I think there is a typo in Amazon's documentation. W3C docs also say it's data not src.

The <img> tag seems to work fine. I still need to convert to KF8 format and test on the physical device.

k4rtik commented 9 years ago

It turns out the problem is not the type of tag used but the paths of the images. Apparently the conversion programs do not keep the folder structure intact.

Could you please share how you generate the ePub edition from the HTML source available in this repo? It might help me understand what is going wrong, because when I try to create an ePub from source using Sigil, I face the same issue as with conversion to Mobi: all images are lost.

sarabander commented 9 years ago

You could first check that the image files are actually included in the zip archive when you make epub with Sigil or convert to Mobi. If they are, then there's a referencing problem with image links or omissions in the manifest.
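
A minimal sketch of that check in Python, under some assumptions: the package file is named content.opf and sits at the archive root (as the Makefile excerpt in this comment suggests; strictly, its location is given by META-INF/container.xml), and manifest hrefs are not percent-encoded. The function name missing_images is hypothetical.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

OPF_NS = "{http://www.idpf.org/2007/opf}"  # namespace of EPUB package documents

def missing_images(epub_path, opf_name="content.opf", suffix=".svg"):
    """Return (files in the archive but not in the manifest,
               files in the manifest but not in the archive)."""
    with zipfile.ZipFile(epub_path) as epub:
        members = {n for n in epub.namelist() if n.endswith(suffix)}
        root = ET.fromstring(epub.read(opf_name))
    base = posixpath.dirname(opf_name)  # hrefs are relative to the OPF file
    listed = {
        posixpath.normpath(posixpath.join(base, item.get("href")))
        for item in root.iter(OPF_NS + "item")
        if item.get("href", "").endswith(suffix)
    }
    return sorted(members - listed), sorted(listed - members)
```

Two empty lists would mean the SVG files and the manifest agree, which points the search toward the image links inside the XHTML files instead.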

In this repo, the Makefile shows which epub generation scripts are called. I will list the essential lines here, without substitution and cleanup tasks, and ignoring all the lines that pertain to Texinfo -> HTML conversion:

DIR = html/
GOAL = ../sicp.epub             # the end product of compilation
NEXUS = $(DIR)index.xhtml       # the central file with table of contents
META = content.opf toc.xhtml    # epub metafiles generated from NEXUS
HTML = $(DIR)*.xhtml            # all the HTML files of the book
FIG = $(DIR)fig/*/*.svg         # SVG diagrams
CSS = $(DIR)css/*.css           # style files
FONT = $(DIR)css/fonts/*        # WOFF fonts
COVER = index.in.xhtml $(DIR)fig/coverpage.std.svg $(DIR)fig/bookwheel.jpg
THUMB = $(DIR)fig/cover.png     # thumbnail cover image

$(META): $(NEXUS) create_metafiles.rb 
    ./create_metafiles.rb

$(THUMB): $(COVER)
    @inkscape -b "#fbfbfb" -C -e $(THUMB) -f $(DIR)fig/coverpage.std.svg > /dev/null

$(GOAL): $(META) $(THUMB) $(FIG) $(CSS) $(FONT) mimetype META-INF/* LICENSE
    zip -0Xq $(GOAL) mimetype; \
    cp index.in.xhtml index.xhtml; \
    zip -Xr9Dq $(GOAL) $(META) $(HTML) META-INF/* LICENSE \
      index.xhtml $(DIR)css/* $(DIR)fig/*

create_metafiles.rb creates content.opf and toc.xhtml by traversing the nexus. The manifest in content.opf lists all the files that an epub reader or converter should be aware of, including SVG images. If an HTML file contains any links to SVG images, its manifest entry is marked with the "svg" property. The same goes for MathML.

The lines with zip first add mimetype as an uncompressed file (-0), then compress (-9) all the other files recursively (-r), without extra file and directory attributes (-XD), keeping quiet about it (-q).
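
For readers without the zip tool, the same two-step packaging can be sketched with Python's zipfile module. This is an illustration and not the repo's build: the function name build_epub is hypothetical, and it simplifies by archiving every file under one directory instead of the explicit file lists the Makefile passes to zip.

```python
import zipfile
from pathlib import Path

def build_epub(goal, root, mimetype="mimetype"):
    """Package the tree under root into an epub at goal (hypothetical helper)."""
    root = Path(root)
    goal = Path(goal).resolve()
    with zipfile.ZipFile(goal, "w") as epub:
        # Step 1: mimetype goes in first and uncompressed, like `zip -0X`.
        epub.write(root / mimetype, mimetype, compress_type=zipfile.ZIP_STORED)
        # Step 2: everything else, deflated at level 9, like `zip -r9`.
        # Only regular files are written, so no directory entries appear (-D).
        for path in sorted(root.rglob("*")):
            arcname = path.relative_to(root).as_posix()
            if not path.is_file() or arcname == mimetype:
                continue
            if path.resolve() == goal:  # never add the archive to itself
                continue
            epub.write(path, arcname,
                       compress_type=zipfile.ZIP_DEFLATED, compresslevel=9)
    return goal
```

The ordering matters because epub readers and epubcheck expect mimetype to be the first entry in the archive and to be stored without compression.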

Any inconsistencies in the epub file will be reported by epubcheck.

The full conversion process is fairly complicated in the Makefile due to replacing some artefacts and fixing other issues. But the epub file creation is rather straightforward in the end. I hope this has cleared things up a bit. Looking forward to hearing more about your Kindle conversion struggles :-)

k4rtik commented 9 years ago

Thanks, this should help a lot. I will try again when I reach home.

k4rtik commented 9 years ago

> You could first check that the image files are actually included in the zip archive when you make epub with Sigil or convert to Mobi. If they are, then there's a referencing problem with image links or omissions in the manifest.

It seems they are, as I see the rasterization messages like the following in Calibre job details (Mobi/KF8 format):

Rasterizing SVG images...
Rasterizing u'html/fig/icons/cc.std.svg' to 57x57
Rasterizing u'html/fig/icons/by.std.svg' to 57x57
Rasterizing u'html/fig/icons/nc.std.svg' to 57x57
Rasterizing u'html/fig/icons/sa.std.svg' to 57x57

Also, the output Mobi file is quite large (~17 MB) compared to the ePub, so we can be reasonably sure that the images are being included.

But I also see a lot of the following errors for the same job:

Could not add child element to parent element because the types are incorrect.

There are a total of 103 such errors and 89 lines of rasterization output.

Unsure where we stand yet as the error seems generic (and I can't hack open the mobi file), but there is some progress. :)

Pasted complete log here, if you would like to take a look.

sarabander commented 9 years ago

Yes, the error message is cryptic. Something weird happens between converting figures 3.20 and 3.21. The errors jump out in the middle of outputting "528x528", as if it's coming from another thread. And why are these images reported as squares? They are clearly rectangular. Very strange...

k4rtik commented 9 years ago

No, that's coming from another thread. It breaks at different places each time I convert.

Another concern is why it rasterizes in the first place, since KF8 supports SVG. I think I will need to report this to the Calibre developers.

You mentioned there could be "a referencing problem with image links or omissions in the manifest". What did you suspect?

sarabander commented 9 years ago

Sure, it would be good to avoid the conversion to bitmaps; these are big and ugly.

The referencing problem is just speculation; I don't know the Mobi or KF8 standards. Do they even have something resembling an item manifest? What paths are the image links using? It would help to find a way to peek inside the container files.

k4rtik commented 9 years ago

I think I have partially figured out the problem.

When I open the primary sicp.epub in Sigil, I see that it tries to gather all images in a single directory called 'Images', leaving all hrefs to the SVGs broken. When I correct the paths, I get the right ePub generated using Sigil.

But the rest of the problem remains the same -- when I use this secondary ePub as the source for either kindlegen or Calibre, the generated Mobi still shows 'SVG' text instead of images. :(

I tried looking at the output of strings sicp.mobi, and I see correct paths similar to 'Images/Fig1.2.std.svg'. Unsure what goes wrong during the conversion.
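
Broken references of the kind described in this comment can be listed mechanically. The sketch below is an assumption-laden illustration: it scans the XHTML files of an ePub with a regular expression (not a real XML parser), looks only at double-quoted data, src and href attributes ending in .svg, and uses a hypothetical function name, broken_svg_refs. It works on ePub archives only; it cannot look inside a Mobi file.

```python
import posixpath
import re
import zipfile

# Double-quoted data=, src= or href= values that end in .svg
SVG_REF = re.compile(r'(?:data|src|href)\s*=\s*"([^"#]+\.svg)"')

def broken_svg_refs(epub_path):
    """Return sorted (document, reference) pairs whose target is not in the archive."""
    broken = []
    with zipfile.ZipFile(epub_path) as epub:
        names = set(epub.namelist())
        for name in names:
            if not name.endswith((".xhtml", ".html")):
                continue
            text = epub.read(name).decode("utf-8", errors="replace")
            base = posixpath.dirname(name)  # references are relative to the document
            for ref in SVG_REF.findall(text):
                if "://" in ref:  # skip absolute URLs
                    continue
                target = posixpath.normpath(posixpath.join(base, ref))
                if target not in names:
                    broken.append((name, ref))
    return sorted(broken)
```

An empty result on the Sigil-corrected ePub would confirm that the remaining problem lies in the Mobi conversion step and not in the paths.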

tuxu commented 7 years ago

I had the same problem and finally decided to convert all equations to SVG (first using Mathjax, then using LaTeX + dvisvgm for better quality). These are embedded with proper baseline alignment and em-based scaling, so they scale with the font size.

Also, Kindle doesn't understand SVGs embedded in object tags but renders them just fine as imgs. I made a bunch more changes for e-ink readers, such as increasing the contrast, and put the version up in this branch: sicp/svgmath. Compiled EPUB3 and Mobi versions here.
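
As an illustration of the object-to-img change described here (not the script used for the svgmath branch, which this thread does not show), a regex-based Python sketch follows. It assumes non-nested <object>...</object> elements with a double-quoted data attribute ending in .svg, and it turns any fallback text inside the element into the alt attribute. The name object_to_img is hypothetical.

```python
import re

# An <object> element whose data attribute points at an .svg file.
OBJECT_SVG = re.compile(
    r'<object\b[^>]*?\bdata\s*=\s*"([^"]+\.svg)"[^>]*>(.*?)</object>',
    re.IGNORECASE | re.DOTALL,
)

def object_to_img(xhtml: str) -> str:
    """Replace SVG <object> elements with <img> elements (hypothetical helper)."""
    def repl(match):
        src, fallback = match.group(1), match.group(2)
        # Strip any markup from the fallback content and reuse it as alt text.
        alt = re.sub(r"<[^>]+>", "", fallback).strip().replace('"', "&quot;")
        return f'<img src="{src}" alt="{alt}"/>'
    return OBJECT_SVG.sub(repl, xhtml)
```

Self-closing <object/> tags and nested objects are left unchanged by this pattern. Any sizing that relied on the <object> element would also need to be redone for <img>, for example with em-based widths in CSS.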

segmentationfaulter commented 7 years ago

I am not applying any conversion, just trying to read the book using the Google Play Books app, but the SVGs are not rendering and I only see the text SVG as a placeholder.

ghost commented 7 years ago

@tuxu Wow, reeeeealy appreciate it, man. I've been trying to convert the epub to a lot of things, tried a lot of stuff inside the HTML, but nothing worked. I took your epub version, converted it to mobi and everything appears perfectly on my Kindle Paperwhite 3. Thank you very much for your work!

tuxu commented 7 years ago

@Teo-ti Thank you, I'm glad you like it! :) Actually, there's a Mobi version already prepared for the Paperwhite and others.

k4rtik commented 2 years ago

Kindle is moving away from the Mobi format and toward supporting ePub instead, so I am closing this.