PreTeXtBook / pretext

PreTeXt: an authoring and publishing system for scholarly documents
https://pretextbook.org

EPUB: automate cover image #1599

Closed Alex-Jordan closed 2 years ago

Alex-Jordan commented 2 years ago

If you are making EPUB and the publication file has a cover image, no problem.

If it does not have a cover image, now that is also not a problem. With this commit, pretext/pretext will recognize there is no cover image. Then it makes the default KOMA-Script title page as tex -> pdf -> png, and the png goes in as the cover image.

Cover image will display title, subtitle (if present), and list all authors.

Alex-Jordan commented 2 years ago

I just noticed this does not handle math in the title or subtitle. The "m" template in -epub is getting in the way, preventing any math in the book title from being written in the usual way for the KOMA-Script title page. Any tips on a good way to deal with that?

Alex-Jordan commented 2 years ago

I may just make a special template for the title and subtitle just for this purpose of making the cover image.

rbeezer commented 2 years ago

I really like the automatically-generated cover page that includes book-specific info!

However, plenty of folks say they don't care about print, and so as use broadens we may not want to assume everybody has LaTeX. I'd rather not have a newcomer try to build EPUB and get "pdflatex not found" as an error message. A key principle is that folks can start quickly and get "good" output with little messing around and as few prerequisites as possible, and it is a bonus if they can see what they want to change and figure out how.

Can we

Alex-Jordan commented 2 years ago

Roger that. I was having difficulty anyway getting support for all of the legal children of "title" as latex-appropriate output with all the html/epub templates.

I can still make a cover with book-specific info that does not use LaTeX. I can get python to make a png that shows the title, subtitle, and author. It will not be as pretty as the curated KOMA title page of course. And it will probably only have text() children of the title; math and other things in a title would be hard.

So if it's between a generic simple image and an imperfect png like described above, do you still prefer the generic simple image?

rbeezer commented 2 years ago

So if it's between a generic simple image and an imperfect png like described above, do you still prefer the generic simple image?

I prefer the no-LaTeX-prerequisite. Math (and other stuff) in a book title seems like a bad idea to me. Not that it doesn't happen. But for a no-effort cover, or for a newbie, something possibly imperfect, but totally easy/transparent, seems the best route.

You could save the automatic KOMA-Script cover for another PR if you'd like to get this one out of your hair.

I'm curious to see Python make a PNG cover!

Alex-Jordan commented 2 years ago

Force pushed.

Now when the publisher does not provide a cover image file, python/python creates one as a png image. It has title, subtitle, author. Some aspects are less than ideal, as described in the comment block where the image is being constructed.

Here is an example:

https://spot.pcc.edu/~ajordan/temp/supplement111-112.epub

One possible concern is that I needed to pick fonts and I went with "Arial.ttf" and "Arial Bold.ttf". I couldn't tell where to look for fonts that I could be pretty sure are present by default on all/most systems.

rbeezer commented 2 years ago

:-(

PTX: Generating cover image
Traceback (most recent call last):
  File "/home/rob/mathbook/mathbook/pretext/pretext", line 438, in <module>
    main()
  File "/home/rob/mathbook/mathbook/pretext/pretext", line 410, in main
    ptx.epub(xml_source, publisher_file, out_file, dest_dir, 'svg', stringparams)
  File "/home/rob/mathbook/mathbook/pretext/pretext.py", line 1544, in epub
    title_font = ImageFont.truetype("Arial Bold.ttf", title_size)
  File "/usr/lib/python3/dist-packages/PIL/ImageFont.py", line 642, in truetype
    return freetype(font)
  File "/usr/lib/python3/dist-packages/PIL/ImageFont.py", line 639, in freetype
    return FreeTypeFont(font, size, index, encoding, layout_engine)
  File "/usr/lib/python3/dist-packages/PIL/ImageFont.py", line 187, in __init__
    self.font = core.getfont(
OSError: cannot open resource

Is there some more generic approach?

rbeezer commented 2 years ago

No errors with

ImageFont.load_default()

hacked in, but I also got a blank cover. Maybe there are some hints here: https://www.geeksforgeeks.org/python-pil-imagefont-load_default/

Alex-Jordan commented 2 years ago

That's the error I got when it couldn't find a font. So it can't find "Arial Bold.ttf" on your system.

The problem with load_default() is that as far as I could tell, you have no control over font size. Using the 1600x2560 png that we are using, it comes out too small. It's only that big because of something that I saw in the guide. Maybe a smaller image with the same ratio will work. Or maybe it's possible to identify a font that is findable by default on more systems.
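
A minimal sketch of one way to approach the "font that is findable on more systems" question: try a short list of TrueType files and only then fall back to ImageFont.load_default(). The helper names load_first_font and cover_font are hypothetical, and the candidate file names other than "Arial Bold.ttf" are guesses at fonts often present on Linux systems, not anything confirmed in this thread.

```python
def load_first_font(candidates, size, loader):
    """Return the first font that `loader` can open, or None if all fail."""
    for name in candidates:
        try:
            return loader(name, size)
        except OSError:
            # Pillow raises OSError ("cannot open resource") for a missing font.
            continue
    return None


def cover_font(size):
    """Try a few TrueType fonts, then fall back to Pillow's built-in font."""
    try:
        from PIL import ImageFont
    except ImportError:
        return None  # Pillow itself is not installed
    # Assumption: these file names are only guesses; none is guaranteed to exist.
    candidates = ["Arial Bold.ttf", "DejaVuSans-Bold.ttf", "LiberationSans-Bold.ttf"]
    font = load_first_font(candidates, size, ImageFont.truetype)
    return font if font is not None else ImageFont.load_default()
```

Passing the loader in as an argument keeps the search logic separate from Pillow, so it can be exercised on a machine with no fonts (or no Pillow) at all.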

Alex-Jordan commented 2 years ago

Chasing rabbits.

I was wrong to say load_default() makes it come out too small. The issue is that the font is a bitmap font, and the text we send to print really has to be a str. But title, etc., are of type 'lxml.etree._ElementUnicodeResult', which apparently is not automatically converted to str. OK, so I added code to convert them to str. But this breaks with the local PCC project, because there is an en dash in the title, and it seems that is not covered by the default bitmap font.
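
A minimal sketch of the str conversion plus a conservative character cleanup, using only the standard library. It assumes the bitmap font can be trusted with plain ASCII and nothing else, which is stricter than may be necessary; the thread does not establish exactly which characters that font covers. The name ascii_safe and the PUNCTUATION table are hypothetical.

```python
import unicodedata

# Assumption: a small, hand-picked table of punctuation with no ASCII decomposition.
PUNCTUATION = {
    "\u2013": "-",   # en dash
    "\u2014": "-",   # em dash
    "\u2018": "'",   # left single quote
    "\u2019": "'",   # right single quote
    "\u201c": '"',   # left double quote
    "\u201d": '"',   # right double quote
}


def ascii_safe(text):
    """Coerce to a plain str and reduce it to ASCII characters only."""
    text = str(text)  # e.g. an lxml _ElementUnicodeResult becomes a plain str
    for fancy, plain in PUNCTUATION.items():
        text = text.replace(fancy, plain)
    # NFKD splits an accented letter into base letter + combining accent;
    # encoding to ASCII with "ignore" then drops the accent.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")
```

With this, a title such as "Math 111–112" would come out as "Math 111-112", and an accented author name loses its accents rather than raising an error.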

So I change to use title-filesafe. I have to drop subtitle, at least for now, since there is nothing right now that will smash that into filesafe too. And I'm pretty sure this will fail for an author name that has accents. Here is how the book looks in the Apple Books library. (The text isn't actually blurry, it's just a low res zoomed out shot.)

[Screenshot: the book's cover in the Apple Books library]

So the text is way too small. I reduce the image dimensions to 450x720 and it looks like:

[Screenshot: the cover with image dimensions reduced to 450x720]

(If I reduce much more then the repeated cover image on the real "cover page", page 2, starts to take up less space than the full page.)

So...

(a) use this really bad image that will fail for some users
(b) rely on some font being always available and findable
(c) have the publisher declare a font (with path) in the absence of their own cover image
(d) back to the idea of a generic "PreTeXt EPUB cover image"

Alex-Jordan commented 2 years ago

Or maybe (e) is best: just omit the cover image entirely. If Apple Books is representative, an e-reader will see title and author metadata and create its own cover image based on that.

Option (e) still requires code changes from what we have now. I forget if leaving out a cover from the publication file will cause an error or lead to an EPUB that doesn't validate. But it's one of those situations.

rbeezer commented 2 years ago

(f) Make calibre a prerequisite, and fail on a failed import?

https://www.mobileread.com/forums/showthread.php?t=291311

Alex-Jordan commented 2 years ago

I got it making a standalone .svg. It's used by the cover-page, and marked up in all the right ways as the cover-image. The EPUB validates. The image shows up on "page 2" after the TOC.

And yet neither Apple Books nor Calibre will show it as the thumbnail on the bookshelf. Apple Books shows the usual thing it auto-generates and Calibre shows the TOC "page 1" for thumbnail.

So I don't know what's up, but some Googling points out that cover thumbnail display is a separate feature of the reader from the actual EPUB reading. And some/many may just not be built to handle an svg cover image.

So then I thought, hey I have a decent looking svg and I have python. Let's get python to convert to png and use that. But everything I tried requires something extra to install. Either python packages that need installing, or ImageMagick. Given what you said earlier about not relying on having LaTeX, I assume I should not rely on needing extra python packages or ImageMagick. Is that right? Or if the python packages are not a big deal, I could use a try: except: to tell people to install a missing python package.
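
For reference, a standalone SVG cover of the kind described above needs nothing beyond the standard library. This is a minimal sketch, not the code from the commit: the function name svg_cover, the 500x800 default size (a 5:8 ratio), and the font sizes and vertical positions are all assumptions chosen for illustration.

```python
from xml.sax.saxutils import escape


def svg_cover(title, subtitle="", authors=(), width=500, height=800):
    """Return a standalone SVG cover as a string (default size is 5:8)."""
    # Each entry: (text, font size, vertical position as a fraction of height).
    lines = [(title, 48, 0.30)]
    if subtitle:
        lines.append((subtitle, 32, 0.42))
    if authors:
        lines.append((", ".join(authors), 28, 0.70))
    text_elements = "\n".join(
        f'  <text x="{width / 2}" y="{height * y:.0f}" font-size="{size}" '
        f'text-anchor="middle" font-family="sans-serif">{escape(str(text))}</text>'
        for text, size, y in lines
    )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}" '
        f'viewBox="0 0 {width} {height}">\n'
        f'  <rect width="100%" height="100%" fill="white"/>\n'
        f"{text_elements}\n"
        f"</svg>\n"
    )
```

The sketch does no line wrapping, so a long title would run past the edges; it also only handles plain text, in line with the earlier remark that math in a title would be hard.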

rbeezer commented 2 years ago

Progress! Brief, on fone.

How about on failed import: we'll make you a customized cover if only you add Python package X, otherwise you make your own or live with this crappy 100% generic cover that we have in the distribution that we will give you right now.
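
A minimal sketch of that "failed import" decision, assuming the check is done by package name before any cover is built. The function name cover_strategy and the wording of the hint are hypothetical; only importlib.util.find_spec is a real standard-library call.

```python
import importlib.util


def cover_strategy(package):
    """Return ("custom", None) if `package` is importable, else ("generic", hint)."""
    # find_spec returns None for a top-level package that is not installed.
    if importlib.util.find_spec(package) is not None:
        return "custom", None
    hint = (
        f'Python package "{package}" not found: using the generic cover. '
        f'Install "{package}" for a cover with your title and author, '
        "or provide your own cover image in the publication file."
    )
    return "generic", hint
```

The caller would print the hint and copy the generic cover from the distribution when the first element is "generic".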

Alex-Jordan commented 2 years ago

Force pushed and ready to review.

In this commit:

All the above circumstances tested and validated. One way to test all the kinds is to plant something bad like print(x) right within each try:, one at a time, to force it to fail. Earlier it was failing for you anyway with the Arial.ttf version, but if you can get an absolute path to some ttf on your system and temporarily use that instead of "Arial.ttf" and "Arial Bold.ttf", that would test that version too.

Here is what my Apple Books library looks like after opening the PCC project under each condition:

[Screenshot: Apple Books library after opening the PCC project under each condition]

Alex-Jordan commented 2 years ago

Over coffee this morning I noticed in the pic I posted that the aspect ratio on two of the auto-generated cover images wasn't 5:8. Made a quick edit, squashed, and force pushed.

[Screenshot: Apple Books library with the corrected cover images]

From left to right, with priority: pageres (2), latex (1), Apple Books (last), PIL with bitmap font (4), PIL with Arial (3), publisher-provided cover (0).

Now in Apple Books these are all the same, except the Apple Books-generated one doesn't quite have the 5:8 ratio.
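
The priority order above amounts to a fallback chain: try each method in turn and keep the first one that works. A minimal sketch of that pattern follows; first_successful_cover and the builder callables are hypothetical stand-ins, not the functions in pretext.py.

```python
def first_successful_cover(builders):
    """Run (name, builder) pairs in priority order; return the first success."""
    for name, builder in builders:
        try:
            return name, builder()
        except Exception as err:  # missing file, executable, package, font, ...
            print(f"PTX: cover via {name} failed ({err}); trying next option")
    # Nothing worked: no cover image, so the e-reader generates its own.
    return None, None
```

A caller would pass the methods in the order publisher-provided, latex, pageres, PIL with Arial, PIL with bitmap font, with the "(None, None)" result corresponding to the Apple Books auto-generated case.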

rbeezer commented 2 years ago

Thanks, @Alex-Jordan, that is a comprehensive gauntlet to get to a cover image! ;-)

I'd like to hear from @mitchkeller on this one, and will ring him up if this ping doesn't yield results.

Alex-Jordan commented 2 years ago

@mitchkeller If there is a cover image declared in the publication file, the output here should be mostly unchanged. The only difference should be instead of a figure wrapping the image on the cover page, it's a section. The code had a comment reference to a web page explaining the use of figure, but that page is dead. Meanwhile I found another page suggesting using section. Same page suggests a cover image not exceed 3.2 million pixels, so I reduced the "1600x2560" advice in the guide.

Before this, you had to have a cover image in the publication file. So now if you don't have that, at least something comes out and validates. It ought to work with Kindle because it's just a png in all cases. But I did not do Kindle testing.
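
The size advice can be checked with a little arithmetic: 1600x2560 is 4,096,000 pixels, which is over a 3.2 million pixel limit. A minimal sketch, assuming the cover keeps the 5:8 ratio used elsewhere in this thread; largest_cover is a hypothetical helper, not part of PreTeXt.

```python
import math


def largest_cover(max_pixels=3_200_000, ratio=(5, 8)):
    """Largest width x height with the given ratio and at most max_pixels pixels."""
    w, h = ratio
    # Need (w*k) * (h*k) <= max_pixels, so k is the integer square root
    # of max_pixels // (w*h).
    scale = math.isqrt(max_pixels // (w * h))
    return w * scale, h * scale


print(1600 * 2560)      # 4096000, above the limit
print(largest_cover())  # (1410, 2256), which is 3,180,960 pixels
```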

rbeezer commented 2 years ago

OK, time to release the gauntlet into the wild. ;-)