koreader / crengine

This is the KOReader CREngine fork. It cross-pollinates with the official CoolReader repository at https://github.com/buggins/coolreader, in case you were looking for that one.
70 stars 45 forks source link

Various in-page footnotes, tables and covers fixes and tweaks #558

Closed poire-z closed 5 months ago

poire-z commented 5 months ago

In-page footnotes: avoid with -cr-hint: noteref-ignore

This hint was initially added to prevent footnote popups by frontend, and is rarely useful and used. Have it also prevent in-page footnotes from being added to the pages of links with that hint.

Already documented at https://github.com/koreader/koreader/blob/67cd647d1a6a36c95d512b1ee461b7a7faf484c4/frontend/apps/reader/modules/readerstyletweak.lua#L882-L885 :

Can be set on links (<a href='#..'>) to have them NOT trigger footnote popups and in-page footnote. If some DocFragment presents an index of names with cross references, resulting in in-page footnotes taking half of these pages, you can avoid this with: DocFragment[id$=_16] a { -cr-hint: noteref-ignore }

In-page footnotes: ensure they don't cross "flows"

When footnotes are pushed on a next page, make sure this page ends up in the same "flow": if the next content is part of another "flow", emit a page with only these footnotes, and create a new page for the coming content. See https://github.com/koreader/crengine/pull/443#issuecomment-2001964918 for details and screenshots of the issue.

Tables: fix rendering when negative text-indent

Tweak for table cells similar to how we do with negative text-indent in normal paragraphs. In getRenderedWidths(), try to properly size them when overflow are not allowed ("book" render mode), although this won't handle all cases.

FB2 cover drawing: ensure _invertImages flag

FB2 cover drawing uses a specific codepath: update it so it behaves as all other images drawing in regards to the invertImage setting set on the main DrawBuf. Should allow closing https://github.com/koreader/koreader/issues/11574.

EPUB: fallback to look for a cover in the first fragment

If no EPUB3 or EPUB2 cover found, look at the first XHTML fragment from the spine: if it contains a single image and no text, assume that image can be used as a cover. Also accept an image with <img alt"=cover"> in this first fragment if it comes before any text or other images. See https://github.com/koreader/koreader/issues/11571#issuecomment-2008099189. It uses the idea (and early patch, adapted) described at https://github.com/koreader/koreader/pull/11491#issuecomment-1962733691 and followups, that ended up not making it into crengine, as just fixing support for EPUB3 covers was enough (done in #555). Should allow closing https://github.com/koreader/koreader/issues/11571.


This change is Reviewable

poire-z commented 5 months ago

Please provide 1 real-life EPUB out of these 8300 :) We expect that first fragment to have no text, which should rule out many cases. We'll see if we need to have more heuristics about the image, its size, etc...

Frenzie commented 5 months ago

All of them follow this pattern: https://www.dbnl.org/tekst/lapi001goet01_01/ebook/lapi001goet01_01.epub

poire-z commented 5 months ago

By "false negative", you mean this commit is not enough to detect the cover in this book ? Because it indeed does not detect that cover: the first fragment contains an image, which is indeed a cover and would be usable as a cover, but also some additional follow up text that prevent my code to trigger and return that image as a cover.

Frenzie commented 5 months ago

Exactly, it's a false negative because of the text rule.

poire-z commented 5 months ago

Ok, so I'm fine with this, if it is missing some cases, better false negatives than false positives. If there's something to do about such books, it can be implemented in a future way to fetch it.

It's not obvious what should be done though, and if anything can be done. My calibre shows it as: image It looks like it hasn't extracted the image, but rendered the fragment, and picked a screenshot of the first page (fortunately, the HTML+CSS in that first fragment push a page-break after the cover image). We probably won't go that way.

There could probably be other ways, ie. if checking if the first image has a alt="cover" (which it has in this book), and in that case pick it and don't fail if there are text content. Feels like not complicated to do. Does this heuristic(s? singular or plural :)) feels ok to you?

Frenzie commented 5 months ago

We're talking about a single heuristic addition, right? Yes, I think alt=cover (lowercased for comparison) sounds like a sensible thing to use.

poire-z commented 5 months ago

Yes. It's either, in the first XHTML fragment:

Frenzie commented 5 months ago

Sounds good to me.