koreader / crengine

This is the KOReader CREngine fork. It cross-pollinates with the official CoolReader repository at https://github.com/buggins/coolreader, in case you were looking for that one.
72 stars 45 forks source link

CSS: parse/skip at-rules, support `@media`, `@supports` #454

Closed poire-z closed 3 years ago

poire-z commented 3 years ago

#### (Upstream) Minimal support for zip64 postponing this one for another PR

~~This is enough to open zip archives larger than 4GB. From https://github.com/buggins/coolreader/pull/310 - not sure we're using it as it depends on a #define - a bit of refacto to bits of code we use, hope it's fine.~~

XML parsing: strip leading \n in PRE and TEXTAREA

See https://github.com/koreader/crengine/pull/345#issuecomment-912673645 for details. "For historical reasons, in the HTML syntax, a leading newline character immediately following the pre element start tag is stripped." Bump DOM version as this can shift highlight xpointers indexes by 1. (It may still cause a shift for very old opened books that will get migrated from that very old DOM version to this one... but well, who highlights in PRE ?!! Or I could do this inconditionnally: https://github.com/koreader/koreader/blob/ab4e27908b292ab7172f06bb2b1e203558780d96/frontend/apps/reader/modules/readerrolling.lua#L1415-L1420

Text: allow wrap on any space in "white-space: pre"

Note: there might still be something fishy with multiple spaces at boundaries between pre and non-pre text nodes, possibly overlapping. Before, a long sequence of space was unbreakable, so the possible break was found by hyphenating the previous word (and the next break is because there's no other solution): image After: image The fix looks simple and sane - but well, with our text rendering code, we never know... I think we have many other issues with white-space:pre (like what we do this, we don't do it in gerRenderedWidths()), but well, didn't have the courage for more...

initTableRendMethods(): avoid expensive processing

Not sure why we went with initNodeRendMethodRecursive(), it feels we just want to set it to this single node, whose inner content has already been inited, and shouldn't need them changed. Fixes the 2nd issue noticed at https://github.com/koreader/crengine/pull/370#issuecomment-910156921:

More serious issue is with tables: The table in 13.9 of https://diveintopython3.net/serializing.html would take a lot of time to render [...] Add another layer ot tr > th > td, and it takes minutes... It gets stuck into nested initTableRendMethods() initNodeRendMethodRecursive() recurseElementsDeepFirst(), so we probably have a real issue here.

Tables: fix handling of whitespace between nodes

Properly handle whitespaces between table element nodes according to the specs. This fix was mostly not needed with normal tables as the XML parsers would remove these whitespace text, and the table rendering algorithm would ignore them. But it might be needed when completing incomplete tables (where these whitespace may have been kept by the parser and should, or not, be included with what is wrapped) or when table elements are white-space:pre (which is rare). We probably won't trigger this in real-life cases, but I had the bad idea to wrap all my "complete incomplete tables" test case in white-space:pre, and we did render mostly everything unlike Firefox. Got to answer a few of my questions in the comments... A bit risky though (had to fight with "styles have changed in such a way..." on re-rendering), hope it will be ok.

ldomDocument: store screen size

Have LVDocView store the screen size in ldomDocument just so we'll be able to check CSS media queries like @media (min-device-width:500px) or (device-aspect-ratio:4/3)

CSS: more robust skipping of invalid content

Add a generic parsing helper skip_to_next() that properly observes the rules for matching pairs of (), [], {}, "", and '', and correctly handle escapes, as per specs. Use it for next_property() and next_token(). Will be also of use when parsing CSS @-rules.

CSS: parse/skip at-rules, support @media, @supports

Properly parse CSS @-rules, skipping most of them (by handling their content as nested stylesheets or nested declarations) as we don't support much except @media and @supports. Implement a AtRuleLogicalConditionParser and dedicated subclasses to parse logical combinations in @media and @supports. Implement handling of @supports items by tweaking LVCssDeclaration::parse() and parse_content_property() so that the same code can be used for parsing and just checking if we would parse it fine. References: https://developer.mozilla.org/en-US/docs/Web/CSS/At-rule https://developer.mozilla.org/en-US/docs/Web/CSS/Media_Queries/Using_media_queries https://www.w3.org/TR/css3-mediaqueries/#media0 https://drafts.csswg.org/css-conditional/#at-supports

Some previous discussion in https://github.com/koreader/koreader/issues/3179 Should allow closing https://github.com/koreader/koreader/issues/7199.

CSS: handle media condition with @import and <link/>

Add support for:

  <link href="s.css" type="text/css" rel="stylesheet"
      media="screen and (orientation:portrait)"/>

and

  @import "s.css" screen and (orientation:portrait);

Tweak stylesheets handling in ldomDocumentWriters: Parse and handle media= in <link href="foo" media="bar">, and translate them into @import url("foo") bar; in the <stylesheet> child element's text we put them in. Rework a bit ldomDocumentFragmentWriter to make this parsing a bit clearer.

Handle media conditions in LVProcessStyleSheetImport().

No change to LVLoadStylesheetFile(), used by frontends to get the user-agent stylesheet, which still handle only a unique first @import, without media condition support.

Should allow closing https://github.com/koreader/koreader/issues/3179.

CSS: fix parsing of "font-family:inherit!important"

When without any space.

LDOMNameIdMap::deserialize(): fix when >1024 items

Our max nb of elements and attributes IDs starts with 1024, but can grow if we meet more. Be sure we grow them too as needed when loading from cache. Should avoid the issue noticed at https://github.com/koreader/koreader/issues/7343#issuecomment-914061568.


(I apologize to the 3 users that decided to pick the nicks @media @supports @import. They are in my commit messages, so for any people that will sync their crengine fork with this repo, they may get a mail - but I guess that's not new to them and that by now, they have disabled such github mails :)


This change is Reviewable

poire-z commented 3 years ago

@virxkane : I picked your commit without really studying it. clang-tidy in our CI complains about: crengine/src/lvstream.cpp:2770:17: warning: Potential leak of memory pointed to by 'item' [clang-analyzer-cplusplus.NewDeleteLeaks] and stuff related to your added NextOffset. Details in https://github.com/koreader/crengine/pull/454/checks?check_run_id=3527231130

Hope I didn't do an error when picking :/ Can you have a look at this clang output and see if this is an issue (if not, we should find a way to silence it).

virxkane commented 3 years ago

I picked your commit without really studying it. clang-tidy in our CI complains about: crengine/src/lvstream.cpp:2770:17: warning: Potential leak of memory pointed to by 'item' [clang-analyzer-cplusplus.NewDeleteLeaks] and stuff related to your added NextOffset. Details in https://github.com/koreader/crengine/pull/454/checks?check_run_id=3527231130

Should be fixed in https://github.com/virxkane/coolreader/commit/11d48c157e426acf8e950af674dba0dbe6037536

virxkane commented 3 years ago

not sure we're using it as it depends on a #define - a bit of refacto to bits of code we use, hope it's fine.

zip64 is only relevant if 'large file support' is enabled, so we check LVLONG_FILE_SUPPORT==1.

poire-z commented 3 years ago

Thanks! clang-tidy no longer complains about it ! (Still 2 other minor complains I can't understand, but ok.)

zip64 is only relevant if 'large file support' is enabled, so we check LVLONG_FILE_SUPPORT==1.

Yes, our build stuff vs. crsetup.h is a bit hard to grasp, but I think we don't enable it.

poire-z commented 3 years ago

Added a commit to fix the issue noticed at https://github.com/koreader/koreader/issues/7343#issuecomment-914061568.

poire-z commented 3 years ago

Postponing the zip64 stuff to an other PR, with more stuff from https://github.com/buggins/coolreader/pull/312.

ptrm commented 3 years ago

That is great news! Thanks!

For the sake of css tweaks I would like to also get specific answer: @media (orientation: portrait) { body { font-size: 9pt; } } is still not supported, right? I tried this with no results, but want to double check;) I'm thinking of conditional stuff (like bigger margin on the button-facing screen edge on Forma or Libra) that could live in my tweaks instead of modified epubs :)

poire-z commented 3 years ago

This in a user style tweak works for me:

@media (orientation: portrait) { body { color: blue; } }

Note that it's the orientation of the "page" (where the BODY is rendered, excluding KOReader's own margins set via the bottom menu), and not of the "device screen". So, in two-column mode in landscape, you may still trigger orientation: portrait

(Beware your 9pt will make your font a fixed size, and the only way to change it then is thru Zoom/DPI)

ptrm commented 3 years ago

(Beware your 9pt will make your font a fixed size, and the only way to change it then is thru Zoom/DPI)

Sure, that was just an example :)

So, in two-column mode in landscape, you may still trigger orientation: portrait

True. Same goes for width queries (match the column width), which has its upsides tbh :)

It must have been a typo with orientations, after checking max- and min-width, orientation is working :) Crazy thing is that min-color is working, too :D

I'm very glad it works inside css files too, anyway, will keep fiddling with it and report if need be :))))

poire-z commented 3 years ago

Crazy thing is that min-color is working, too :D

That is a crazy lie :) : https://github.com/koreader/crengine/blob/69b3f55df57be341f2f7d9426454e532f81327f0/crengine/src/lvstsheet.cpp#L1812-L1825

ptrm commented 3 years ago

That is a crazy lie :) :

Wait, I swear on my PW4 (min-color: 12) was enabling the tweak inside the query, while (min-color: 16) wasn't :o

EDIT: https://github.com/koreader/crengine/blob/69b3f55df57be341f2f7d9426454e532f81327f0/crengine/src/lvstsheet.cpp#L2032

Ohhh, it was 8 not 12 in min-color that worked, and it makes sense if color value is about bits per channel, which it is per spec. I mistook it for color-index :)