koreader / crengine

This is the KOReader CREngine fork. It cross-pollinates with the official CoolReader repository at https://github.com/buggins/coolreader, in case you were looking for that one.
70 stars 45 forks source link

style element not working #165

Closed Frenzie closed 6 years ago

Frenzie commented 6 years ago

@poire-z While glancing at one of the other forks I spotted https://github.com/blchinezu/pocketbook-coolreader/commit/5af54fd28a88924b3f3c756a55e573060650c22c#diff-1f1938ab45cc1d2548b3ef698620143c so I wrote a quick testcase to see what would happen.

<!DOCTYPE html>
<html>
<head>
<style>
.red{color:red}
</style>
</head>
<body>
<div>
Not red.
    <div class=red>Red.
        <div>Also red.</div>
    </div>
</div>
</body>
</html>

color-css-inheritance.html.zip

The answer would seem to be nothing at all.

screenshot_2018-04-27_16-34-32

Anyway, since you just fixed something about color and borders… :-)

poire-z commented 6 years ago

Displays as expected on my side:

image

Aren't you using MuPDF, or have color rendering disabled ? :)

Frenzie commented 6 years ago

CoolReader with color rendering enabled. MuPDF actually doesn't even want to open that file for me.

Frenzie commented 6 years ago

@poire-z Same on my laptop.

Frenzie commented 6 years ago

As an aside, random error message:

/home/frans/src/koreader/base/thirdparty/kpvcrlib/crengine/thirdparty/chmlib/src/chm_lib.c:269:73: runtime error: left shift of 132 by 24 places cannot be represented in type 'int'
poire-z commented 6 years ago

Strange. No Disable embedded styles? Can you get some color at all with other books?! Do my color test files from a few PR ago shows their colors?

Frenzie commented 6 years ago

Indeed it doesn't. Does it depend on something in base you haven't pushed yet? (Also crengine could use a bump. :-P)

Frenzie commented 6 years ago

But background colors and such work. Edit: and yes, I compiled with the crengine master from here.

poire-z commented 6 years ago

Mhhh no, no deps or other thing I can think of :|

poire-z commented 6 years ago

Any news on that? (ps: that ssh PR is still waiting for you to have a look at the code :)

Frenzie commented 6 years ago

Doesn't make a difference, but it shouldn't have anyway.

If you think the SSH code is fine go ahead and merge it. I won't be able look at it for at least another few hours. :-)

Frenzie commented 6 years ago

@poire-z The problem is not the color. I wanted to test some of that selector selectivity stuff with http://test.csswg.org/suites/css2.1/20110111/html4/id-selector-006.htm (among others) and found that styles from head aren't applied. (I tossed the entire build dir to ensure everything was recompiled.)

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
    <head>
        <title>CSS Test: Simple ID selectors specificity over attribute selectors</title>
        <link rel="author" title="Microsoft" href="http://www.microsoft.com/">
        <link rel="help" href="http://www.w3.org/TR/CSS21/selector.html#id-selectors">
        <meta name="flags" content="">
        <meta name="assert" content="ID selectors trump attribute selectors in specificity.">
        <style type="text/css">
            div#div1
            {
                color: green;
            }
            div[id=div1]
            {
                color: red;
            }
        </style>
    </head>
    <body>
        <p>Test passes if the "Filler Text" below is green.</p>
        <div id="div1">Filler Text</div>
    </body>
</html>

So this is actually a problem with #159.

poire-z commented 6 years ago

But how it is with your test case from first post, that works for me and did not work for you?

poire-z commented 6 years ago

It's the presence of the META or LINK that make the STYLE not parsed/applied. It works when these are removed.

Frenzie commented 6 years ago

None of this stuff does anything in either test case. Inlining the styles in a STYLE attribute does.

poire-z commented 6 years ago

Mhhh, so we're still at step 1 (with the added bug I should fix about head styles not taken if additional META or LINK).

We need a 3rd report... @cramoisi : can you test the test case of first post in your emulator?

poire-z commented 6 years ago

@Frenzie : btw, it's included in current nightlys: can you test it on your kobo?

Frenzie commented 6 years ago

@poire-z I already tested it on two distinct computers, Kobo, Android and AppImage, the latter three from the nightly build server. It's most definitely not working.

poire-z commented 6 years ago

Mhhhh. So none of the other HTML snippets I posted this last days would work? Can you test a few of them? I tested 3 of them on my kobo (with just replacing libcrengine.so on an older nightly thus)., and they were working. (I don't have my device at hand to confirm which they were - and I probably didn't test this color-css-inheritance.html.zip (of course, you're testing the .html unzipped, right?))

Frenzie commented 6 years ago

Huh, it also works zipped. Cute. Doesn't behave any differently either way.

The border-bottom stuff from #159 doesn't work, no. What snippets?

Frenzie commented 6 years ago

Oh, it's an embedded style thing!

Frenzie commented 6 years ago

Not in the sense that I had it off. But toggling it off and on again makes it work, at least until you reload the document.

poire-z commented 6 years ago

Oh, so you can debug that behaviour ! I got it once, it disappeared after I Purged .sdr ... couldn't reproduce it after that (didn't keep a copy of the metadata.lua...) so couldn't follow up on understanding that bug... But it looks like it's something related to settings: global vs docsettings, may be tweaked when you once went see this document (or any document) with MuPDF ?

What snippets?

172, 166, koreader 2841 from 170 (no hash to not cross reference)

Frenzie commented 6 years ago
  1. There was no previous .sdr because I generally do quick tests in /tmp.
  2. Even if there were, you think I previously opened a file named CSS Test: Simple ID selectors specificity over attribute selectors.htm? :-P

If anything the problem would be the reverse, imo. Something insufficiently vanilla about your testing. :-)

Frenzie commented 6 years ago

PS MuPDF actually doesn't even open the document in the OP because it's not valid XML. It'd need class="red" rather than class=red.

05/04/18-12:35:15 WARN  cannot open document /tmp/CSS Test: Simple ID selectors specificity over attribute selectors.htm frontend/document/pdfdocument.lua:33: ./ffi/mupdf.lua:67: MuPDF cannot open file.: missing quote character (2)

But no, I made particularly sure to avoid MuPDF by avoiding the .xhtml extension. I hadn't tried it the testcases in it either.

poire-z commented 6 years ago

Something insufficiently vanilla about your testing. :-)

Mhhh, I get the same behaviour as you do (no red, toggle off Embedded style and back: red present) after I removed my settings.reader.lua...

Frenzie commented 6 years ago

Playing around a bit I suspect you might've had hyphenation set to something non-default (like French) triggering a rerendering on first load.

Other options like floating punctuation or embedded styles don't permanently work around the problem on reloading.

poire-z commented 6 years ago

Indeed, just setting hyphenation to French makes it red on loading.

edit: and putting back my good old settings.reader.lua, and just removing from it all the hyph_* settings, is enough for head styles to not work...

cramoisi commented 6 years ago

hopla, need something to test ?

poire-z commented 6 years ago

@cramoisi : thanks, but it's no more needed, we found out how to trigger it. It looks like some other koreader issue (that I remember you helped testing), where embedded styles weren't applied until you toggle them off and on.

cramoisi commented 6 years ago

@poire-z Yep, it was a common thing when i started using KOReader and was searching for the best setting. somebody opened an issue with this not long ago

I need to update my emu anyway

cramoisi commented 6 years ago

https://github.com/koreader/koreader/issues/3411

it's totally the same think isn't it ?

poire-z commented 6 years ago

Thanks. Somehow, I can see this problem when hyphenation hasnt't been tweaked, but on HTML only: for EPUB, embedded styles are applied correctly. So, it does not look exactly as what's in https://github.com/koreader/koreader/issues/3411, as the tests there were made with epub. But may be we'll find a common root cause...

poire-z commented 6 years ago

I'll push a PR with a fix for this lack of initial re-rendering when there is a head style.

About the HTML snippet from https://github.com/koreader/crengine/issues/165#issuecomment-386538957 , the style is not applied because of the not self closing <META> and <LINK>. The autoclose algorithm actually builds this from the source HTML:

<?xml version="1.0"?>
<html>
  <head>
    <title>CSS Test: Simple ID selectors specificity over attribute selectors</title>
    <link rel="author" title="Microsoft" href="http://www.microsoft.com/">
      <link rel="help" href="http://www.w3.org/TR/CSS21/selector.html#id-selectors">
        <meta name="flags" content="">
          <meta name="assert" content="ID selectors trump attribute selectors in specificity.">
            <style type="text/css"> div#div1 { color: green; } div[id=div1] { color: red; } </style>
          </meta>
        </meta>
      </link>
    </link>
  </head>
  <body>
    <p>Test passes if the "Filler Text" below is green.</p>
    <div id="div1">Filler Text</div>
  </body>
</html>

and what I added (similar to EPUB) expects <style> to be a direct child element of <head>: https://github.com/koreader/crengine/blob/c556565994bd67d062ee0302b8c3227401206ca2/crengine/src/lvtinydom.cpp#L8076-L8078

I don't want to mess with the autoclose algorithm. I'm not even bothered with how <META> and <LINK> could make head style not working, but if this should really be fixed, I could use another flag _inHead (set when entering HEAD) instead of looking at _currNode->getElement()->isNodeName("head") after I have a style candidate. edit: looks easy, let's do that.

Frenzie commented 6 years ago

if this should really be fixed

Um, the disasters are surely lurking everywhere if that's what it does instead of a halfway correct behavior? Elements like LINK and META don't come with end tags, so you'll get the same problem with the TITLE, and everything else we care about. (meta name=description, for example?

Where do you get all this interesting debugging output from btw? :-P

Does this mean IMG, HR, BR and the like also break down completely?

poire-z commented 6 years ago

Where do you get all this interesting debugging output from btw? :-P

https://github.com/koreader/crengine/blob/c556565994bd67d062ee0302b8c3227401206ca2/crengine/src/lvdocview.cpp#L33-L34

https://github.com/koreader/crengine/blob/c556565994bd67d062ee0302b8c3227401206ca2/crengine/src/lvdocview.cpp#L4278-L4281

so you'll get the same problem with the TITLE, and everything else we care about

It was a problem because my code explicitely expected the head to be the parent. I hope the existing code does not make this naive assessment.

Does this mean IMG, HR, BR and the like also break down completely?

I don't want to know :| (but I guess it would not matter for rendering if a </img> </HR> was to happen far away down, may be just for ancestor css selector and the like).

Frenzie commented 6 years ago

It was a problem because my code explicitely expected the head to be the parent. I hope the existing code does not make this naive assessment.

I would not be inclined to think of essentially correct behavior as a problem but I'll have a look at the spec and what other parsers do. TITLE might be a bit of a special case.

poire-z commented 6 years ago

Testing IMG, HR, BR, it has auto closed them correctly:

        <p>Test passes if the "Filler Text" below is green.</p>
        <div id="div1">Filler Text</div>
        <div><img src="toto.gif">this is some text<span>truc</span></div>
        <div><img src="toto.gif">this is <hr>some<br> text<span>truc</span></div>

=>

    <p>Test passes if the "Filler Text" below is green.</p>
    <div id="div1">Filler Text</div>
    <div><img src="toto.gif"/>this is some text<span>truc</span></div>
    <div>
      <autoBoxing><img src="toto.gif"/>this is </autoBoxing>
      <hr/>
      <autoBoxing>some</autoBoxing>
      <br/>
      <autoBoxing> text<span>truc</span></autoBoxing>
    </div>

so, may be there's something somewhere to do that correctly for other tags...

Frenzie commented 6 years ago

Excellent. With any luck there's a simple list/enum type thing somewhere.

Frenzie commented 6 years ago

Here we go: https://github.com/koreader/crengine/blob/c556565994bd67d062ee0302b8c3227401206ca2/crengine/src/lvxml.cpp#L3941-L3990

I'll see if I can find a list of self-closing HTML elements in the meantime, but I can tell you for sure that META and LINK should be added to that list. {"meta", NULL}, looks simple enough in theory.

Frenzie commented 6 years ago

This list may not be exhaustive but it's a good start:

https://www.w3.org/TR/xhtml-media-types/#C_2

Empty elements in the XHTML family include: area, base, basefont, br, col, hr, img, input, isindex, link, meta, and param.

And I know for sure isindex was just something that was in the spec only more or less temporarily. Not that we worry about inputs anyway, but I figured I'd point it out for onlookers.

Frenzie commented 6 years ago

Aha! MDN to the rescue with a good overview:

https://developer.mozilla.org/en-US/docs/Glossary/empty_element

Frenzie commented 6 years ago

@poire-z Obvious fix confirmed working with your writing to disk option. We can just add a few extra definitions like static const char * AC_INPUT[] = {"input", NULL};. I can do it if you like or you can add it to your existing PR (while removing that inhead thing ;-) ).

poire-z commented 6 years ago

Feel free to do it :) (I'll clean my inhead PR after it)