mvf / qolibri

Continuation of the qolibri EPWING dictionary/book reader
GNU General Public License v2.0

Weird formatting with spaces #31

Closed jojosoft closed 3 years ago

jojosoft commented 4 years ago

Hello, it's great to see this project being developed further! I just updated from version 1.9 (compiled for Qt4 from here) to the newest release.

My question is: Why is the formatting of entries completely different compared to the "old" version I used up until now? Spaces are inserted in groups of three at seemingly random places. I haven't been able to alter this behaviour through CSS, as it's not a margin. There also seems to be an empty line after each entry, but I would prefer the more compact version for less scrolling.

To demonstrate what I mean, here is a screenshot of two entries from version 1.9: [screenshot: old]

And here is a screenshot of the same two entries in the current version: [screenshot: new]

I don't know if all the spaces are actually in the data, but if they were, I think it would be better to ignore them like it was done in the old version. Any thoughts?

mvf commented 4 years ago

Thanks for the detailed report! This has actually been bothering me too lately. I never really used the old version much and wasn't aware that it rendered differently there.

Turns out the spaces are inserted by qolibri in code that handles indentation markers in the data. This code seems to never have worked correctly. In the last official version, 1.0.4, the original author just disabled it, stating that applications which only support an HTML subset probably won't be able to display this properly (268894423cf0a6902aedaf4c0ce8a35865cb2d0a). They then re-enabled it later (12b39420fbcbecc278eebb86b99f020c6de13854) and it seems people have been trying to get it to work ever since (a94e838bf9c208cfdff0fc51f3ff4087c1514504, b117a9d112508bb70c9d24a4306bc5928bc4d5b9).

Until there's a fix, the only workaround seems to be disabling the buggy code again by removing the following line in src/ebhook.cpp:

     ...
 EB_Hook hooks[] = {
     //HOOK_S(BEGIN_NARROW),
     //HOOK_S(END_NARROW),
     HOOK_S(BEGIN_SUBSCRIPT),
     HOOK_S(END_SUBSCRIPT),
-    HOOK_S(SET_INDENT),
     //HOOK_S(NEWLINE),
     HOOK_S(BEGIN_SUPERSCRIPT),
     HOOK_S(END_SUPERSCRIPT),
     ...

jojosoft commented 4 years ago

Thank you so much for figuring this out! After compiling qolibri myself with the mentioned line commented out, everything looks as usual. It was also interesting to see all the changes to this feature over time. I wonder what the indentation markers actually represent? It's probably not the absolute indentation level of lines, which is how they are interpreted right now (and I wouldn't think it is actually supposed to look like this). If I were going to fix this, I would first look at how specific entries render in one of the old, proprietary viewers.

My best guess would be to interpret "0" as "same indentation level as previous line" instead of "no indentation". Also, any indentation should probably be done through CSS, since inserting &nbsp;s for this purpose makes text annoying to copy from. Anyway, thanks for your reply! Maybe I'll find some time to experiment with this in the future...

mvf commented 4 years ago

I looked into this a bit more. This debug rendering suggests that the "set indent" markers really just set the absolute indentation of paragraphs: [screenshot: debug1]

The way I see it, line 1 should be indented by 1 and all the other lines by 2.
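
For illustration, the "absolute indentation" reading could be modeled roughly as below. This is only a sketch and not qolibri's code: the `Token` type, `renderAbsolute`, and the `pxPerLevel` scale are hypothetical names, and it assumes that an indent stays in effect until the next marker and that the text is already HTML-escaped.

```cpp
#include <string>
#include <vector>

// Hypothetical token model; these are not qolibri's or the EB library's types.
struct Token {
    enum Kind { SetIndent, Text, Newline } kind;
    int level;         // only meaningful for SetIndent
    std::string text;  // only meaningful for Text
};

// Emit one <div> per paragraph, using the indent level in effect when the
// paragraph ends as an absolute left margin. Assumption: a level persists
// until the next SetIndent token replaces it.
std::string renderAbsolute(const std::vector<Token>& tokens, int pxPerLevel) {
    std::string html;
    std::string paragraph;
    int indent = 0;

    auto flush = [&]() {
        if (paragraph.empty()) return;
        html += "<div style=\"margin-left:" +
                std::to_string(indent * pxPerLevel) + "px\">" +
                paragraph + "</div>\n";
        paragraph.clear();
    };

    for (const Token& t : tokens) {
        switch (t.kind) {
        case Token::SetIndent: indent = t.level; break;  // absolute, not relative
        case Token::Text:      paragraph += t.text; break;
        case Token::Newline:   flush(); break;
        }
    }
    flush();  // last paragraph may lack a trailing Newline
    return html;
}
```

A sketch like this handles only one indent per paragraph; a second marker in the middle of a line would simply override the first.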

The indents are also used for tabstops, like so: [screenshot: debug2]

CSS won't suffice here because Qt only supports indenting blocks, such as <p> and <div>. This picture also highlights why spaces don't work: the indentation breaks when a paragraph wraps.
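
The wrapping problem can be reproduced with a toy layout function. The sketch below is a stand-in for a real text layout engine, not anything from qolibri: `layout` and its parameters are made up for the example, and it wraps ASCII words by character count only. Indenting with leading spaces corresponds to `restIndent = 0`, while a true block indent uses the same value for both parameters.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Greedy word wrap with separate indents for the first line and for
// continuation lines. Hypothetical helper, for demonstration only.
std::vector<std::string> layout(const std::string& text, std::size_t width,
                                std::size_t firstIndent, std::size_t restIndent) {
    std::vector<std::string> lines;
    std::istringstream in(text);
    std::string word;
    std::string line(firstIndent, ' ');
    bool lineHasWord = false;

    while (in >> word) {
        // Break before the word if it would overflow the available width.
        if (lineHasWord && line.size() + 1 + word.size() > width) {
            lines.push_back(line);
            line.assign(restIndent, ' ');
            lineHasWord = false;
        }
        if (lineHasWord) line += ' ';
        line += word;
        lineHasWord = true;
    }
    if (lineHasWord) lines.push_back(line);
    return lines;
}
```

With `layout("aa bb cc dd", 7, 2, 0)` the second line starts at column 0, which is the broken look from the screenshot; with `layout("aa bb cc dd", 7, 2, 2)` both lines stay aligned.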

will-crawford commented 4 years ago

Could nest a <div> to get an indented block?

mvf commented 3 years ago

@will-crawford The trouble is that there can be multiple indents per line, and <div> starts a new block.

I have experimented with <table> and it looks great with 広辞苑第六版. But as always the challenge is to make it work consistently across all books, what with EPWINGs being so inconsistent.
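
As a rough idea of what a table-based layout might look like, the sketch below turns one line with several indent markers into a single table row, one cell per segment. It is an assumption about the general approach, not the experiment described above or the code in any commit: `Segment`, `renderRow`, and `pxPerLevel` are hypothetical, and the text is assumed to be HTML-escaped already.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of one line: each segment starts at an absolute indent.
struct Segment {
    int indent;        // indent level at which this segment starts
    std::string text;  // assumed to be HTML-escaped already
};

// Render one line as a table row so that every segment starts at its own
// "tabstop" and wrapped text stays inside its cell.
std::string renderRow(const std::vector<Segment>& segs, int pxPerLevel) {
    std::string html = "<table cellspacing=\"0\" cellpadding=\"0\"><tr>";

    // An empty spacer cell provides the leading indent.
    if (!segs.empty() && segs.front().indent > 0) {
        html += "<td width=\"" +
                std::to_string(segs.front().indent * pxPerLevel) + "\"></td>";
    }

    for (std::size_t i = 0; i < segs.size(); ++i) {
        // A cell is as wide as the distance to the next segment's indent.
        // The last cell, or one followed by a smaller indent, gets no width.
        int span = (i + 1 < segs.size()) ? segs[i + 1].indent - segs[i].indent : 0;
        if (span > 0) {
            html += "<td width=\"" + std::to_string(span * pxPerLevel) + "\">";
        } else {
            html += "<td>";
        }
        html += segs[i].text + "</td>";
    }

    html += "</tr></table>";
    return html;
}
```

Books whose indent levels decrease within a line, or whose first column is wider than the gap to the next tabstop, would need extra handling, which is one way the inconsistency between EPWINGs could show up.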

Anyway, closing this with 4560b6e9e52a6b196ad6e32b0aa6f75e8258bfa4. The formatting may not be perfect, but at least it's not weird anymore.