opendocument-app / OpenDocument.core

C++ library that translates office documents to HTML
GNU General Public License v3.0

css bugs in resource_data.cpp #330

Open jneutron4321 opened 12 months ago

jneutron4321 commented 12 months ago

Hello, I tested some files with version 3.0.0 and found a couple of CSS bugs in the generated HTML. The bugs are in the resource_data.cpp file:

1-Open this pptx file: https://freetestdata.com/wp-content/uploads/2021/09/Free_Test_Data_100KB_PPTX.pptx. Nothing is visible because the x-p font size is 0: x-p{display:block;font-size:0}. This is the size that is applied to the inner text. This does not happen with spreadsheets because there is another x-p style that does contain a valid size: x-p{font-family:Arial,serif;font-size:10pt}

2-If we open a spreadsheet that has long text in its cells, the best that we can do is truncate it and display ellipsis. But what happens is that the long text overlaps the text of the cells below it. I think that is caused because td{height:inherit;text-overflow:ellipsis;vertical-align:bottom} is missing overflow:hidden. Adding it does truncate the text, so the spreadsheet displays more cleanly, although I have not been able to display the ellipsis character (...).
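
A minimal sketch of the workaround from item 2, applied by post-processing the generated HTML instead of changing resource_data.cpp. Python is used only for illustration. `TD_RULE` copies the rule as quoted above, `clip_long_cells` is a hypothetical helper name, and the sketch assumes the rule appears in the output exactly as written here.

```python
# Rule text as quoted in this issue; real output may serialize it differently.
TD_RULE = "td{height:inherit;text-overflow:ellipsis;vertical-align:bottom}"


def clip_long_cells(html: str) -> str:
    """Append overflow:hidden to the td rule so long text is clipped
    instead of spilling over the cells below it."""
    patched = TD_RULE[:-1] + ";overflow:hidden}"
    # str.replace leaves the input unchanged if the rule is not found.
    return html.replace(TD_RULE, patched)
```

On the missing ellipsis: text-overflow only takes effect when text overflows horizontally inside a block container whose overflow is not visible, which in practice also needs white-space:nowrap on that element. That may be why the ellipsis did not appear, but it has not been checked against the generated markup.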

Also, it might be useful to switch between displaying the full text and its truncated version in a given cell. This only requires handling the cell's onClick event and adjusting the row height by changing the x-p display property from x-p{display:block} to x-p{display:inline}.

If you need more details, please let me know!

andiwand commented 12 months ago

Hi @jneutron4321 !

Version 3 is not stable yet and needs more work. Maybe I should make this clearer in the Releases.

Can you confirm that these files are displayed correctly in the previous 2.x version? The goal is that V3 displays files the same as 2.x, or better where something was fixed.

jneutron4321 commented 12 months ago

Hello @andiwand:

Yes, I can confirm that these files have the same problem in versions 2.1.0 through 3.0.0: /src/internal/resource_data.cpp is basically the same file in all of them. Version 2.0.0 probably has the same problem as well, in /src/internal/html/common.cpp.

I attached the sample files:

test-files.zip

DIRECTORY: /test-files/ods/
ORIGINAL FILE: test-ods.ods
CONVERTED FILE: sheet0.html
FIXED FILE: sheet0.html.fix.html

DIRECTORY: /test-files/pptx/
ORIGINAL FILE: ms-presentation.pptx
CONVERTED FILES: slide0.html slide1.html slide2.html slide3.html slide4.html
FIXED FILES: slide0.html.fix.html slide1.html.fix.html slide2.html.fix.html slide3.html.fix.html slide4.html.fix.html

Honestly, pptx and odp file conversions need big improvements: you can see the text is overlapped; and the original background is not displayed, which is a problem if the text is white (you can't see it). But that is a matter for another issue :)

Best regards!

TomTasche commented 11 months ago

> Honestly, pptx and odp file conversions need big improvements

Time is our constraint here, because this is just a side project for us... Are you interested in working with us to improve this library? We've been looking for freelancers for a while now but couldn't find one.

jneutron4321 commented 11 months ago

Hello Tom,

I really appreciate the invitation to work with you, but I am not very good with C++. I feel more comfortable with C#, Java, and HTML/JS.

I wanted to learn to use WebView, and I also needed a simple app just for viewing my ODF files, so I decided to make my own APK. It is very small: the only Java file is MainActivity, which is less than 20 KB, and I use your library like a "black box". After fixing the small CSS problems mentioned above (by modifying the output HTML), the library works very well with most of my ODFs (documents and spreadsheets).

In the case of presentations, I examined the sample file and realized that the ODF format is more complex than I thought; backgrounds and graphics in particular are a challenge. As a quick fix I just set the body background-color, but I am aware that this cannot be solved in a few days.
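
A minimal sketch of that quick fix, written in Python purely for illustration (the app itself does this in Java). `set_body_background` is a hypothetical helper, the default color is an arbitrary placeholder, and the sketch assumes the converted slide contains a `</head>` tag to inject before.

```python
def set_body_background(html: str, color: str = "#444444") -> str:
    """Inject a body background-color rule so white text stays readable."""
    rule = "<style>body{background-color:" + color + "}</style>"
    if "</head>" in html:
        # Insert just before the first closing head tag.
        return html.replace("</head>", rule + "</head>", 1)
    # Fallback for fragments without a head element: prepend the rule.
    return rule + html
```

This does not restore the original slide background; it only picks one flat color for the whole page.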

Best regards!

andiwand commented 11 months ago

> Honestly, pptx and odp file conversions need big improvements: you can see the text is overlapped; and the original background is not displayed, which is a problem if the text is white (you can't see it). But that is a matter for another issue :)

Which odp are you referring to here? OOXML definitely needs work - I am mostly happy if the text appears. ODF, on the other hand, should be rather stable.

> 2-If we open a spreadsheet that has long text in its cells, the best that we can do is truncate it and display ellipsis. But what happens is that the long text overlaps the text of the cells below it. I think that is caused because

Overflow is difficult to sort out correctly in HTML. I remember that I tried to match this up with LibreOffice, but there was no simple way. We basically have to choose whether the text should appear correctly or the cell should have the right size.

andiwand commented 11 months ago

> 1-Open this pptx file: https://freetestdata.com/wp-content/uploads/2021/09/Free_Test_Data_100KB_PPTX.pptx. Nothing is visible because the x-p font size is 0:

I looked into this one briefly. The font size for the text in this PPTX seems to be hidden somewhere, and I cannot even find it manually. The reason for the default of 0 is that it renders empty paragraphs correctly. To resolve this issue we need a better understanding of OOXML.
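
Until the real font size can be read from the PPTX, one stopgap is to patch the generated HTML. The following is a minimal sketch in Python, not the library's behavior: `patch_zero_font_size` is a hypothetical helper, the 10pt fallback is simply borrowed from the spreadsheet rule quoted earlier in this issue, and the rule text is assumed to appear in the output exactly as quoted.

```python
# Default paragraph rule as quoted in this issue.
ZERO_RULE = "x-p{display:block;font-size:0}"


def patch_zero_font_size(html: str, fallback: str = "10pt") -> str:
    """Swap the zero font size for a fallback so paragraph text is visible."""
    patched = "x-p{display:block;font-size:" + fallback + "}"
    # Output is returned unchanged if the rule is not present.
    return html.replace(ZERO_RULE, patched)
```

The trade-off is that this may change how empty paragraphs render, which is exactly what the default of 0 was meant to get right.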

jneutron4321 commented 11 months ago

> Which odp are you referring to here? OOXML definitely needs work - I am mostly happy if the text appears. ODF, on the other hand, should be rather stable.

Both odp and pptx have a problem when the text is white, but yes, the odp display is better: at least the text is in its position and has a font size greater than 0. Here is the sample file with my fix:

files-odp.zip