Open lavens opened 3 weeks ago
Welcome! Thanks for posting your first issue. The way things work here is that while customer issues are prioritized, other issues go into our backlog where they are assessed and fitted into the roadmap when suitable. If you need to get this done, consider buying a license which also enables you to use it in your commercial products. More information can be found on https://unidoc.io/
Hi @lavens, thank you for reporting this issue and for including the sample files and code. We were able to reproduce it and have created a ticket to look into it. We will post an update as soon as we figure this out.
Description
When I extract text from a PDF that contains a table, where the table content is formatted with underline, each new line of text within a cell is reversed. Once the underline formatting is removed, the text is extracted in order as expected.
Expected Behaviour
I am able to extract text from a table with the order of text preserved regardless of the formatting applied.
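One way to state this expectation as a check, as a minimal sketch only: compare the text extracted from the two attached PDFs line by line and require the same non-empty lines in the same order. The helper names `sameLineOrder` and `nonEmptyLines` are hypothetical, the sample strings are made up rather than taken from the attachments, and the extraction step itself is not shown.

```go
package main

import (
	"fmt"
	"strings"
)

// nonEmptyLines splits extracted text into lines, trims surrounding
// whitespace, and drops blank lines.
func nonEmptyLines(s string) []string {
	var out []string
	for _, line := range strings.Split(s, "\n") {
		if t := strings.TrimSpace(line); t != "" {
			out = append(out, t)
		}
	}
	return out
}

// sameLineOrder reports whether two extracted texts contain the same
// non-empty lines in the same order (hypothetical helper).
func sameLineOrder(a, b string) bool {
	la, lb := nonEmptyLines(a), nonEmptyLines(b)
	if len(la) != len(lb) {
		return false
	}
	for i := range la {
		if la[i] != lb[i] {
			return false
		}
	}
	return true
}

func main() {
	// Made-up cell contents, not the text of the attached PDFs.
	plain := "first line\nsecond line\n"
	underlined := "second line\nfirst line\n"

	fmt.Println(sameLineOrder(plain, plain))
	fmt.Println(sameLineOrder(plain, underlined))
}
```

With the behaviour described in this issue, a comparison like this between the output of the underlined file and the output of the plain file would be expected to fail.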
Actual Behaviour
Steps to reproduce the behaviour:
Attachments
Without Underline Formatting.pdf
With Underline Formatting.pdf
Output with underline formatting:
Output without underline formatting: