radkovo / Pdf2Dom

Pdf2Dom is a PDF parser that converts documents to an HTML DOM representation. The resulting DOM tree may then be serialized to an HTML file or processed further. A command-line utility for converting PDF documents to HTML is included in the distribution package. Pdf2Dom may also be used as an independent Java library with a standard DOM interface for your DOM-based applications, or as an alternative parser for the CSSBox rendering engine in order to add PDF processing capability to CSSBox. Pdf2Dom is based on the Apache PDFBox™ library.
http://cssbox.sourceforge.net/pdf2dom/
GNU Lesser General Public License v3.0
179 stars, 71 forks

font size is changing #28

Open 1289naveen opened 6 years ago

1289naveen commented 6 years ago

When converting from PDF to HTML, the font size in the style tag changes, which causes text to overlap. Please check this. Thank you.

AdeshAtole commented 6 years ago

@1289naveen Can you provide the PDF you used?

1289naveen commented 6 years ago

sampletesting2.pdf Please check this file. It is a sample in which only some parts of the text overlap. I won't share my own file because it contains sensitive information, so please check the attached one.

xishaoisnewer commented 6 years ago

Hi all, I ran into the same problem. Does anyone have a solution?

weitiancai commented 5 years ago

Oh, bad luck, I ran into the same problem too. I think the only way to fix it is to change every HTML tag's width, letter-spacing, or font size, but how can I do that? Is there a function that can do it?

d55rrr commented 5 years ago

I have solved this problem. In my case, the overlap was caused by a negative letterSpacing value. I modified the constructor of class BoxStyle and added: if (src.getLetterSpacing() < 0) { this.setLetterSpacing(0); }
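
For anyone who wants to see the idea in isolation, here is a minimal, self-contained sketch of that clamp. SketchBoxStyle is a hypothetical stand-in written for this example, not the real Pdf2Dom BoxStyle class; its fields, its toCss helper, and the "pt" unit are assumptions. Only the getLetterSpacing/setLetterSpacing check mirrors the snippet above.

```java
// Hypothetical stand-in for a text-box style. This is NOT the real
// Pdf2Dom BoxStyle class; field names and units are assumptions.
class SketchBoxStyle {
    private float fontSize;
    private float letterSpacing;

    public SketchBoxStyle(float fontSize, float letterSpacing) {
        this.fontSize = fontSize;
        this.letterSpacing = letterSpacing;
    }

    // Copy constructor: copies the source style, then resets a negative
    // letter spacing to zero, as described in the comment above.
    public SketchBoxStyle(SketchBoxStyle src) {
        this.fontSize = src.getFontSize();
        this.letterSpacing = src.getLetterSpacing();
        if (src.getLetterSpacing() < 0) {
            this.setLetterSpacing(0);
        }
    }

    public float getFontSize() { return fontSize; }
    public float getLetterSpacing() { return letterSpacing; }
    public void setLetterSpacing(float letterSpacing) { this.letterSpacing = letterSpacing; }

    // Renders the style as an inline CSS fragment (assumed "pt" units).
    public String toCss() {
        return "font-size:" + fontSize + "pt;letter-spacing:" + letterSpacing + "pt;";
    }
}

class LetterSpacingClampDemo {
    public static void main(String[] args) {
        SketchBoxStyle tight = new SketchBoxStyle(12.0f, -0.8f);
        SketchBoxStyle copy = new SketchBoxStyle(tight);
        System.out.println("source: " + tight.toCss());
        System.out.println("copy:   " + copy.toCss());
    }
}
```

Note that resetting negative spacing to zero is a trade-off: it should stop glyphs from being pulled on top of each other, but a line may then render wider than it does in the PDF.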

l7810 commented 2 years ago

"Oh, bad luck, I ran into the same problem too. I think the only way to fix it is to change every HTML tag's width, letter-spacing, or font size, but how can I do that? Is there a function that can do it?"

Has the problem been solved?