sungaila / PDFtoImage

A .NET library to render PDF files into images.
https://www.sungaila.de/PDFtoImage/
MIT License

https://www.sungaila.de/PDFtoImage/ Conversion loss text #64

Closed sibo-git closed 5 months ago

sibo-git commented 5 months ago

PDFtoImage version

3.0.0

OS

Linux

OS version

Ubuntu 22.04

Architecture

x64

Framework

.NET (Core)

App framework

ASP.NET Core

Detailed bug report

[image] A lot of data is lost after conversion.

sibo-git commented 5 months ago

[image] Source PDF file

sibo-git commented 5 months ago

Is it related to Chinese? I found that none of the Chinese characters in the text were rendered into the image.

sungaila commented 5 months ago

Hello @sibo-git, your problem might be related to this issue: #55 Sometimes the text is not rendered correctly if the output image is too big.

Have you tried lowering the DPI setting from 300 to something like 100?

sibo-git commented 5 months ago

Hello, I think my problem is caused by a font issue: the image itself is rendered completely, but some parts of the Chinese text are not displayed. I used the online version at https://www.sungaila.de/PDFtoImage/ and the problem still exists after lowering the DPI to 100.

sibo-git commented 5 months ago

内蒙古交科路桥建设有限公司加油站-96.00-2023年08月10日.pdf is the source file.

sungaila commented 5 months ago

Can confirm, the text is not rendered correctly. However, this seems to be a problem related to PDFium because the text is missing on Google Chrome and Microsoft Edge as well. So there is nothing I can do about it myself (this library is just a wrapper for PDFium).

Adobe Acrobat Reader renders the PDF just fine so this proves your PDF is not corrupted.

sibo-git commented 5 months ago

Thanks. This issue only exists on Linux and not on Windows, so I am looking for a way to add fonts to PDFium.
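A minimal sketch for checking the font theory on the Linux host, assuming the missing text comes from the system having no fonts that cover Chinese (nothing in this thread confirms that). It asks fontconfig, via `fc-list`, how many installed font files declare support for `lang=zh`. The function names `count_zh_fonts` and `report_zh_fonts` are made up for illustration, and `fonts-noto-cjk` is only one candidate Ubuntu package, not a confirmed fix.

```shell
#!/bin/sh
# Sketch: check whether fontconfig knows any font that covers Chinese.
# Assumption: missing CJK fonts on the Linux host are the cause; the thread
# does not confirm this.

# Print the number of installed font files that declare support for lang=zh.
# Returns 1 if the fc-list tool (part of fontconfig) is not available.
count_zh_fonts() {
    command -v fc-list >/dev/null 2>&1 || return 1
    fc-list :lang=zh file | wc -l | tr -d ' '
}

# Print a short human-readable report; always returns 0.
report_zh_fonts() {
    if n=$(count_zh_fonts); then
        if [ "$n" -eq 0 ]; then
            echo "No fonts covering Chinese found."
            echo "Candidate fix on Ubuntu: sudo apt-get install fonts-noto-cjk"
        else
            echo "$n font file(s) covering Chinese found."
        fi
    else
        echo "fc-list not found; install the fontconfig package first."
    fi
}

report_zh_fonts
```

If the count is zero, installing a CJK font package and rendering the PDF again would show whether fonts are really the cause. If fonts are already present and the text is still missing, the problem more likely lies in PDFium itself.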

sungaila commented 5 months ago

Unfortunately, this looks like a font issue in PDFium that I cannot build a workaround for. You can create a ticket on the PDFium bugtracker and hope someone gets this fixed.

Once a bugfix has been made in PDFium, this library (PDFtoImage) will be updated and fixed as well.