ArtifexSoftware / pdf2docx

Open source Python library for converting PDF to DOCX.
https://pdf2docx.readthedocs.io
GNU Affero General Public License v3.0
2.46k stars, 356 forks

transfer error: unsupported colorspace for '{output}' #277

Closed (plainee closed this issue 1 week ago)

plainee commented 5 months ago

The log is as follows:

[INFO] [1/4] Opening document...
[INFO] [2/4] Analyzing document...
unsupported colorspace for '{output}'

Keguans commented 5 months ago

The converter appears to have hit a colorspace in the PDF that it does not currently support. I ran into the same problem yesterday, and I hope the author can look into it.

greendreamer commented 2 weeks ago

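A ValueError is raised when the image's colorspace is not supported for the requested output image type. In that case you can avoid the exception by requesting a different format, for example image.tobytes('jpg'), or by replacing the offending image in your PDF with a valid one.

The incorrect output will be fixed as well.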
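
The workaround above can be sketched as a small helper. This is only an illustration, not the fix that pdf2docx ships: the name pixmap_to_bytes and its parameters are hypothetical, and it assumes the object passed in behaves like a PyMuPDF Pixmap, whose tobytes(output) method raises ValueError when the colorspace cannot be written in the requested format.

```python
def pixmap_to_bytes(pix, preferred="png", fallback="jpg"):
    """Serialize a Pixmap-like object, retrying with a fallback format.

    Hypothetical helper: `pix` is assumed to expose tobytes(output),
    as PyMuPDF's Pixmap does. Returns a (data, format) tuple.
    """
    try:
        return pix.tobytes(preferred), preferred
    except ValueError:
        # Reported in this issue as "unsupported colorspace for ...".
        # Retry with the fallback format suggested in the comment above.
        # If the fallback also fails, its ValueError propagates.
        return pix.tobytes(fallback), fallback
```

With PyMuPDF installed, the helper would be called on a pixmap extracted from the document, e.g. data, fmt = pixmap_to_bytes(pix). Note that JPEG cannot store an alpha channel, so a pixmap with transparency may still fail in the fallback step and would need separate handling.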

greendreamer commented 2 weeks ago

Some pixmap-handling logic has been implemented in pdf2docx and revised over several versions, so there may still be some issues. Please provide an example PDF.

greendreamer commented 1 week ago

Closing this for lack of response over an extended period of time. Feel free to open a new issue, but please include a reproducing example.