VikParuchuri / marker

Convert PDF to markdown quickly with high accuracy
https://www.datalab.to
GNU General Public License v3.0
16.82k stars 955 forks

Found nonstandard filetype xml 1.0 document, ascii text, with very long lines (2528) #103

Closed: SmallBlueWolf closed this issue 5 months ago

SmallBlueWolf commented 5 months ago

I installed marker as described in the README, but when I ran convert_single.py I got the error in the title. The file I converted is a PDF that opens normally, so I don't know why this happened.

Command used: python convert_single.py ~/bluewolf/0001.pdf ~/bluewolf/ICLR2013/0001.md image

The environment is Ubuntu 22.04 with a 2080 Ti GPU, and PyTorch detects CUDA correctly.
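
The error message suggests that file type detection saw XML text rather than a PDF, so it may help to look at the first bytes of the input file. The sketch below is only a diagnostic under that assumption: `looks_like_pdf` and `first_bytes` are hypothetical helper names, not part of marker, and the 1024-byte window is an assumption based on the common reader behavior of accepting a `%PDF-` signature near the start of the file.

```python
from pathlib import Path

# Signature that a PDF file is expected to carry at (or near) its start.
PDF_MAGIC = b"%PDF-"


def first_bytes(path, n=64):
    """Return the first n raw bytes of the file, for manual inspection."""
    with open(Path(path).expanduser(), "rb") as f:
        return f.read(n)


def looks_like_pdf(path, window=1024):
    """Return True if the PDF signature appears in the first `window` bytes.

    Hypothetical helper: this only checks the signature; it does not
    validate the rest of the file.
    """
    return PDF_MAGIC in first_bytes(path, window)
```

Calling `looks_like_pdf("~/bluewolf/0001.pdf")` and printing `first_bytes("~/bluewolf/0001.pdf")` would show whether the file begins with `%PDF-` or with something else, such as an `<?xml` declaration. A viewer opening the file normally does not rule out leading non-PDF content, since some viewers tolerate it.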