Closed by bbfrog 1 month ago
Running this script works in both cases:
import pymupdf4llm
import sys
import pathlib
filename = sys.argv[1]
md = pymupdf4llm.to_markdown(filename, margins=0)
pathlib.Path(filename + ".md").write_bytes(md.encode())
It takes a while to run, of course, because both pages contain more than 1000 drawings.
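Since a long run on a drawing-heavy page can look the same as a hang, one way to tell the two apart is to run the conversion in a child process with a generous time limit. The following is only a sketch under assumptions: `run_with_timeout`, `CONVERT_SCRIPT`, and the 600-second limit are hypothetical names and values chosen for illustration, not part of pymupdf4llm. The embedded script is the same conversion as above.

```python
import subprocess
import sys

# The conversion from the script above, kept as text so it can be
# executed in a separate Python process (CONVERT_SCRIPT is a made-up name).
CONVERT_SCRIPT = """
import pathlib
import sys
import pymupdf4llm

filename = sys.argv[1]
md = pymupdf4llm.to_markdown(filename, margins=0)
pathlib.Path(filename + ".md").write_bytes(md.encode())
"""


def run_with_timeout(script, args, timeout_s):
    """Run `script` in a fresh Python process.

    Returns True if the process exits with status 0 within `timeout_s`
    seconds, and False if it times out or exits with an error.
    """
    cmd = [sys.executable, "-c", script, *args]
    try:
        result = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return False
    return result.returncode == 0


if __name__ == "__main__" and len(sys.argv) > 1:
    # 600 seconds is an arbitrary limit; adjust it to the document size.
    finished = run_with_timeout(CONVERT_SCRIPT, [sys.argv[1]], timeout_s=600)
    print("finished" if finished else "timed out or failed")
```

Running the conversion in a separate process means a page that never finishes can be stopped cleanly, which is not possible if the call runs in the main process or in a thread.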
Thanks very much!
Here are two example PDFs: IASLC.pdf and ENA.pdf
Each has only one page with multiple panels. When I tried pymupdf4llm.to_markdown, the progress display reached 1/1, which looks like it is done, but the program hangs from there. They may be hard cases; please let me know whether this can be fixed. Thanks!