Closed: narsandu closed this issue 5 months ago
You can already do that today! So ~both~ all of the following work exactly the same:
import pymupdf4llm
data = pymupdf4llm.to_markdown("input.pdf")
From a Document:
import pymupdf4llm
import pymupdf
doc = pymupdf.open("input.pdf")
data = pymupdf4llm.to_markdown(doc)
Or from a bytes, bytearray, or io.BytesIO object:
import pymupdf4llm
import pymupdf
import pathlib
pdfdata = pathlib.Path("input.pdf").read_bytes()  # make a memory-resident PDF
doc = pymupdf.open("pdf", pdfdata) # open a memory-based PDF as a Document
data = pymupdf4llm.to_markdown(doc)
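If you want a single entry point in your own code that handles all of these input kinds, something like the following might work. This is only a sketch: `as_pdf_bytes` and `to_markdown_from_any` are hypothetical helper names, not part of pymupdf4llm, and the sketch assumes the input really is a PDF. It reuses the `pymupdf.open("pdf", pdfdata)` call shown above.

```python
import io
import pathlib


def as_pdf_bytes(source):
    """Return raw PDF bytes from a path, bytes, bytearray, or io.BytesIO.

    Hypothetical helper for illustration; not part of pymupdf4llm.
    """
    if isinstance(source, (str, pathlib.Path)):
        return pathlib.Path(source).read_bytes()
    if isinstance(source, io.BytesIO):
        return source.getvalue()
    if isinstance(source, (bytes, bytearray)):
        return bytes(source)
    raise TypeError(f"unsupported PDF source: {type(source).__name__}")


def to_markdown_from_any(source):
    """Convert any supported source to Markdown via a memory-based Document."""
    # Imported lazily so as_pdf_bytes() can be used without PyMuPDF installed.
    import pymupdf
    import pymupdf4llm

    # Open a memory-based PDF as a Document, as in the example above.
    doc = pymupdf.open("pdf", as_pdf_bytes(source))
    try:
        return pymupdf4llm.to_markdown(doc)
    finally:
        doc.close()
```

Usage would then be `to_markdown_from_any("input.pdf")` or `to_markdown_from_any(pdfdata)`. Reading everything into memory first keeps the sketch simple, but for large files passing the path straight to `to_markdown` is probably the better choice.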
thanks
Instead of only a file path as a string, I would also like to pass a fitz object to the 'to_markdown' method. Can you update the input parameters to accept either a file_path or a fitz object?