pymupdf / RAG

RAG (Retrieval-Augmented Generation) Chatbot Examples Using PyMuPDF
https://pymupdf.readthedocs.io/en/latest/pymupdf4llm
GNU Affero General Public License v3.0

Pprados/fix password #170

Closed: pprados closed this 2 weeks ago

pprados commented 1 month ago

Add support for passing a password when loading the PDF file.

pprados commented 3 weeks ago

@JorjMcKie can you review this PR?

JorjMcKie commented 2 weeks ago

This is unnecessary because you can always pass a Document object instead of a filename string. If we were to start making changes like this, we would replicate code that is already present in the parent package, PyMuPDF. I have already rejected one change that requested support for pathlib.Path specifications instead of a string-based filename. To support decryption, simply do this:

import pymupdf
import pymupdf4llm

doc = pymupdf.open("encrypted.pdf")
doc.authenticate(password)  # returns 0 if the password is rejected
md = pymupdf4llm.to_markdown(doc, ...)
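The same idea can be wrapped in a small helper that also checks whether authentication succeeded. This is only a sketch: `unlock` is a hypothetical name, not part of PyMuPDF or pymupdf4llm, and it assumes nothing beyond the `needs_pass` property and the `authenticate()` method of a PyMuPDF Document, where a return value of 0 means the password was rejected.

```python
def unlock(doc, password):
    """Authenticate `doc` if it is password protected; raise on failure.

    `doc` is expected to behave like a pymupdf.Document, i.e. to expose
    `needs_pass` and `authenticate(password)`. This helper is hypothetical
    and not part of either library.
    """
    if doc.needs_pass:
        # authenticate() returns 0 when the password is rejected
        if doc.authenticate(password) == 0:
            raise ValueError("wrong password")
    return doc


# Intended use (requires pymupdf and pymupdf4llm to be installed):
#   import pymupdf, pymupdf4llm
#   doc = unlock(pymupdf.open("encrypted.pdf"), password)
#   md = pymupdf4llm.to_markdown(doc)
```

Keeping the check outside of pymupdf4llm matches the point made above: opening and decrypting the file stays the job of PyMuPDF, and `to_markdown` only ever receives a Document that is ready to read.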

pprados commented 2 weeks ago

ok