pymupdf / RAG

RAG (Retrieval-Augmented Generation) Chatbot Examples Using PyMuPDF
https://pymupdf.readthedocs.io/en/latest/pymupdf4llm
GNU Affero General Public License v3.0

Make it only one pymupdf4llm module instead of two (pymupdf4llm, pdf4llm) #82

Closed dantetemplar closed 3 months ago

dantetemplar commented 3 months ago

I think that deprecating the pdf4llm module will improve the developer experience.

For example, I had many questions about why it was done this way and how exactly the imports in the pdf4llm folder work.

It is possible to release a patch to pdf4llm that emits a DeprecationWarning when the module is imported.
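A minimal sketch of what such a patch could look like, using only the standard `warnings` module. The helper name `warn_deprecated_alias` and the message wording are my own illustration, not taken from the actual package or from any commit:

```python
import warnings


def warn_deprecated_alias(old_name: str, new_name: str) -> None:
    """Emit a DeprecationWarning that points users from old_name to new_name.

    Hypothetical helper for illustration; not part of pymupdf4llm or pdf4llm.
    """
    warnings.warn(
        f"The '{old_name}' module is deprecated; import '{new_name}' instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller of this helper
    )


# Assumed usage at the top of pdf4llm/__init__.py, before the re-export:
#
#     warn_deprecated_alias("pdf4llm", "pymupdf4llm")
#     from pymupdf4llm import *
```

Note that Python's default filters hide DeprecationWarning unless it is triggered by code in `__main__`, so many users would only see it when running with `-W default` or under a test runner that enables warnings.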

I will attach a PR.

dantetemplar commented 3 months ago

I noticed that you do a lot for pymupdf. It's really cool, and I want to help you :3

@JorjMcKie

dantetemplar commented 3 months ago

To deprecate it, build and publish pdf4llm with the changes from https://github.com/dantetemplar/pymupdf4llm/commit/d708fbdc8920b82570867c2fdf9e1dd6455f9ad1

dantetemplar commented 3 months ago

I also suggest moving material that is not related to the pymupdf4llm package out of the repository.

JorjMcKie commented 3 months ago

Thank you for your appreciation and your contributions! I am glad you like it. There are good reasons why we have and will keep pdf4llm as an alias. These reasons are not technical, therefore your considerations - while valid from a technical perspective - do not apply here.

jackbravo commented 3 months ago

What are those reasons? 😅