pymupdf / RAG

RAG (Retrieval-Augmented Generation) Chatbot Examples Using PyMuPDF
https://pymupdf.readthedocs.io/en/latest/pymupdf4llm
GNU Affero General Public License v3.0
303 stars 57 forks

Specified encoding when opening to fix undefined characters #1

Closed: dragoa closed this 5 months ago

dragoa commented 5 months ago

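Specifying encoding="utf-8" when opening the output file ensures it is written as UTF-8 rather than the platform's default encoding. UTF-8 can represent a wider range of characters, so writing no longer raises encoding errors or produces undefined characters.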
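A minimal sketch of the idea, not the script's actual code: the sample string, the helper names `write_text_utf8` and `read_text_utf8`, and the temporary output path are all hypothetical and only illustrate passing `encoding="utf-8"` to Python's built-in `open()`.

```python
import tempfile
from pathlib import Path

# Hypothetical sample standing in for text extracted from a PDF.
sample_text = "Café, naïve, 日本語, ✓"


def write_text_utf8(path, text):
    """Write text as UTF-8 regardless of the platform's default encoding."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def read_text_utf8(path):
    """Read the file back using the same explicit encoding."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


# Write to a throwaway location and read the text back.
out_path = Path(tempfile.mkdtemp()) / "output.txt"
write_text_utf8(out_path, sample_text)
round_trip = read_text_utf8(out_path)
```

Without the `encoding` argument, `open()` uses a locale-dependent default, which on some systems cannot encode characters like these.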

JorjMcKie commented 5 months ago

Thank you for your PR. You are quite right - I have changed the script to use UTF-8-encoded text output.