-
### Question
Hi.
In the cookbook about PDF extraction https://docs.mirascope.io/latest/cookbook/extract_from_pdf/, `PyMuPDF` is claimed to be used in the text, but all the examples feature package n…
-
This is a feature request.
As Python indexing starts at `0`, it would be natural to have the option to start `pdf2image.generators.counter_generator` at `0`. Currently, it can only sta…
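
A minimal sketch of what the requested option could look like. `counter_from` and its `start` parameter are hypothetical and are not part of `pdf2image`; the `prefix`, `suffix`, and `padding_goal` names are assumptions chosen to resemble a counter-style filename generator.

```python
from itertools import count
from typing import Iterator


def counter_from(
    start: int = 0,
    prefix: str = "",
    suffix: str = "",
    padding_goal: int = 4,
) -> Iterator[str]:
    """Yield zero-padded counter strings, beginning at `start` (hypothetical helper)."""
    for i in count(start):
        # Pad the number to `padding_goal` digits, then wrap it in prefix/suffix.
        yield f"{prefix}{str(i).zfill(padding_goal)}{suffix}"


# Usage: take the first three names, counting from 0.
names = counter_from(start=0, prefix="page-")
first_three = [next(names) for _ in range(3)]
```

Defaulting `start` to `0` matches Python indexing, while passing `start=1` would give one-based numbering for callers who prefer it.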
-
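Supports fast conversion of both files and folders, and handles the quotes that the system automatically adds when a path containing spaces is dragged in.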
```python
import datetime
import os
import fitz  # fitz is provided by: pip install PyMuPDF
def pyMuPDF_fitz(pdfPath, imagePath, amplification):
    startTime_pdf2img =…
-
**Is your feature request related to a problem? Please describe.**
YES. I'm doing RAG on a group of Brazilian laws, and I think the problem applies to the whole RAG/LLM community.
(I'm new to RAG)…
-
https://pymupdf.readthedocs.io/en/latest/app1.html#performance
-
In PyMuPDF I can do this on PDF files that have a text layer:
```
text_dict = page.get_text("dict")
for bl in text_dict['blocks']:
    for line in bl.get('lines', []):
        for s…
-
In `project.toml` we list PyPDF2 as a dependency. Do we actually use it? I wasn't able to find any import of it. I mostly ask because upstream I've added a dependency on PyMuPDF, which apparent…
-
### Description of the bug
Not sure if this should be reported in mupdf repo instead, please let me know.
### How to reproduce the bug
The trigger for this crash is somewhat convoluted. KiCad EDA…
-
trying to install `pdf2docx`
- pymupdf = "nixpkgs" ;
---
`"refs/tags/3.4.0"`
using `python3.x` and `python3.xFull`
`The Package 'tkinter' is not available from any of the selected provider…
-
Hi there, when I create a Word document that contains a single table (e.g., with 6 columns and 6 rows), insert some dummy text, and save it as a PDF, `to_markdown` throws an error if `extract_words=…