-
## Description
A dataset that can be used to read PDF documents and extract relevant parts such as tables, figures, and textual content.
## Context
This could be useful for projects that n…
-
Your code is licensed under Apache 2.0, a very liberal license that lets people use your software even for commercial purposes. However, PyMuPDF is one of your requirements. This softw…
-
Trying to install `pdf2docx`
- pymupdf = "nixpkgs" ;
---
`"refs/tags/3.4.0"`
using `python3.x` and `python3.xFull`
`The Package 'tkinter' is not available from any of the selected provider…
-
Hi,
Please find below a less vulnerable Docker setup as an improvement suggestion.
It reduces the problem from this [8C, 34H, 32M, 98L issues]:
..> docker scout quickview
![image](https://github.…
-
### Description of the bug
I want to remove all text and keep only the vector graphics (such as straight lines) in a PDF. The code and result are shown below.
However, the original PDF does not contain…
-
Thanks for your great work! However, it still has some problems. I have a PDF that is not scanned (you can select the words in the file). When using your method, it recognizes 'benefit' as 'benets'. …
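
A dropped "fi" like this often points at the ligature character U+FB01, though that is only an assumption here since the PDF itself is not shown. The sketch below is a diagnostic and not part of the library: the helper names `find_ligatures` and `expand_ligatures` and the sample string are hypothetical. It checks whether extracted text still contains Latin ligature code points and expands them with Unicode NFKC normalization. If the extractor has already discarded the glyph entirely, as 'benets' suggests, this cannot recover it.

```python
import unicodedata

# Latin ligature code points in the Alphabetic Presentation Forms block (U+FB00..U+FB06).
LIGATURES = {chr(cp) for cp in range(0xFB00, 0xFB07)}


def find_ligatures(text):
    """Return (index, character) pairs for every ligature code point in text."""
    return [(i, ch) for i, ch in enumerate(text) if ch in LIGATURES]


def expand_ligatures(text):
    """Expand compatibility characters such as U+FB01 into plain letters.

    NFKC also rewrites other compatibility characters (for example
    superscript digits), so apply it only where that is acceptable.
    """
    return unicodedata.normalize("NFKC", text)


# Hypothetical extracted text that still holds the 'fi' ligature.
sample = "bene\ufb01ts"
print(find_ligatures(sample))
print(expand_ligatures(sample))
```

If `find_ligatures` returns an empty list for the raw extraction, the character was lost before this step and the fix would have to happen in the extraction itself.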
-
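Installation and runtime environment of the plugin's host app (ChatGPT on Wechat): DigitalOcean App Platform, running as an app inside a Docker container.
An error occurred while installing the plugin: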
Installing collected packages: XlsxWriter, python-docx, PyMuPDFb, markdown, et-xmlfile, python-pptx, …
-
**Is your feature request related to a problem? Please describe.**
Adding an image with an arbitrary transformation is not currently supported, but it is needed when trying to rebuild a PDF from the…
-
### Description of the bug
When trying to create a pixmap for a PDF file, the `get_pixmap` function takes too long even though the page size is well within the required limits.
```
pdf_in = fitz.open(…
-
**Describe the bug**
When using the coordinates of elements for bounding boxes, the coordinates differ between the default strategy and the 'hi_res' strategy.
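
One possible explanation, which is an assumption and not confirmed here, is that the two strategies report coordinates in different spaces: PDF points (72 per inch) in one case and pixels of a rendered page image in the other. The sketch below shows how such boxes could be compared after rescaling. The function names and the `dpi=200` default are hypothetical and not taken from the library.

```python
def points_to_pixels(bbox, dpi=200):
    """Scale an (x0, y0, x1, y1) box from PDF points to pixels.

    dpi is an assumed rendering resolution; use the value that was
    actually used to rasterize the page.
    """
    return tuple(v * dpi / 72.0 for v in bbox)


def pixels_to_points(bbox, dpi=200):
    """Scale an (x0, y0, x1, y1) box from pixels back to PDF points."""
    return tuple(v * 72.0 / dpi for v in bbox)


# Hypothetical box in PDF points: one inch from the origin, two inches wide and tall.
box_pt = (72, 72, 216, 216)
box_px = points_to_pixels(box_pt)
print(box_px)
print(pixels_to_points(box_px))
```

If the boxes still disagree after rescaling, the difference is more likely in how each strategy detects elements than in the units.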
**To Reproduce**
```
sudo apt-get inst…