-
# Description
## User Stories
* > As a teacher who frequently works with PDFs, I would like to have a quick way to create PDFs so that I don't have to open Collabora/OnlyOffice etc. to create PD…
-
### Bug
Tested on Dolibarr 16 / 19 / dev
Nginx and PHP 7.4 and 8.2 (for 19 and dev)
With LibreOffice 24 installed, the ODT generated via Dolibarr won't convert to PDF.
When opening the ODT…
-
Requesting a version of PDF OCR that runs Tesseract OCR only on the images embedded in a PDF, instead of capturing the whole page of the PDF.
A lot of my professors use PowerPoint slides converted to PDF, the t…
-
An existing tool such as pandoc (which supports many other formats, though for now I would limit this to PDF) could be used. If the feature is of interest, I could try to implement it.
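A rough sketch of what the pandoc route could look like, assuming the feature means producing PDF output and that pandoc plus a PDF engine (pandoc defaults to pdflatex) are installed. `build_pandoc_command` and `convert_to_pdf` are hypothetical helper names used only for illustration, not part of any existing code.

```python
import shutil
import subprocess
from pathlib import Path


def build_pandoc_command(source, target):
    """Return the argv list for a plain `pandoc SOURCE -o TARGET` call."""
    # pandoc infers the output format from the target's file extension.
    return ["pandoc", str(source), "-o", str(target)]


def convert_to_pdf(source):
    """Convert `source` to a PDF next to it and return the PDF path."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    target = Path(source).with_suffix(".pdf")
    # check=True raises CalledProcessError if pandoc exits with an error.
    subprocess.run(build_pandoc_command(source, target), check=True)
    return target
```

Keeping the command construction separate from the `subprocess` call makes it possible to check the arguments without having pandoc installed.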
-
Two options for the PDF:
1. Always convert pdf -> jpg with pdf2image, then run PaddleOCR. The API returns the text boxes plus the JPG to the front end for display.
2. If the PDF contains tex…
-
Is there no way to upload a JPEG? My converter won't let me convert PDF to JPG.
-
### Bug
For tables where most of the columns are empty and one column is completely filled, the table that docling extracts truncates the values in the filled column.
### Steps to reproduce
I ha…
-
This might be an Inkscape thing, but why is an annotation with `EPset_Annotation.Classes = fill-bg` pixelated when the SVG is converted to a PDF via Inkscape?
I know with `fill-bg` there's…
-
I am having a ligature issue with this PDF.
The 'fi', 'fl' and 'ff' characters are returned as NULL.
#598 is similar to this issue.
## MVCE: Code + PDF
```python
from PyPDF2 import PdfReader
r…
-
### Bug
On a Windows 11 installation with an ARM64 CPU (UTM virtual machine on a macOS host), docling silently crashes without generating output when a document is converted with OCR enabled (using def…