-
It is currently impractical to import data parsed from PDFs into the database.
A JSON-like format (using Python objects) should be chosen, and then a decoder created.
# Format
A format simi…
-
# Bug report
When an edge function returns a buffer with content type `application/pdf`, the Supabase JS client does not parse the result correctly: it is parsed as a string rather than as a blob.
It is…
-
This is more a question than an issue:
Is it possible to parse tags from the document? I am thinking of the markers for headings, tables, etc.
-
Support extracting numbers from a PDF document
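
One possible shape for this feature, as a minimal sketch only: it assumes the PDF text has already been extracted to a plain string by whatever parser is in use, and the names `NUMBER_RE` and `extract_numbers` are hypothetical, not part of any existing API. It recognizes signed integers and decimals with optional comma thousands separators; locale-specific formats (for example a decimal comma) would need a different pattern.

```python
import re

# Hypothetical pattern: optional sign, then either comma-grouped digits
# (e.g. "1,234") or a plain digit run, then an optional decimal part.
NUMBER_RE = re.compile(r"[-+]?(?:\d{1,3}(?:,\d{3})+|\d+)(?:\.\d+)?")


def extract_numbers(text: str) -> list[float]:
    """Return every number found in already-extracted PDF text, in order."""
    # Strip thousands separators before converting each match to float.
    return [float(m.group().replace(",", "")) for m in NUMBER_RE.finditer(text)]


if __name__ == "__main__":
    sample = "Total: 1,234.56 EUR, items 3, delta -0.5"
    print(extract_numbers(sample))
```

Returning floats keeps the sketch short; a real implementation might prefer `decimal.Decimal` for monetary values.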
-
### Self Checks
- [X] I have searched for existing issues [search for existing issues](https://github.com/langgenius/dify/issues), including closed ones.
- [X] I confirm that I am using English to su…
-
- PHP Version: 8.2
- PDFParser Version: 3 (the last from github)
### Description:
Hello
Very strange characters are returned when parsing a bank details PDF.
### PDF input
[Releve_com…
-
The Colab notebook example has a command that is no longer supported in magic-pdf:
```
!echo "{}" > ~/magic-pdf.json
!magic-pdf pdf-command --pdf "1706.03762.pdf" --model "output/1706.03762.js…
-
Hi, when I use `partition_type(file=io.BytesIO(file.file.read()),languages=["chi_sim"])` to parse Chinese PDF documents, I found that the result splits the paragraph text into line text as a elemet…
-
I tried parsing PDFs today but GROBID seems to leave the author affiliation out for every document.
I used Docker with the GROBID DL model (0.8.1-name-address) and did not specify a consolidation …
-
### Municipality / Region
Hungary/Nyíregyháza
### Collection Calendar Webpage
https://www.eakhulladek.hu//calendar/get.php?d=2024_2284
### Example Address
Nyíregyháza
### Collection Data Format
…