-
### Is there an existing issue for the same bug?
- [X] I have checked the existing issues.
### Branch name
main
### Commit ID
bef1bbdf3e16e5163bc563407bd7fd8f7da97d7a
### Other environment infor…
-
Uploading a small PDF appears to succeed (no errors are reported), but the UI doesn't show the uploaded file and it can't be queried.
Running with OpenAI settings, here are the ingestion results:
`09:43:5…
-
### Question Validation
- [X] I have searched both the documentation and discord for an answer.
### Question
I am parsing a PDF file and extracting information from it. I want to have a source_link…
-
The current PDF library leaves a lot to be desired.
It only works for simple PDFs with plain images and text.
Anything more complex, such as graphs or charts, comes through very poorly.
One i…
-
Hi, I got the following error when testing the library on a PDF exploit generated by the Metasploit module `exploit/windows/fileformat/adobe_pdf_embedded_exe_nojs`: `NameError: name 'name' is not defined`…
-
## Description
When using Marker to extract Chinese characters from some PDF documents, some characters are not extracted at all, while others are extracted as garbled text. Below are three example f…
-
I am working on parsing a PDF form. The pdfparse functions work: my document is parsed. However, I only receive the text surrounding the form fields. I have entered data in the form fields and saved …
-
While working in the online preview, I get this error when running it in a Jupyter notebook:
`Error while parsing the PDF file: Failed to parse the PDF file: {"detail":[{"loc":["body","language",0],"…
-
It's more a question than an issue:
Is it possible to parse tags from the document? I'm thinking of the markers for headings, tables, etc.
-
## Dev Effort
1D
## Description
The `Title` field of [this document](http://wiki.opf-labs.org/download/attachments/101613571/SIP110204_ReColl-124480_1-s2.0-S0370269317301144-main.pdf)'s informati…