-
It would be nice to have an option to split multi-page PDFs into multiple PDFs, like convert but the other way around.
-
### Feature Description
- Detect chapters by finding and interpreting a table of contents.
- Split the source PDF into multiple PDFs: one per chapter. The table of contents should also have its own…
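
The two steps above can be sketched in isolation from any PDF library. This minimal example assumes the table of contents has already been parsed into `(title, first_page)` pairs; the function name `chapter_ranges` and the sample data are hypothetical. It only computes the page range each output PDF would cover, with the table of contents treated as its own entry.

```python
from typing import List, Tuple


def chapter_ranges(
    toc: List[Tuple[str, int]], page_count: int
) -> List[Tuple[str, int, int]]:
    """Turn (title, first_page) entries into (title, first_page, last_page).

    Pages are 1-based and inclusive. Each entry ends on the page before
    the next entry starts; the last entry runs to the end of the document.
    """
    entries = sorted(toc, key=lambda entry: entry[1])
    ranges = []
    for i, (title, start) in enumerate(entries):
        if i + 1 < len(entries):
            end = entries[i + 1][1] - 1
        else:
            end = page_count
        if start < 1 or end < start:
            raise ValueError(f"invalid page range for {title!r}")
        ranges.append((title, start, end))
    return ranges


# Hypothetical 15-page document whose table of contents occupies pages 1-2.
toc = [("Contents", 1), ("Chapter 1", 3), ("Chapter 2", 10)]
ranges = chapter_ranges(toc, page_count=15)
```

Each resulting tuple could then be handed to whatever PDF library does the actual page extraction.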
-
### Feature Description
Hi,
I put a barcode sticker/stamp on the first page of each of my multi-page documents, and then scan the whole batch.
It would be awesome to have a tool that allows splitting the PDF w…
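
A rough sketch of the grouping step such a tool would need, assuming barcode detection has already produced one boolean per scanned page (`True` where a barcode was found). The detection itself is out of scope here, and the name `split_on_separators` is hypothetical.

```python
from typing import List


def split_on_separators(is_separator: List[bool]) -> List[List[int]]:
    """Group 0-based page indices into documents.

    A True flag marks the first page of a new document. The flagged page
    is kept, because the barcode sits on a real first page rather than on
    a dedicated separator sheet.
    """
    documents: List[List[int]] = []
    for index, starts_new in enumerate(is_separator):
        # Pages before the first barcode still form a document of their own.
        if starts_new or not documents:
            documents.append([])
        documents[-1].append(index)
    return documents


# Hypothetical batch of six scanned pages with barcodes on pages 0, 3 and 4.
batch = split_on_separators([True, False, False, True, True, False])
```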
-
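In the current version (0.8.1), when a parsed PDF document has a three-column layout, the paragraphs in the parsing result come out in the wrong order.
![image](https://github.com/user-attachments/assets/55a6c6bc-6fa0-485a-a8fd-5dcc1237642d)
Partial run log: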
2024-09-14 10:20:35.811 | INFO | magic_pdf.model.…
-
This is likely a large problem with PDF parsing.
We have two xfails in the test suite that address this:
![image](https://github.com/user-attachments/assets/20e5d2b9-4e42-4081-995f-f993fef4531e)…
-
When trying to add metadata to an index, either using a list of metadata dicts or a mapping of uid to metadata dict (shown below), it always produces a key error.
Example:
```
RAG = RAGMultiMod…
-
See [this comment](https://github.com/Unstructured-IO/unstructured-js-client/issues/20#issuecomment-2199765184). If a PDF file does not have `.pdf` in the filename, we return the message `Given file i…
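
One way to avoid depending on the filename is to inspect the file's content instead. The sketch below is not the client's implementation; `looks_like_pdf` is a hypothetical helper that checks for the `%PDF-` header that PDF files begin with. The 1024-byte window is a leniency assumption for files with a few stray bytes before the header.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Return True if a PDF header appears near the start of the data."""
    # Assumption: accept the header anywhere in the first 1024 bytes
    # instead of requiring it at offset 0.
    return b"%PDF-" in data[:1024]


# Hypothetical inputs: a PDF header, and a ZIP header for contrast.
pdf_bytes = b"%PDF-1.7\n%\xe2\xe3\xcf\xd3\n"
zip_bytes = b"PK\x03\x04"
```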
-
Hello, thank you for the library! It has been quite useful for merging PDFs.
I was wondering if we could please have an example of splitting PDF Pages into individual Documents. I've tried writin…
-
Works uploaded to the Adventist environment are supposed to render in PDF.js immediately, prior to splitting. This behavior works on production. On staging, an importer of 10 books from the OAI feed r…
-
Allow splitting a PDF into groups of pages instead of single pages.
Clouz updated
1 month ago
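
As a library-independent sketch of the grouping this request describes, the hypothetical helper below computes which 0-based page indices would go into each output file for a given group size. The final group is shorter when the page count is not a multiple of the group size.

```python
from typing import List


def page_groups(page_count: int, group_size: int) -> List[range]:
    """Return consecutive ranges of 0-based page indices, group_size each."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return [
        range(start, min(start + group_size, page_count))
        for start in range(0, page_count, group_size)
    ]


# Hypothetical 7-page document split into groups of 3 pages.
groups = [list(group) for group in page_groups(7, 3)]
```

With `group_size=1` this reduces to the existing one-page-per-file behavior.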