-
After spending a lot of time creating a JSON schema by hand, things get rearranged as soon as you hit the "create" button on the demo site. This impairs the extraction process as the information in th…
-
Trying to extract tabular data (table is embedded as an image) from a PDF file. While I've managed to extract some data, there are consistent errors when the table is located at the bottom of the PDF.…
-
### File Name
gemini/batch-prediction/intro_batch_prediction.ipynb
### What happened?
I'm using the Gemini Batch Prediction API with the Vertex AI Python SDK in a project. When attempting to run ba…
-
### Question Validation
- [X] I have searched both the documentation and discord for an answer.
### Question
@dosu what is the primary difference between MarkdownElementNodeParser and MarkdownNodeParser.…
-
Hi, I run into an issue when opening the PDF of some entries, even though the filename contains only ASCII characters. The following is an example:
```
@inproceedings{binmohdazir2017wrapper,
title = {W…
-
### Issue: Comparing GROBID and Docling for Parsing Scholarly Publications
#### **My Use Case**
We need to parse and extract all relevant information from thousands of scholarly publications, such…
-
### Description:
Create a text extraction module to extract and output the recognized text from the detected segments of the images.
### Tasks:
- Develop a method to extract the text from the i…
-
### Please ask your question
Hello.
I want to use ERNIE-Layout for Korean.
1. Re-train PaddleOCR on a Korean dataset
2. Fine-tune Key Information Extraction (KIE) on a Korean dataset
3. Document Question Answering(DQ…
-
OK, after running `git blame` it seems that the problem described in https://github.com/NNPDF/yadism/pull/161#issuecomment-1335716316 is my fault, caused by something I did a year ago.
This means that `ya…
-
Add this extraction
2024-11-03 19:43:33.992 Warning: Slow write /var/opt/MarkLogic/Forests/modules-1/DiskCheck, 255 B in 3.249 sec
2024-11-03 19:43:34.311 Warning: Slow open /var/opt/MarkLogic/For…