-
### **Command Executed**
```bash
python scripts/ocr.py --config=configs/ocr.yaml
```
---
### **Logs**
Below are the logs leading up to the issue:
```/Users/mukesh.kumar/Library/Python/3.11/…
-
The provided datasets have four variants, each serving a specific purpose, and contain a `text_description` as described below E.g gov:
1. **syntheticDocQA_government_reports_test** – **No text_des…
-
### Bug
In case of tables where most of the columns are empty and one column is completely filled, the table that docling extracts truncates the filled column values.
### Steps to reproduce
I ha…
-
```
[](https://localhost:8080/#) in extract_data_from_pdf(pdf_path)
57 # Function to extract text using the unstructured library
58 def extract_data_from_pdf(pdf_path):
---> 59 eleme…
-
python 3.8
1.1 安装PaddlePaddle
paddlepaddle-gpu==3.0.0b2
1.2.1 通过whl包安装与运行
Windows
pip install PPOCRLabel # 安装
# 选择标签模式来启动
PPOCRLabel --lang ch # 启动【普通模式】,用于打【检测+识别】场景的标签
报错如下:
Traceback (most…
-
### Requested feature
Handling image with OCR the same way the PDF pipeline does.
What would i take to implement something like this ? Is this possible or not due to some reasons ? I can help wi…
-
I'm under the impression that, on running the provided js code, the output is either the result of OCR, or an interpretation on what is on the provided image. Is there a way to tell it which to do spe…
-
Hi
With Python 3.12 the class RapidTable missing this atributtes:
'get_boxes_recs',
'load_img',
'ocr_engine',
With Python 3.12
table_engine = RapidTable(model_path='./models/table/en_…
-
### SDK
Python
### Description
```python
import lancedb
import pyarrow.parquet as pq
# Connect to LanceDB
db = lancedb.connect("data/cases2.lance")
print(db.table_names())
data = pq.read_…
-
### Question
I just ran a test with my PDF document and found that an error was reported and the output document was empty. Here's a snippet of my code:
input_doc_path = "C:\\Users\\lenovo\\Down…