CatchTheTornado
/
pdf-extract-api
Document (PDF) extraction and parsing API using state-of-the-art OCRs and Ollama-supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown.
https://demo.doctractor.com
GNU General Public License v3.0
1.33k
stars
85
forks
issues
Newest
[feat] Check the QWEN-VL model as OCR provider
#44
pkarw
opened
22 hours ago
0
[feat] Add Python API client
#43
pkarw
opened
2 days ago
0
[fix] cache returned
#42
pkarw
closed
4 days ago
0
[feat] Test `pixtral` as an OCR strategy
#41
pkarw
opened
4 days ago
0
WiP: [feat] llama3.2_vision update
#40
pkarw
closed
4 days ago
0
[feat] #15 Add S3 storage strategy
#39
choinek
closed
4 days ago
2
Demo links + API client links
#38
pkarw
closed
1 week ago
0
[feat] support returning images
#37
pkarw
opened
1 week ago
0
[feat] online demo link
#36
pkarw
closed
1 week ago
0
Demo access
#35
pkarw
closed
1 week ago
0
Update README.md to remove that extra "`" in cloning .env codeblock
#34
hahouari
closed
1 week ago
2
Challenges with LLMs Not Respecting Provided Fields in JSON Outputs
#33
kreativitat
opened
1 week ago
4
Investigate and test pdf-extract-kit models
#32
pkarw
opened
1 week ago
0
[feat] #30 - new `/ocr/request` endpoint proposals and docs
#31
pkarw
closed
1 week ago
0
[feat] Add another endpoint where files can be sent via JSON body instead of form fields
#30
pkarw
opened
1 week ago
0
[feat] Add embedding + vector database support
#29
pkarw
opened
1 week ago
0
OCR task failed.
#28
PoleGeogry
closed
1 week ago
2
[feat] Test and add Llama 3.2-vision as OCR strategy
#27
pkarw
closed
4 days ago
2
[feat] ChatGPT, Claude and other LLM strategies support
#26
pkarw
opened
2 weeks ago
0
[docs] how to run app locally without docker
#25
pkarw
closed
2 weeks ago
0
[feat] Related to #23 - describe how to run the app natively to support Apple GPUs etc
#24
pkarw
closed
2 weeks ago
2
Use Local Ollama Instance Instead of Docker-Compose Instance
#23
madhankumar2211
closed
2 weeks ago
4
Pdf
#22
andriystetsik
closed
2 weeks ago
0
Docker deployment, works normally in FastAPI
#21
PoleGeogry
closed
2 weeks ago
0
Add MetaData LLM call
#20
chavan-arvind
opened
2 weeks ago
2
Add S3 storage strategy
#19
chavan-arvind
closed
2 weeks ago
0
Add S3 storage strategy
#18
chavan-arvind
closed
4 days ago
2
Bugfix to #11, #12, #13
#17
pkarw
closed
2 weeks ago
0
[feat] Add MetaData LLM call
#16
pkarw
opened
2 weeks ago
0
[feat] Add S3 storage strategy
#15
pkarw
closed
4 days ago
0
ollama healthcheck
#13
Marcelas751
closed
2 weeks ago
3
Pulling model from cli
#12
Marcelas751
closed
2 weeks ago
3
ocr curl
#11
Marcelas751
closed
2 weeks ago
3
Feat: #8 storage strategies - local file system + google drive
#10
pkarw
closed
1 week ago
0
Bugfix for #6 with CUDA - spawning the processes
#9
pkarw
closed
3 weeks ago
0
[feat] Add Storage module with different storage strategies
#8
pkarw
closed
1 week ago
2
Fix typo in README.md
#7
martwozniak
closed
3 weeks ago
0
Cannot re-initialize CUDA in forked subprocess
#6
Nasa1423
closed
2 weeks ago
5
[feat] Add support to `doc`, `rtf` and other formats
#5
pkarw
opened
3 weeks ago
0
[feat] Obsidian plugin
#4
pkarw
opened
3 weeks ago
1
[feat] Add support for `tabled`
#3
pkarw
opened
3 weeks ago
0
[feat] Add GPU support in `docker-compose`
#2
pkarw
closed
1 week ago
1
[feat] Support `sync` mode + streaming
#1
pkarw
opened
3 weeks ago
0