nlmatics/nlm-ingestor
This repo provides the server-side code that the llmsherpa API connects to. It includes parsers for various file formats.
https://www.nlmatics.com
Apache License 2.0
912 stars
111 forks
issues
llm sherpa deployed in eks cluster with 4vcpu and 16gb ram not working properly
#74
gireesh99
opened
2 days ago
1
Update version in setup.py, and enable unpinning version of pandas.
#73
michaelfeil
opened
2 weeks ago
0
KeyError: 'style'
#72
RaphSte
opened
3 weeks ago
13
Update docker-publish.yml
#71
ansukla
closed
3 weeks ago
0
Update to nlm 2.9.2 v2
#70
jamesvillarrubia
opened
3 weeks ago
1
Docker expose port needs to be corrected
#69
kjoth
opened
3 weeks ago
0
Updates tika jar to 2.9.2
#68
jamesvillarrubia
closed
3 weeks ago
2
Connection error with Docker run
#67
kjoth
opened
3 weeks ago
0
BBOX information
#66
TheMrguiller
opened
1 month ago
0
Latest docker image not working locally
#65
mvennela
opened
1 month ago
1
Download encodings during build to run Docker image offline
#64
mgl
closed
3 weeks ago
0
Missing Tests?
#63
ramarnat
opened
1 month ago
0
Add the default config file for tika
#62
kiran-nlmatics
closed
1 month ago
0
Receiving an error 'urllib3.exceptions.LocationValueError: No host specified.'
#61
anirudh-gapblue
opened
1 month ago
0
Expose port
#60
jinkjonks
closed
3 weeks ago
0
Health checks fail because port 5001 not exposed by default
#59
jinkjonks
closed
3 weeks ago
0
How to deploy this thing in production building image with the docker file giving error.
#58
aman-vink
closed
2 months ago
1
Is it possible to run this fully local, so sensitive PII PDFs dont leave the network?
#57
AIMads
opened
2 months ago
1
For anyone hoping to deploy this as a lambda
#56
dgonier
opened
2 months ago
1
Lost pages
#55
sailxjx
opened
2 months ago
0
Correct the BBOX for table blocks
#54
kiran-nlmatics
closed
2 months ago
0
fix: bbox error in block renderer
#53
livelxw
closed
2 months ago
0
bbox error in BlockRender
#52
livelxw
closed
2 months ago
2
Trivially small chunks returned
#51
thelazydogsback
opened
2 months ago
0
&applyOcr=yes - no OCR taking place (skipping image pages)
#50
thelazydogsback
opened
2 months ago
3
docker image is not producing any result
#49
craldaz
opened
2 months ago
1
Bug
#48
aman-vink
opened
2 months ago
4
Update __main__.py
#47
moveyor
closed
2 months ago
0
Dependency versions too strict
#46
choyuansu
opened
2 months ago
0
Issue with finding tables and sections
#45
Aviral-tech
opened
3 months ago
0
Error when parsing a PDF
#44
kaulshashank
opened
3 months ago
2
Fix table left KeyError
#43
jinhy-sequoiacap
closed
3 months ago
1
Question: Is it possible to retrieve the pdf position (bbox) for table rows
#42
janwbouma
opened
3 months ago
0
box_style not being taken into account
#41
mikecook69
opened
3 months ago
2
made changes to integrate with indexer
#40
ansukla
closed
3 months ago
0
nlm-ingestor is SUPER SLOW
#39
pashpashpash
opened
3 months ago
4
Disable rules/paranthesized header
#38
mikecook69
opened
3 months ago
0
Suggestions for Fast Production Server
#37
yashpatel21
opened
3 months ago
5
[PDF Ingestor] make sure key idx within the range of sorted freq keys
#36
baobo5625
closed
3 months ago
0
Can you provide guidance on when page_idx wouldn't be available?
#35
chrismaresca
opened
3 months ago
0
Unable to finish setup of nlm-ingestor due to missing distutils module
#34
lukenas
opened
3 months ago
0
Encoding error with non-ASCII character.
#33
jamesvillarrubia
opened
3 months ago
2
PDF extraction
#32
Amy-raj
opened
3 months ago
1
Docker file available for hosting into lambda as container?
#31
akayalEC
opened
4 months ago
0
Not able to install nlm_ingestor
#30
sli701
opened
4 months ago
3
memory leaks
#29
ZengJin123
closed
2 months ago
1
.pages files are chunked correctly but page_idx is always 0
#28
pashpashpash
opened
4 months ago
0
.pptx files are correctly chunked, but page_idx is always 0
#27
pashpashpash
opened
4 months ago
0
.doc files are correctly chunked, but page_idx is always 0
#26
pashpashpash
opened
4 months ago
0
HTML AND XML INGESTOR
#25
drewskidang
opened
4 months ago
1