Unstructured-IO/unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
https://www.unstructured.io/
Apache License 2.0
7.4k stars
573 forks
issues
bug/Error for large pdf files
#3329
magallardo
opened
7 minutes ago
0
fix: update slack test to point to new channel
#3328
potter-potter
opened
1 hour ago
0
bugfix/isolate ingest v2 dependencies
#3327
rbiseck3
opened
2 hours ago
0
CPU only installation
#3326
arthurbrenno
opened
1 day ago
1
bug/Two Column PDF partition result in incorrect text.
#3325
pfcharles
opened
2 days ago
2
bug/pip install failure for `"unstructured[all-docs]"`
#3324
jacoblee93
opened
2 days ago
1
rfctr: Implement SQL V2 Dest Connector
#3323
vangheem
opened
2 days ago
0
docs: add link to serverless api in readme
#3322
MthwRobinson
closed
4 hours ago
0
feat/migrate sharepoint src <- Ingest test fixtures update
#3321
ryannikolaidis
closed
3 days ago
0
feat/singlestore dest connector
#3320
rbiseck3
opened
3 days ago
1
bug/missing `psutil` in 0.14.9
#3319
anakin87
opened
3 days ago
0
Asynchronous Compatibility of File Loader
#3318
haoshan98
opened
3 days ago
0
fix(docx): refine file-not-found vs not-DOCX
#3317
scanny
closed
17 hours ago
1
feat/migrate sharepoint src <- Ingest test fixtures update
#3316
ryannikolaidis
closed
3 days ago
0
Fix blocking async interfaces for v2 ingest framework
#3315
vangheem
opened
3 days ago
1
feat/migrate sharepoint src
#3314
rbiseck3
opened
3 days ago
1
rfctr: implement mongodb v2 destination connector
#3313
vangheem
opened
3 days ago
2
release: version 0.14.9
#3312
MthwRobinson
closed
3 days ago
0
feat: add v2 azure cognitive search destination connector
#3311
ahmetmeleq
opened
3 days ago
0
build: image and dependency updates; fix tesseract files locations
#3310
MthwRobinson
opened
3 days ago
3
bug/quotes from markdown are stripped out
#3309
gaspardpetit
opened
4 days ago
4
build(deps): bump langchain-community version <- Ingest test fixtures update
#3308
ryannikolaidis
closed
4 days ago
0
feat/migrate onedrive src <- Ingest test fixtures update
#3307
ryannikolaidis
closed
4 days ago
0
feat(docx): differentiate no-file from not-ZIP
#3306
scanny
closed
4 days ago
0
build(deps): bump langchain-community version
#3305
MthwRobinson
closed
4 days ago
0
feat: add Astra source connector
#3304
potter-potter
opened
4 days ago
0
Bugfix/ingest pipeline check
#3303
rbiseck3
closed
3 days ago
0
rfctr [P6M-397]: opensearch source connector v2
#3302
potter-potter
opened
5 days ago
0
feat/more conservative ingest logging
#3301
rbiseck3
closed
4 days ago
0
Fix not counting false negatives and false positives in table metrics
#3300
plutasnyy
opened
5 days ago
1
feat/extract_pdf_page_images
#3299
huanji1987
opened
5 days ago
0
revert unstructured-client pin and make pip-compile
#3298
Coniferish
opened
5 days ago
0
feat/migrate onedrive src <- Ingest test fixtures update
#3297
ryannikolaidis
closed
5 days ago
0
build: move numpy pin to packaging
#3296
qued
closed
5 days ago
0
feat/migrate onedrive src
#3295
rbiseck3
closed
4 days ago
2
feat/migrate astra db
#3294
rbiseck3
closed
5 days ago
1
rfct [P6M]-392: OpenSearch V2 Destination Connector
#3293
potter-potter
closed
2 days ago
0
Couchbase vector store support as destination and source connector
#3292
lokesh-couchbase
opened
6 days ago
1
File parsing CPU cores
#3291
shuaihutianxie
opened
6 days ago
0
bug/docker_tesseract_missing
#3290
neilkumar
opened
6 days ago
6
bug/poetry-in-dockerfile
#3289
MattiaCinelli
opened
6 days ago
1
Clean up warning table transformer warning statements
#3288
magallardo
opened
6 days ago
7
fix: wait to run soffice until there is no other soffice process running
#3287
badGarnet
closed
5 days ago
0
feat: add v2 pinecone destination connector
#3286
ahmetmeleq
opened
6 days ago
4
feat/migrate gdrive source connector <- Ingest test fixtures update
#3285
ryannikolaidis
closed
6 days ago
0
bug/<Ingestion error to process attachments for .msg files>
#3284
mahmoudaymo
closed
5 days ago
5
bug/<short-name>
#3283
rs-03
closed
6 days ago
2
fix: add arch into build images
#3282
MthwRobinson
closed
1 week ago
1
Return image data from confluence
#3281
ML-Abdula
opened
1 week ago
4
List block in a partitioned Markdown doc identified as a `Title` element under special conditions
#3280
nickphilip
opened
1 week ago
1