Unstructured-IO
/
unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
https://www.unstructured.io/
Apache License 2.0
9.21k
stars
764
forks
issues
Newest
chore: pin upper limit for unstructured-client
#3743
badGarnet
closed
1 month ago
0
bump requirements; especially inference <- Ingest test fixtures update
#3742
ryannikolaidis
closed
1 month ago
0
test: just upgrade others without bump inference
#3741
badGarnet
closed
1 month ago
0
feat: round numbers to reduce undeterministic behavior
#3740
badGarnet
closed
1 month ago
0
feat: LanceDB integration
#3739
PrashantDixit0
opened
1 month ago
2
rfctr(csv): minify HTML and table text is cct <- Ingest test fixtures update
#3738
ryannikolaidis
opened
1 month ago
0
rfctr(csv): minify HTML and table text is cct <- Ingest test fixtures update
#3737
ryannikolaidis
closed
1 month ago
0
rfctr(csv): minify HTML and table text is cct <- Ingest test fixtures update
#3736
ryannikolaidis
closed
1 month ago
0
bump requirements; especially inference <- Ingest test fixtures update
#3735
ryannikolaidis
closed
1 month ago
0
rfctr(pptx): minify HTML and table.text is cct
#3734
scanny
closed
1 month ago
0
rfctr(csv): minify HTML and table text is cct
#3733
scanny
closed
1 month ago
0
Add parsing HTML to unstructured elements
#3732
plutasnyy
closed
4 weeks ago
0
bug/unable to instantiate HuggingFaceEmbeddingEncoder from unstructured.embed.huggingface
#3731
mattseddon
opened
1 month ago
0
remove abstract decorator from initialize in BaseEmbeddingEncoder
#3730
mattseddon
opened
1 month ago
4
chore: pip-compile with python3.10
#3729
badGarnet
closed
1 month ago
0
Remove unsupported chipper model
#3728
vangheem
closed
1 month ago
1
bug/file timeout when partition
#3727
AustinZzx
opened
1 month ago
1
Minio + Unstructured + Weaviate
#3726
naelsen
opened
1 month ago
1
bug/Names of interface elements in text output after partition
#3725
SlawaLoev-KSO
opened
1 month ago
4
feat: expose retry params in partition via api
#3724
pawel-kmiecik
closed
1 month ago
0
rfctr(email): eml partitioner rewrite <- Ingest test fixtures update
#3723
ryannikolaidis
closed
1 month ago
0
build(deps): bump ruff from 0.4.10 to 0.6.9 in /requirements
#3722
dependabot[bot]
closed
1 month ago
1
Add password with PDF files
#3721
pprados
opened
1 month ago
4
feat: update elements merging order in pdf partition
#3720
christinestraub
closed
4 weeks ago
0
When parsing doc or docx files, can image tags be added?
#3719
sph116
opened
1 month ago
1
broken inference source code for 'hi_res', AttributeError: 'list' object has no attribute 'element_coords', the same code worked with previous versions of unstructured
#3718
Arslan-Mehmood1
opened
1 month ago
14
bug/partition_html removes HTML tables when parsing
#3717
deku0818
opened
1 month ago
1
Fix typing issue in inference_utils.py
#3716
cckolon
opened
1 month ago
1
fix(auto): quick fix for auto test failing in CI
#3715
scanny
closed
1 month ago
0
#3713 fix the wrong file path in README.md
#3714
shaofengshi
opened
1 month ago
0
bug/wong-example-file-path-in-readme: Get "No such file or directory" error by following the steps in Readme
#3713
shaofengshi
opened
1 month ago
2
build: Fix build reproducibility.
#3712
jsirois
opened
1 month ago
1
bump `unstructured-inference`
#3711
badGarnet
closed
1 month ago
0
feat/remove ingest code, use new dep for tests <- Ingest test fixtures update
#3710
ryannikolaidis
closed
1 month ago
0
build(release): release commit for 0.15.14
#3709
christinestraub
closed
1 month ago
1
bug/reading html file returns empty list
#3708
lwollenbergfuzzy
opened
1 month ago
6
bug/Extract ppt failed by api
#3707
JohnJyong
opened
1 month ago
4
feat/remove ingest code, use new dep for tests <- Ingest test fixtures update
#3706
ryannikolaidis
closed
1 month ago
0
feat/remove ingest code, use new dep for tests <- Ingest test fixtures update
#3705
ryannikolaidis
closed
1 month ago
0
bug/pypi-incomplete source on PyPI for unstructured (tested 0.5.x)
#3704
petrklus
opened
1 month ago
1
[DO NOT MERGE] Clone/DavidBlore - fix: add language to OCRAgentGoogleVision constructor
#3703
christinestraub
closed
1 month ago
0
Simplify Element type by use of Pydantic?
#3702
ctrahey
opened
1 month ago
1
rfctr(ppt): remove double-decoration
#3701
scanny
closed
1 month ago
0
add tests describing the behavior of set_element_hierarchy
#3700
Coniferish
closed
1 month ago
0
feat/<short-name>Writing back the unstructured extracted partitions to the same file format
#3699
SinaRanjkeshzade
closed
1 month ago
1
feat/option to load extraction models once instead of everytime partition pdf function called
#3698
hasansalimkanmaz
opened
1 month ago
2
bug/certain htmls cannot be parsed
#3697
AraiYuno
opened
1 month ago
5
fix: add `language` to `OCRAgentGoogleVision` constructor
#3696
DavidBlore
closed
1 month ago
1
DO NOT MERGE: CI test run only <- Ingest test fixtures update
#3695
ryannikolaidis
closed
1 month ago
0
rfctr(email): eml partitioner rewrite
#3694
scanny
closed
1 month ago
0