Unstructured-IO/unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
https://www.unstructured.io/
Apache License 2.0
9.2k stars, 764 forks
issues
chore: remove dev and release as 0.16.6
#3793
ryannikolaidis
closed
5 hours ago
0
Add timeout to scarf telemetry requests
#3792
Rojuinex
opened
7 hours ago
0
Telemetry request has no configured timeout
#3791
Rojuinex
opened
7 hours ago
0
fix(filetype): handle missing libmagic library
#3790
metadaddy
opened
9 hours ago
0
bug/<short-name>Unstructured Partition PDF , tesseract ERROR!!!
#3789
suhaif314
opened
16 hours ago
0
feat/Sharepoint Connector support for Entra-Id and Graph APIs
#3788
kkr2320
opened
1 day ago
0
bug/Enable a (global) way to set PIL.Image.MAX_IMAGE_PIXELS
#3787
cwang
opened
1 day ago
0
Prefer using provided filename over detection from file.name
#3786
framp
opened
2 days ago
0
bug/'General' object has no attribute 'partition_async'
#3785
alexander-zuev
closed
3 days ago
0
Define default HTML to ontology mapping
#3784
plutasnyy
closed
1 day ago
0
Bug: Neither hi_res nor fast partition strategies classify section title as Title
#3783
kubni
opened
3 days ago
6
Set <table> to be ontology.Table not UncategorizedText
#3782
plutasnyy
closed
6 days ago
0
bug/check for magic library availability doesn't appear to be correct
#3781
aaronsteers
opened
1 week ago
3
update scarf_analytics() GET request with timeouts
#3780
garyfanhku
opened
1 week ago
0
Add text as html to orig elements chunks
#3779
plutasnyy
closed
1 day ago
0
feat: repeat row headings in each table chunk
#3778
hardchor
opened
1 week ago
2
fixed pdf path error.
#3777
mzdz
opened
1 week ago
0
DEVX-722 : Support for multimodal loader using pytesseract
#3776
mogith-pn
closed
1 week ago
0
chore: remove dev and release as 0.16.5
#3775
badGarnet
closed
2 weeks ago
0
Fix extracting value from field
#3774
plutasnyy
closed
2 weeks ago
0
Add max recursion limit and fix to_text() method
#3773
plutasnyy
closed
2 weeks ago
0
bug/<short-name>
#3772
Minh7-byte
opened
2 weeks ago
0
build(deps): bump ruff from 0.4.10 to 0.7.2 in /requirements
#3771
dependabot[bot]
opened
2 weeks ago
0
build(deps): bump tqdm from 4.66.5 to 4.66.6 in /requirements
#3770
dependabot[bot]
opened
2 weeks ago
0
build(deps): bump anchore/scan-action from 3 to 5
#3769
dependabot[bot]
opened
2 weeks ago
0
build(deps): update botocore requirement from <1.34.132 to <1.35.54 in /requirements
#3768
dependabot[bot]
opened
2 weeks ago
0
build(deps): bump paddlepaddle from 3.0.0b1 to 3.0.0b2 in /requirements
#3767
dependabot[bot]
opened
2 weeks ago
0
feat/config for codecov
#3766
raiderrobert
closed
3 weeks ago
0
feat: support pdf link extraction in hi_res strategy <- Ingest test fixtures update
#3765
ryannikolaidis
closed
3 weeks ago
0
chore: pin unstructured-ingest
#3764
ryannikolaidis
closed
3 weeks ago
0
legacy office doc type conversion is not thread-safe in a container setup with Rocky Linux (potentially in general)
#3763
cwang
opened
3 weeks ago
5
Ml 415/merge inline elements <- Ingest test fixtures update
#3762
ryannikolaidis
closed
3 weeks ago
0
feat: support pdf link extraction in hi_res strategy <- Ingest test fixtures update
#3761
ryannikolaidis
closed
3 weeks ago
0
feat: support pdf link extraction in hi_res strategy <- Ingest test fixtures update
#3760
ryannikolaidis
closed
3 weeks ago
0
BadZipFile error when ran on AWS lambda
#3759
pastram-i
opened
3 weeks ago
5
ML-405/ML-427 - OntologyElement improvements
#3758
MaksOpp
closed
3 weeks ago
0
build(deps): bump ruff from 0.4.10 to 0.7.1 in /requirements
#3757
dependabot[bot]
closed
2 weeks ago
1
bug/execution gets stuck
#3756
jjovalle99
opened
3 weeks ago
1
release: version 0.16.3
#3755
tbs17
closed
3 weeks ago
0
Fix layout parsing
#3754
plutasnyy
closed
3 weeks ago
0
feat: support pdf link extraction in hi_res strategy
#3753
christinestraub
closed
3 weeks ago
0
Fix when parent id is none for first element in v2 notion:
#3752
plutasnyy
closed
3 weeks ago
0
Fix when parent id is none for first element in v2 notion:
#3751
plutasnyy
closed
4 weeks ago
0
fix: update python3.11 everywhere
#3750
yuming-long
closed
4 weeks ago
0
Ml 415/merge inline elements
#3749
plutasnyy
closed
3 weeks ago
1
set version 0.16.2
#3748
mariannaparzych
closed
4 weeks ago
0
Ml 384/whitespaces in cct
#3747
mariannaparzych
closed
4 weeks ago
0
fix: fix partition_via_api retry mechanism when the default SDK's retry config is empty.
#3746
pawel-kmiecik
closed
4 weeks ago
0
Set version to 0.16.1
#3745
plutasnyy
closed
4 weeks ago
0
build(deps): bump ruff from 0.4.10 to 0.7.0 in /requirements
#3744
dependabot[bot]
closed
3 weeks ago
1