Unstructured-IO/unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
https://www.unstructured.io/
Apache License 2.0
7.37k stars, 572 forks
issues
chore: bump unstructured-inference 0.7.36
#3275
christinestraub
closed
4 days ago
1
Feat/pass down strategy to partition ppt as well
#3274
badGarnet
closed
1 week ago
1
fix(auto): partition() passes strategy to PPTX,DOCX
#3273
scanny
closed
1 week ago
1
build: fix amd64 image hash
#3272
MthwRobinson
closed
1 week ago
0
feat/code-snippets-context
#3271
asm0dey
opened
1 week ago
2
fix: update base image SHA for amd64 wolfi
#3270
christinestraub
closed
1 week ago
0
WIP: Ml 89/od metrics
#3269
mariannaparzych
opened
1 week ago
0
build: switch arm64 image to wolfi-base
#3268
MthwRobinson
closed
1 week ago
4
Compatibility Issue with Chinese Text in Document Parsing
#3267
JIAQIA
opened
1 week ago
1
feat/migrate gdrive source connector <- Ingest test fixtures update
#3266
ryannikolaidis
closed
1 week ago
0
Is there a way to convert text files to markdown format ?
#3265
shamanez
closed
1 week ago
5
Roman/bugfix conflicting event loop ingest
#3264
rbiseck3
closed
4 days ago
0
Fix missing sensitive fields for embedders
#3263
vangheem
closed
4 days ago
0
bug/<tables getting cut off at the edges when using hi res strategy>
#3262
rchen19
opened
1 week ago
1
feat/migrate gdrive source connector <- Ingest test fixtures update
#3261
ryannikolaidis
closed
1 week ago
0
Connection Error
#3260
ishansuhail
opened
1 week ago
3
build: version bump for release 0.14.7
#3259
MthwRobinson
closed
1 week ago
0
Feat: Add-rc-locator-to-partition-excel
#3258
marctorsoc
opened
1 week ago
0
rfctr(html): prepare for new html parser
#3257
scanny
closed
1 week ago
1
BUG - PPTX doesn't recognize text within slide notes
#3256
veredmm
closed
1 week ago
2
bug(html): invisible links are reported in metadata
#3255
scanny
opened
1 week ago
0
chore: Add markdown table support to Table element constructor
#3254
oguzhan1907
closed
1 week ago
0
partition_pdf got TypeError: UnstructuredTableTransformerModel.predict() got an unexpected keyword argument 'result_format'
#3253
liyang79
opened
1 week ago
6
bug/partition-pdf-with-infer_table_structure
#3252
DeepKariaX
closed
3 days ago
12
null <- Ingest test fixtures update
#3251
ryannikolaidis
closed
1 week ago
0
bug - duplicates merged cell text following issue #2106
#3250
veredmm
opened
1 week ago
2
rfctr(html): replace html parser <- Ingest test fixtures update
#3249
ryannikolaidis
closed
1 week ago
0
feat(pptx): add coordinate metadata to PPTX elements
#3248
scanny
opened
1 week ago
0
bug(html): form and form controls are not ignored
#3247
scanny
opened
1 week ago
3
fix: fix `IndexError` when partitioning a pdf with `starting_page_number`
#3246
awalker4
closed
1 week ago
0
bug(html): distinct paragraphs within <li> are squashed into single element
#3245
scanny
opened
1 week ago
0
partition pdf, doc and pptx doesn't work for file bytes
#3244
sixftninja
closed
1 week ago
2
Process Attachments available via Paid API
#3243
jeremydiba
opened
1 week ago
0
feat/migrate gdrive source connector <- Ingest test fixtures update
#3242
ryannikolaidis
closed
1 week ago
0
GPU
#3241
Deh-alba
opened
1 week ago
0
fix: docker image publishing error <- Ingest test fixtures update
#3240
ryannikolaidis
closed
1 week ago
0
feat/migrate gdrive source connector
#3239
rbiseck3
closed
3 days ago
0
fix: docker image publishing error
#3238
christinestraub
closed
1 week ago
0
bug(html): empty <li> element produces ListItem with no text
#3237
scanny
opened
1 week ago
0
feat/include_location_for_xlsx_partition
#3236
marctorsoc
opened
1 week ago
0
feat/bbox_scaling_parameter
#3235
LesykDev
opened
1 week ago
2
feat: enhance analysis options with od model dump and better vis
#3234
pawel-kmiecik
closed
2 days ago
2
feat: expose converters deckerd -> html and back
#3233
pawel-kmiecik
closed
1 week ago
0
bug/right2left_pdf_output
#3232
DsDastgheib
opened
1 week ago
0
feat/Add page range to partition functions
#3231
ChiNoel-osu
opened
1 week ago
0
bug(html): source-formatting whitespace appears in link_texts metadata and Element text
#3230
scanny
opened
1 week ago
0
bug(html): nested lists are squashed
#3229
scanny
opened
1 week ago
0
bug(html): <div> with both text and phrasing child breaks element at phrasing child
#3228
scanny
opened
1 week ago
0
bug(html): <br/> element breaks paragraph/document-element
#3227
scanny
opened
1 week ago
0
null <- Ingest test fixtures update
#3226
ryannikolaidis
closed
1 week ago
0