yobix-ai extractous issues

yobix-ai / extractous

Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.

Apache License 2.0

411 stars, 17 forks

issues

markdown support

#37 peterlyz opened 4 hours ago
0
Fix anchor links in README.md

#36 yutannihilation opened 1 day ago
0
Support for Extracting PDF Content as XML

#35 coroluca opened 2 days ago
1
use it in multiple processes.

#34 ljhssga opened 2 days ago
1
Failed Extraction - cmap font missing

#33 s4zuk3 opened 5 days ago
2
Failed extraction - Class CTTextCharacterProperties is missing.

#32 s4zuk3 opened 5 days ago
2
Installation not working - Windows 11/Python3.10

#31 IneffableBunch opened 6 days ago
1
Change in PDF Extraction Results

#30 TheTechromancer closed 6 days ago
3
change extractor api to return tuple of result and metadata

#29 nmammeri closed 1 week ago
0
fix: update tika_native dir in build folder

#28 KapiWow closed 1 day ago
0
25 make reflection data platform specific

#27 nmammeri closed 1 week ago
0
Feature/3 return tika metadata

#26 s4zuk3 closed 1 week ago
3
make reflection data platform specific

#25 nmammeri closed 1 week ago
0
Tika Metadata - HashMap Issue

#24 s4zuk3 closed 1 week ago
6
Stall when extracting using ocr on macos from pdf with embedded images

#23 nmammeri opened 2 weeks ago
1
7 implement extracting from an array of bytes

#22 nmammeri closed 1 week ago
0
Draft: Implement extracting from an array of bytes

#21 KapiWow closed 2 weeks ago
1
Test Multiple Python Versions (+3.13 Support)

#20 TheTechromancer opened 2 weeks ago
7
18 ocr examples and docs

#19 nmammeri closed 3 weeks ago
0
ocr examples and docs

#18 nmammeri closed 2 weeks ago
0
fix: fixed issue 16 and added test case

#17 nmammeri closed 3 weeks ago
0
TypeError: ParseError("Parse error occurred : TIKA-198: Illegal IOException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser@281b1a01")

#16 NourEldin-Osama closed 3 weeks ago
0
1 add microsoft windows support

#15 nmammeri closed 3 weeks ago
0
Draft: windows support

#14 KapiWow closed 3 weeks ago
0
failed to install in windows 11

#13 NourEldin-Osama closed 3 weeks ago
9
build: don't rebuild graalvm libs if were built before

#12 nmammeri closed 1 month ago
0
ISSUE#3: Implemented Tika Metadata

#11 s4zuk3 closed 2 weeks ago
5
PyPI package is huge

#10 chrisgoddard closed 1 month ago
2
make the build script faster

#9 nmammeri closed 1 month ago
0
tests: Tests with different file formats

#8 KapiWow closed 1 month ago
1
Implement extracting from an array of bytes

#7 nmammeri closed 1 week ago
0
Extracting text from a specific page of the document

#6 bm777 closed 1 month ago
4
Improve extract to stream performance

#5 nmammeri opened 2 months ago
0
Add detect file type API

#4 nmammeri opened 2 months ago
0
Return Metadata with extraction result

#3 nmammeri closed 1 week ago
0
Add tests with different file formats

#2 nmammeri closed 1 month ago
0
Add Microsoft Windows support

#1 nmammeri closed 3 weeks ago
0