deanmalmgren/textract
extract text from any document. no muss. no fuss.
http://textract.readthedocs.io
MIT License
3.89k stars
599 forks
issues
Scheduled biweekly dependency update for week 20
#334
pyup-bot
closed
4 years ago
1
The extension .pptx is not supported. I know ppt is not but why .pptx? In the available extensions it shows .pptx but it is not working.
#333
anirudhpnbb
closed
4 years ago
1
Scheduled biweekly dependency update for week 18
#332
pyup-bot
closed
4 years ago
1
Scheduled biweekly dependency update for week 16
#331
pyup-bot
closed
4 years ago
1
Different result compared to when extracting directly with pdftotext
#330
filipopo
opened
4 years ago
0
Scheduled biweekly dependency update for week 11
#329
pyup-bot
closed
4 years ago
1
Scheduled biweekly dependency update for week 09
#328
pyup-bot
closed
4 years ago
1
epub parser: separate text blocks of logical elements by "Form Feed"
#327
workflowsguy
opened
4 years ago
0
Fix simple typo: undesireable -> undesirable
#326
timgates42
closed
3 years ago
0
Fix simple typo: undesireable -> undesirable
#325
timgates42
closed
3 years ago
0
requirements/python - pdfminer.six dependency update
#324
xchek
opened
4 years ago
6
Fix parsing of ePubs, which sometimes fails because item content inside the ePub file is empty
#323
sr-rolando
closed
3 years ago
0
Scheduled biweekly dependency update for week 07
#322
pyup-bot
closed
4 years ago
1
Pdfminer and Tesseract not found
#321
ObitoSigma
opened
4 years ago
3
Scheduled biweekly dependency update for week 05
#320
pyup-bot
closed
4 years ago
1
Extract text from image "ticket"
#319
hanane2019
closed
4 years ago
1
Unable to extract table from pdf in right format
#318
BVSREDDY82
closed
4 years ago
4
Is there any chance to install this in AWS Lambda?
#317
adantart
opened
4 years ago
4
recursion error
#316
baditaflorin
opened
4 years ago
1
Support kwargs for external parsers
#315
MezentsevIlya
closed
4 years ago
3
Add kwargs processing for xlsx parser
#314
MezentsevIlya
closed
4 years ago
1
command line interface is broken on windows
#313
KamarajuKusumanchi
opened
4 years ago
5
Scheduled biweekly dependency update for week 44
#312
pyup-bot
closed
4 years ago
1
ShellError: The command `pdftotext علی.pdf -` failed with exit code 1
#311
Keramatfar
closed
4 years ago
1
Scheduled biweekly dependency update for week 42
#310
pyup-bot
closed
4 years ago
1
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 40
#309
ghost
opened
5 years ago
5
.ods support
#308
quinten-goens
opened
5 years ago
2
Missing carriage returns in PDF
#307
Overdrivr
opened
5 years ago
5
Scheduled biweekly dependency update for week 40
#306
pyup-bot
closed
5 years ago
1
numbers ignored
#305
sylwiaoz
closed
5 years ago
9
Letters with diacritic left out.
#304
AdreamCZ
opened
5 years ago
8
Scheduled biweekly dependency update for week 37
#303
pyup-bot
closed
5 years ago
1
Scheduled biweekly dependency update for week 35
#302
pyup-bot
closed
5 years ago
1
Hi everyone, can't install textract in Jupyter, Anaconda, Windows 10
#301
Schrodinger-cat-kz
closed
5 years ago
17
Extract information from bytes
#300
asciidiego
opened
5 years ago
6
Scheduled biweekly dependency update for week 33
#299
pyup-bot
closed
5 years ago
2
not able to extract text from file using python package
#298
swamyaddala
closed
5 years ago
7
The command `antiword` failed because the executable `antiword` is not installed on your system
#297
amjadparacha
closed
5 years ago
6
Sphinx supports python 3.5+
#296
jpweytjens
closed
5 years ago
0
pocketsphinx installation error on windows
#295
kapilg1997
closed
5 years ago
2
Release 1.6.2 on PyPI
#294
kennell
closed
5 years ago
1
create new console with no new window if in win32
#293
Pandaaaa906
opened
5 years ago
1
update requirements
#292
jpweytjens
closed
5 years ago
0
Cannot install textract from pypi
#291
sharath-psh
closed
5 years ago
2
Scheduled biweekly dependency update for week 26
#290
pyup-bot
closed
5 years ago
0
Is textract still being maintained?
#289
bruot
closed
5 years ago
4
Passing additional arguments to underlying library (e.g. antiword)
#288
marcelo-dalmeida
opened
5 years ago
0
Scheduled biweekly dependency update for week 24
#287
pyup-bot
closed
5 years ago
1
Update req
#286
jpweytjens
closed
5 years ago
0
Change error mode to "ignore" and decrease reliance on chardet.detect() in decoding
#285
mevers303
closed
3 years ago
5