deanmalmgren/textract
extract text from any document. no muss. no fuss.
http://textract.readthedocs.io
MIT License
3.89k stars
599 forks
issues
Scheduled biweekly dependency update for week 27
#384
pyup-bot
closed
3 years ago
1
PDF decoder encoding issue. Found temporary fix.
#383
Vbansal21
closed
3 years ago
2
Scheduled biweekly dependency update for week 25
#382
pyup-bot
closed
3 years ago
2
Scheduled biweekly dependency update for week 23
#381
pyup-bot
closed
3 years ago
1
Scheduled biweekly dependency update for week 20
#380
pyup-bot
closed
3 years ago
1
fix: textract requirements
#379
marcus-campos
closed
3 years ago
0
Scheduled biweekly dependency update for week 18
#378
pyup-bot
closed
3 years ago
1
Scheduled biweekly dependency update for week 16
#377
pyup-bot
closed
3 years ago
1
Scheduled biweekly dependency update for week 14
#376
pyup-bot
closed
3 years ago
1
module 'six.moves' has no attribute 'collections_abc' for six==1.12.0
#375
Alex-apostolo
opened
3 years ago
0
The filename extensions .doc is not yet supported by textract
#374
harshu12345
opened
3 years ago
0
Truncated File error
#373
libgober
opened
3 years ago
1
Scheduled biweekly dependency update for week 11
#372
pyup-bot
closed
3 years ago
1
Update MacOS cask instruction
#371
shekhargulati
closed
3 years ago
0
Scheduled biweekly dependency update for week 09
#370
pyup-bot
closed
3 years ago
1
Scheduled biweekly dependency update for week 07
#369
pyup-bot
closed
3 years ago
1
update the beautifulsoup4 package version
#368
nagraj-edcast
closed
3 years ago
0
Text cut every 80 characters in .doc files
#367
vesran
opened
3 years ago
0
Scheduled biweekly dependency update for week 05
#366
pyup-bot
closed
3 years ago
1
Scheduled biweekly dependency update for week 03
#365
pyup-bot
closed
3 years ago
1
Scheduled biweekly dependency update for week 01
#364
pyup-bot
closed
3 years ago
1
Scheduled biweekly dependency update for week 51
#363
pyup-bot
closed
3 years ago
1
requirements/python - update dependency versions
#362
jedmonson
closed
3 years ago
1
Scheduled biweekly dependency update for week 49
#361
pyup-bot
closed
3 years ago
1
Extract text directly from file-object / file-content rather than using filename
#360
jrkkfst
opened
3 years ago
1
Scheduled biweekly dependency update for week 46
#359
pyup-bot
closed
3 years ago
1
Scheduled biweekly dependency update for week 44
#358
pyup-bot
closed
3 years ago
1
Scheduled biweekly dependency update for week 42
#357
pyup-bot
closed
3 years ago
1
Scheduled biweekly dependency update for week 40
#356
pyup-bot
closed
4 years ago
1
Scheduled biweekly dependency update for week 38
#355
pyup-bot
closed
4 years ago
1
Add options to minimize parsed html text
#354
aleks-v-k
opened
4 years ago
1
'ascii' codec can't decode byte 0xd0 in position 3: ordinal not in range(128)
#353
mapryl
opened
4 years ago
0
Scheduled biweekly dependency update for week 36
#352
pyup-bot
closed
4 years ago
1
AttributeError: module 'textract.parsers.docx_parser' has no attribute 'Parser'
#351
ShaileshSarda
opened
4 years ago
0
Is textract dead or alive (under active development)?
#350
MartinThoma
closed
4 years ago
2
Scheduled biweekly dependency update for week 33
#349
pyup-bot
closed
4 years ago
1
Add installation notes for FreeBSD
#348
sr-rolando
closed
3 years ago
0
Scheduled biweekly dependency update for week 31
#347
pyup-bot
closed
4 years ago
1
Scheduled biweekly dependency update for week 29
#346
pyup-bot
closed
4 years ago
1
Text is processed out of order with pdfminer
#345
samayer12
opened
4 years ago
0
Scheduled biweekly dependency update for week 27
#344
pyup-bot
closed
4 years ago
1
Extract text from encrypted files with key
#343
Sam-Gracy
opened
4 years ago
0
unsupported operand type(s) for +: 'NoneType' and 'bytes'
#342
tanguy-a
opened
4 years ago
0
Scheduled biweekly dependency update for week 24
#341
pyup-bot
closed
4 years ago
1
I think textract should also support Latex files. I hope this will be possible in the future!
#340
joebaumann
closed
4 years ago
1
Scheduled biweekly dependency update for week 22
#339
pyup-bot
closed
4 years ago
1
Textract fails on this specific attached page with a UnicodeDecodeError
#338
traverseda
opened
4 years ago
0
UnicodeDecodeErrors
#337
0x4A42
opened
4 years ago
1
Improve error feedback of importing exceptions
#336
wajdikhattel
opened
4 years ago
1
Fix exception error message
#335
wajdikhattel
opened
4 years ago
0