deanmalmgren / textract

extract text from any document. no muss. no fuss.
http://textract.readthedocs.io
MIT License

MS PowerPoint 2003 (PPT) not yet supported #235

Closed by tomrwillis 5 years ago

tomrwillis commented 6 years ago

I'm writing a Python script to scan for certain text in all flavors of MS Office documents, and I cannot find anything that reads the text out of older MS Office PPT files.

mikkhait commented 6 years ago

https://github.com/deanmalmgren/textract/issues/153

jpweytjens commented 5 years ago

Are you aware of any tool that can parse PowerPoint files?

jpweytjens commented 5 years ago

PPT files will be supported in the upcoming version, with LibreOffice as the parser.
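
Until that release, one possible workaround is to let LibreOffice convert the old `.ppt` file to `.pptx` first. The following is a minimal sketch, not textract's own implementation: it assumes LibreOffice is installed and its `soffice` binary is on `PATH`, and the helper names `build_convert_command` and `convert_ppt_to_pptx` are hypothetical, introduced only for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(ppt_path, out_dir, soffice="soffice"):
    """Return the LibreOffice command line that converts a .ppt file to .pptx.

    Hypothetical helper; "soffice" is assumed to be the LibreOffice binary name.
    """
    return [
        soffice,
        "--headless",            # run without opening a GUI
        "--convert-to", "pptx",  # target format
        "--outdir", str(out_dir),
        str(ppt_path),
    ]


def convert_ppt_to_pptx(ppt_path, out_dir):
    """Convert ppt_path to .pptx inside out_dir and return the expected output path."""
    if shutil.which("soffice") is None:
        raise RuntimeError("LibreOffice (soffice) was not found on PATH")
    # check=True raises CalledProcessError if LibreOffice exits with an error.
    subprocess.run(build_convert_command(ppt_path, out_dir), check=True)
    # LibreOffice keeps the base name and swaps the extension.
    return Path(out_dir) / (Path(ppt_path).stem + ".pptx")
```

The converted file could then be handed to `textract.process(...)`, since textract already reads `.pptx`.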