Hello,
I am trying to use textract to extract text from docx files in an AWS Lambda function written in Python. The textract library is included in the package, as is its dependency, docx2txt. When I try to get the text out of the file, I still get an ExtensionNotSupported error stating that docx is not supported. I also tried putting the docx2txt library in the parsers folder, but that didn't help.
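To narrow this down, here is a minimal diagnostic sketch that could be run inside the Lambda. It rests on an assumption about textract's internals: that the docx parser lives in a module named textract.parsers.docx_parser, and that a failed import of that module is what gets reported as ExtensionNotSupported. The helper try_import and the handler name lambda_handler are hypothetical names chosen for illustration. Importing the modules directly should surface the underlying error, if there is one, instead of the generic message.

```python
import importlib
import traceback


def try_import(module_name):
    """Try to import module_name.

    Returns (True, None) on success, or (False, traceback_text) on failure.
    """
    try:
        importlib.import_module(module_name)
        return True, None
    except Exception:
        # Keep the full traceback so the real cause is visible in the result.
        return False, traceback.format_exc()


def lambda_handler(event, context):
    # Assumption: these are the module names involved in docx parsing.
    # Adjust them if the packaged textract version lays things out differently.
    report = {}
    for name in ("docx2txt", "textract.parsers.docx_parser"):
        ok, error = try_import(name)
        report[name] = "ok" if ok else error
    return report
```

If either entry in the returned report contains a traceback, that traceback (for example a missing module or a missing shared library) is likely the actual cause rather than the file extension itself.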
Using: