-
**Describe the bug**
DEPRECATION: textract 1.6.5 has a non-standard dependency specifier extract-msg
-
What technical hurdles prevent a Node version of the ODF reading and writing parts of WebODF? Is it possible to create an npm module (http://npmjs.org/) akin to https://www.npmjs.org/package/xlsx for …
-
It looks like the only way to capture the output of amazon-textract is to redirect it into a file, such as:
amazon-textract --input-document "s3://somebucket/2022-04-16-0010.jpg" --pretty-print LI…
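
For reference, a command's output can also be captured in memory rather than redirected to a file. The following is a minimal sketch using Python's standard `subprocess` module. It assumes the CLI writes its result to standard output (which the redirect above implies); `run_and_capture` is a hypothetical helper name, not part of amazon-textract.

```python
import subprocess
import sys


def run_and_capture(cmd: list[str]) -> str:
    """Run a command and return its standard output as text.

    Hypothetical helper for illustration; not part of any textract tool.
    """
    completed = subprocess.run(
        cmd,
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode the captured bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


# Demonstration with the Python interpreter itself, so the sketch runs anywhere.
demo_output = run_and_capture([sys.executable, "-c", "print('captured')"])
```

To use it with amazon-textract, pass the same arguments as on the command line, one list element per argument. Whether this fits depends on how the tool is being invoked in the first place.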
-
Currently, there appears to be no check for whether the file has actually changed before textract is rerun, so it probably reruns even if the user has only updated the title.
@gasman and I were disc…
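
One possible way to detect an unchanged file is to compare a content hash with a previously stored one. This is only a sketch under assumptions: `content_digest` and `needs_reextraction` are hypothetical names, and where the stored digest would live (a model field, a cache) is left open because the issue does not say.

```python
import hashlib
from typing import Optional


def content_digest(data: bytes) -> str:
    """Return a SHA-256 hex digest of the file contents."""
    return hashlib.sha256(data).hexdigest()


def needs_reextraction(data: bytes, stored_digest: Optional[str]) -> bool:
    """Return True when no digest is stored yet or the contents have changed.

    Hypothetical helper: metadata such as the title is not part of `data`,
    so a title-only edit leaves the digest unchanged.
    """
    return stored_digest is None or content_digest(data) != stored_digest
```

With this approach, the digest would be saved after each successful extraction and checked before the next run.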
-
[test.rtf.zip](https://github.com/deanmalmgren/textract/files/907377/test.rtf.zip)
```
res = textract.process("test.rtf").decode(encoding='UTF-8')
assert "æøå" in res
FAIL
```
```
textract.…
```
-
Hello, this issue seems very similar to #136, but I just can't make it work: the word and line order inside table cells is not preserved when invoking the `get_text` method.
The json attached is a r…
-
Hi everybody,
I just installed and tried this very interesting module.
I'm particularly interested in JPEG-to-text conversion, and it seems to work, but not as expected.
Trying these commands inside …
-
I want to extract text from a URL that points to a PDF file hosted on Firebase.
The issue occurs only with the URL; extraction works correctly with a local PDF file.
Here are the logs: …
-
## Dependencies
``` clojure
com.cognitect.aws/api {:mvn/version "0.8.539"}
com.cognitect.aws/endpoints {:mvn/version "1.1.12.110"}
com.cognitect.aws/textract {:mvn/…
```
-
Textract cannot be installed with the current version of Wheel (0.40.0) because https://github.com/pypa/wheel/issues/520 made it so that the `.*` suffix can only be used with the `==` or `!=` operators. I receive t…