Closed by TheTechromancer 1 week ago
Thanks, @TheTechromancer, for reporting this. In version 0.2.0, we changed the API to return a tuple of reader and metadata. Add this to your extract call: reader, metadata = extractor.extract_ ... Please see the updated docs.
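For anyone who needs to support both the old and new return shapes during an upgrade, here is a minimal sketch, assuming the extract call returned only a reader before 0.2.0 and returns a two-item (reader, metadata) tuple from 0.2.0 on, as described above. The helper `unpack_extract_result` and the stubs `old_extract` and `new_extract` are hypothetical stand-ins for illustration, not part of the library's API:

```python
import io


def unpack_extract_result(result):
    """Normalize an extract call's return value to (reader, metadata).

    Assumption: older versions return just a reader, newer versions
    return a (reader, metadata) tuple.
    """
    if isinstance(result, tuple) and len(result) == 2:
        return result
    # Old-style return: no metadata available, so use an empty dict.
    return result, {}


# Hypothetical stand-ins for the library's extract call (not the real API).
def old_extract(path):
    return io.BytesIO(b"extracted text")


def new_extract(path):
    return io.BytesIO(b"extracted text"), {"Content-Type": "application/pdf"}


for extract in (old_extract, new_extract):
    reader, metadata = unpack_extract_result(extract("sample.pdf"))
    text = reader.read().decode("utf-8")
    print(text, metadata)
```

Calling code that passes the whole return value on as if it were a reader will end up handling the tuple instead, so unpacking it explicitly is the safer pattern.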
Thanks, yeah, we were able to fix it. Is there a chance there will be another breaking API change without a major version increase? If so, going forward we can pin the version on our side.
I don't see any breaking changes coming up, but you can pin your version.
Hi, today I noticed a sudden change in the way text is extracted from PDFs. It seems like a lot of the binary content is being included. This is causing our tests to fail:
We've been able to resolve this quickly on our end by downgrading the package version, but we just wanted to give you guys a heads-up.
EDIT: On further investigation, it looks like a change in the Python API caused the issue: