Unstructured-IO / unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
https://www.unstructured.io/
Apache License 2.0
9.25k stars 767 forks

Add password with PDF files #3721

Open pprados opened 1 month ago

pprados commented 1 month ago

Add password support for PDF files. Must be combined with PR 392 in unstructured-inference.

pprados commented 1 month ago

@plutasnyy can you start the review process?

plutasnyy commented 1 month ago

Sure! Could you add some unit tests? Ideally with a password-protected PDF that cannot be extracted by the current code on main.

Please also bump the __version__ and update changelog.md. You can see examples in other PRs, for example https://github.com/Unstructured-IO/unstructured/pull/3734/files#diff-06572a96a58dc510037d5efa622f9bec8519bc1beab13c9f251e97e657a9d4edR1

pprados commented 1 month ago

@plutasnyy Done. I added unit tests.

pprados commented 1 month ago

@plutasnyy can you test again?

pprados commented 18 hours ago

@plutasnyy, can you review the code?