Unstructured-IO / unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
https://www.unstructured.io/
Apache License 2.0

bug/Enable a (global) way to set PIL.Image.MAX_IMAGE_PIXELS #3787

Open cwang opened 2 days ago

cwang commented 2 days ago

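Describe the bug PIL has its own default pixel limit to prevent what they call a "decompression bomb" DoS attack. As it stands now, opening an image with more than 178956970 pixels (twice the default `PIL.Image.MAX_IMAGE_PIXELS`) raises `DecompressionBombError`.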
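For reference, here is where that number comes from, as far as I can tell from Pillow's `Image.py`: the default `MAX_IMAGE_PIXELS` is derived from a 1 GiB budget, a warning is emitted above that value, and the error is raised above twice that value. The variable names below are only for illustration.

```python
# Default limit as defined in Pillow's Image module (to my knowledge):
# roughly a 1 GiB buffer divided by 4, then by 3.
default_max_image_pixels = int(1024 * 1024 * 1024 // 4 // 3)

# Above the default, Pillow emits DecompressionBombWarning;
# above twice the default, it raises DecompressionBombError.
error_threshold = 2 * default_max_image_pixels

print(default_max_image_pixels)  # 89478485
print(error_threshold)           # 178956970
```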

To Reproduce Use a very large image whose pixel count exceeds that limit

Expected behavior As a library, unstructured could expose a new environment variable that lets downstream users set the maximum pixel count. The library would read that value and assign it to `PIL.Image.MAX_IMAGE_PIXELS`.
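A minimal sketch of what this could look like, not an existing unstructured API. The variable name `UNSTRUCTURED_MAX_IMAGE_PIXELS` and both helper functions are hypothetical; the only real API assumed is the `PIL.Image.MAX_IMAGE_PIXELS` module attribute.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable name, chosen for illustration only.
ENV_VAR = "UNSTRUCTURED_MAX_IMAGE_PIXELS"


def resolve_max_image_pixels(env: Mapping[str, str] = os.environ) -> Optional[int]:
    """Return the configured limit, or None when the variable is unset or empty."""
    raw = env.get(ENV_VAR, "").strip()
    if not raw:
        return None  # keep Pillow's default
    value = int(raw)  # raises ValueError on non-numeric input
    if value <= 0:
        raise ValueError(f"{ENV_VAR} must be a positive integer, got {raw!r}")
    return value


def apply_max_image_pixels(env: Mapping[str, str] = os.environ) -> None:
    """Override Pillow's limit only when the variable is set."""
    limit = resolve_max_image_pixels(env)
    if limit is not None:
        from PIL import Image  # imported lazily so parsing works without Pillow

        Image.MAX_IMAGE_PIXELS = limit
```

The library would call `apply_max_image_pixels()` once before opening any image, and a downstream user would set something like `UNSTRUCTURED_MAX_IMAGE_PIXELS=500000000` in the environment. Leaving the variable unset keeps the current protection in place, so raising the limit stays an explicit opt-in.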

Screenshots N/A

Environment Info N/A

Additional context

Maybe related to https://github.com/Unstructured-IO/unstructured/issues/3329