getomni-ai / zerox

PDF to Markdown with vision models
https://getomni.ai/ocr-demo
MIT License
6.68k stars, 363 forks

Use uuid as file name #102

Closed ZeeshanZulfiqarAli closed 3 days ago

ZeeshanZulfiqarAli commented 3 days ago

If a file name is long, especially when the PDF has a Unicode file name that is URL-encoded, it can exceed the maximum file name length on Linux (and probably other OSes), which is 255 bytes on most filesystems.
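To make the arithmetic concrete, here is a small sketch of how URL encoding inflates a Unicode file name. The sample name and the helper `encodedNameLength` are hypothetical, chosen only for illustration; the 255 figure is the limit mentioned above.

```typescript
// Typical per-name limit on Linux filesystems, in bytes.
const NAME_MAX = 255;

// Hypothetical helper: length of a name after URL encoding.
// encodeURIComponent emits only ASCII, so string length equals byte length.
function encodedNameLength(name: string): number {
  return encodeURIComponent(name).length;
}

// Hypothetical file name: 40 CJK characters (U+5831) plus ".pdf".
const sampleName: string = "\u5831".repeat(40) + ".pdf";

// Each of these characters is 3 bytes in UTF-8 and becomes 9 ASCII
// characters ("%E5%A0%B1") once encoded: 40 * 9 + 4 = 364.
const encodedLength: number = encodedNameLength(sampleName);
console.log(encodedLength);            // 364
console.log(encodedLength > NAME_MAX); // true
```

A 44-character name ends up well past the limit once encoded, which is the case that triggers the error.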

This is fine in S3, where object keys can be up to 1,024 bytes long, but it results in ENAMETOOLONG errors when Zerox downloads the file and tries to write it to disk.

This PR prevents ENAMETOOLONG errors by using a UUIDv4 as the file name.
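A minimal sketch of the idea, not the actual diff in this PR: the helper name `localFileName`, the decision to keep the original extension, and the 16-character extension cap are all assumptions made for illustration.

```typescript
import { randomUUID } from "crypto";
import * as path from "path";

// Hypothetical helper (not Zerox's real code): build a short, fixed-length
// local file name from a UUIDv4 instead of the original (possibly very
// long) source name.
function localFileName(sourceName: string): string {
  // Assumption: keep the original extension so downstream tools can still
  // detect the file type. Drop it if it is implausibly long.
  const ext = path.extname(sourceName);
  const safeExt = ext.length <= 16 ? ext : "";
  // randomUUID() returns a 36-character RFC 4122 version 4 UUID.
  return `${randomUUID()}${safeExt}`;
}

// Usage: a URL-encoded name far beyond 255 bytes maps to a 40-character name.
const longName = encodeURIComponent("\u5831".repeat(40)) + ".pdf";
const shortName = localFileName(longName);
console.log(shortName.length); // 40
```

Because the generated name has a fixed length, it stays under the limit no matter how long the source name is. The original name would need to be kept separately if it is needed later.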

annapo23 commented 3 days ago

One small comment! Otherwise, LGTM! 🚀