bigscience-workshop / petals

🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
https://petals.dev
MIT License

Fix file locks in NFS-mounted directories #517

Closed. tonywang16 closed this 1 year ago

tonywang16 commented 1 year ago

Fixes #515: unable to acquire a shared lock on an NFS-mounted directory.

borzunov commented 1 year ago

Hi @tonywang16,

Thanks for this bug fix! I'm curious: how did you come up with the wb+ solution?

Also, can you please change the lock file mode to wb+ in throughput.py too?

tonywang16 commented 1 year ago

Sure, I've updated throughput.py as well.

The issue is that the lock file is opened in "wb" mode (write-only) on the NFS mount.

The code then tries to acquire a shared lock (LOCK_SH), commonly called a reader lock, on a file descriptor that has no read permission. Adding "+" opens the file for both reading and writing, which allows both types of locks on the same file descriptor.

BTW, the exclusive lock (fcntl.LOCK_EX) has no issue on the NFS mount.
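
For readers following along, here is a minimal sketch of the difference between the two modes. It is not the Petals code: the helper names (`access_mode`, `can_lock`, `demo`) and the `demo.lock` file name are made up for illustration, and it assumes a Unix-like system where Python's `fcntl` module is available. It shows that "wb" yields a write-only descriptor while "wb+" yields a read-write one, and then takes both lock types on the "wb+" descriptor.

```python
import fcntl
import os
import tempfile


def access_mode(path, mode):
    """Open `path` with `mode` and return the descriptor's access mode flag."""
    with open(path, mode) as f:
        flags = fcntl.fcntl(f.fileno(), fcntl.F_GETFL)
        return flags & os.O_ACCMODE


def can_lock(lock_file, operation):
    """Try a non-blocking flock; release it and report whether it succeeded."""
    try:
        fcntl.flock(lock_file.fileno(), operation | fcntl.LOCK_NB)
    except OSError:
        return False
    fcntl.flock(lock_file.fileno(), fcntl.LOCK_UN)
    return True


def demo():
    """Return (mode of 'wb', mode of 'wb+', shared lock ok, exclusive lock ok)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.lock")  # hypothetical lock file name
        wb_mode = access_mode(path, "wb")        # write-only descriptor
        wb_plus_mode = access_mode(path, "wb+")  # read-write descriptor
        with open(path, "wb+") as f:
            shared_ok = can_lock(f, fcntl.LOCK_SH)
            exclusive_ok = can_lock(f, fcntl.LOCK_EX)
    return wb_mode, wb_plus_mode, shared_ok, exclusive_ok


if __name__ == "__main__":
    print(demo())
```

On a local filesystem both lock types usually succeed in either mode, so this sketch does not reproduce the NFS failure by itself. It only makes the access-mode difference visible, which is what matters once the shared lock is requested on an NFS mount.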