radiocosmology / alpenhorn

Alpenhorn is a service for managing an archive of scientific data.
MIT License

Add timeout to `LustreHSMNodeIO._restore_wait` #188

Open ketiltrout opened 2 months ago

ketiltrout commented 2 months ago

This would avoid locking up all the workers when restore requests are abandoned, forgotten about, or ignored by HSM without alpenhorn's knowledge.

https://github.com/radiocosmology/alpenhorn/blob/4670ec285f810bef19199fd4a9cc2848d81e539e/alpenhorn/io/lustrehsm.py#L113

Probably best to have something configurable via conf file, but I want to run as-is for a while to see what timescale would be reasonable as a default here.
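
For illustration only, here is a minimal sketch of the general pattern: poll for completion, but give up once a deadline passes. It is not alpenhorn's actual `_restore_wait` code; `wait_for_restore`, `RestoreTimeout`, `is_restored`, `timeout_s`, and `poll_interval_s` are hypothetical names, and the real method may be structured quite differently.

```python
import time


class RestoreTimeout(Exception):
    """Hypothetical error: a restore request stayed pending past the deadline."""


def wait_for_restore(is_restored, timeout_s, poll_interval_s=60.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll `is_restored()` until it returns True or `timeout_s` seconds elapse.

    `is_restored` is a stand-in for whatever check reports that HSM has
    finished restoring the file. `clock` and `sleep` are injectable so the
    loop can be tested without real waiting.
    """
    # Use a monotonic clock so wall-clock adjustments cannot stretch the wait.
    deadline = clock() + timeout_s
    while True:
        if is_restored():
            return
        if clock() >= deadline:
            raise RestoreTimeout(f"restore still pending after {timeout_s} s")
        # Never sleep past the deadline.
        sleep(min(poll_interval_s, max(0.0, deadline - clock())))
```

On timeout the caller could release the worker and re-issue or re-queue the request, rather than waiting indefinitely. The value passed as `timeout_s` is what would eventually come from the config file.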