Our team finds it convenient to use NFSv4.1 filesystems when training our models with ducttape. One issue we've come across is that, on startup, ducttape acquires file locks on more than 256 files at the same time. This causes a crash on the filesystem we're using (Amazon EFS).
I understand why these locks are necessary, but does anyone have ideas for a workaround that would let us avoid holding this many file locks concurrently?
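
In case it helps with reproducing this outside ducttape, here is a minimal sketch that holds many file locks at once. It assumes the locks in question behave like POSIX advisory locks taken through `fcntl`, which I have not confirmed against ducttape's source. The helper name `hold_locks` and the lock file naming are made up for illustration.

```python
import fcntl
import os


def hold_locks(directory, count):
    """Hold an exclusive advisory lock on `count` files at the same time.

    Hypothetical helper, not part of ducttape. Returns the number of locks
    that were held simultaneously before they were released.
    """
    handles = []
    try:
        for i in range(count):
            path = os.path.join(directory, f"lock_{i}.lock")
            handle = open(path, "w")
            handles.append(handle)
            # Non-blocking exclusive POSIX lock; raises OSError if the
            # filesystem refuses or cannot grant the lock.
            fcntl.lockf(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return len(handles)
    finally:
        # Closing each file releases the lock held through it.
        for handle in handles:
            handle.close()
```

Pointing `directory` at a folder on the EFS mount and passing a `count` above 256 should show whether the limit is hit independently of ducttape. Note that the process's open-file limit (`ulimit -n`) must also be higher than `count`.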