healytwin1 closed this issue 7 months ago
@o-smirnov @KshitijT @SpheMakh
Just a hunch, could you up the max locked memory (-l)?
It is the same as what is on ilifu, where I have no problems running ddcal. The only other thing that is different is that on ilifu I am using python 3.8.3 and on the meergas cluster it is 3.8.13.
Might this be due to the limit set in /proc/sys/fs/file-max? Dane mentioned this is 8192 on the meergas cluster. @o-smirnov what do you think?
Yes that's exactly it. The SSD method loves to open a huge amount of shared memory files. Can the admin increase the limit?
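For anyone hitting the same error, a minimal sketch of how the system-wide limit could be checked from Python. It reads /proc/sys/fs/file-nr, which on Linux holds three numbers: allocated file handles, allocated-but-unused handles, and the maximum (the same value as /proc/sys/fs/file-max). The function names `parse_file_nr` and `headroom` are made up for this illustration and are not part of DDFacet, stimela, or caracal.

```python
from pathlib import Path

FILE_NR = Path("/proc/sys/fs/file-nr")  # Linux only


def parse_file_nr(text: str) -> dict:
    """Parse the three whitespace-separated fields of /proc/sys/fs/file-nr."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return {"allocated": allocated, "unused": unused, "max": maximum}


def headroom(usage: dict) -> int:
    """Number of file handles that can still be allocated system-wide."""
    return usage["max"] - usage["allocated"]


if __name__ == "__main__":
    if FILE_NR.exists():
        usage = parse_file_nr(FILE_NR.read_text())
        print(f"allocated={usage['allocated']} max={usage['max']} "
              f"headroom={headroom(usage)}")
    else:
        print(f"{FILE_NR} not found (not a Linux system?)")
```

Running this while the job is active shows how close the node gets to the limit. A small headroom with a maximum of 8192 would be consistent with the diagnosis above.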
Pinging @healytwin1 and @dane-kleiner.
This is now sorted with the update to one of the /proc/ files on the cluster. It was a cluster issue, not a ddfacet issue.
I am trying to test the ddcal worker so that we can implement the masking workaround discussed at the last developers telecon, but I am getting
OSError: [Errno 23] Too many open files in system: '/dev/shm/ddf.94/DATA:0:0'
I am using the ddcal_mask branch with stimela 1.7.6. This is running on the meergas cluster, which has 128 CPUs (I have restricted it to 8) and 1 TB of RAM.
ulimit -a gives:
which is more generous than what is available on ilifu, where I am able to run ddfacet through the ddcal worker with no issues. ilifu is running caracal using the same branch, but with stimela 1.7.7.
Any ideas?
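A note on reading this error: errno 23 is ENFILE ("Too many open files in system"), which refers to the system-wide file table, whereas errno 24 is EMFILE ("Too many open files"), which refers to the per-process limit reported by ulimit -n. That would explain why generous ulimit values do not help here. The sketch below shows one way to tell the two apart and to print the per-process limit; the helper name `classify_open_files_error` is hypothetical and only for illustration.

```python
import errno
import resource  # Unix only


def classify_open_files_error(exc: OSError) -> str:
    """Say which limit an OSError about open files most likely points to."""
    if exc.errno == errno.ENFILE:
        return "system-wide limit (/proc/sys/fs/file-max)"
    if exc.errno == errno.EMFILE:
        return "per-process limit (ulimit -n)"
    return "not an open-files limit error"


if __name__ == "__main__":
    # Per-process limit on open file descriptors, as shown by `ulimit -n`.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"RLIMIT_NOFILE soft={soft} hard={hard}")

    # Illustrative exception carrying the same errno as the one reported above.
    example = OSError(errno.ENFILE, "Too many open files in system")
    print(classify_open_files_error(example))
```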