joezuntz / cosmosis

What is lock file? #101

Closed audita-nimas closed 10 months ago

audita-nimas commented 11 months ago

Hi Joe,

I ran CosmoSIS on a supercomputer and got the following error:

Another CosmoSIS process was trying to use the same output file (output/output_3x2pt_SR_BOSS_hsc_case6.txt). 
This means one of three things:
1) you were trying to use MPI but left out the --mpi flag
2) you have another CosmoSIS run going trying to use the same filename
3) your file system cannot cope with file locks properly.  
In the last case you can set lock=F in the [output] section to disable this feature.

I am already using MPI, and the same setup runs well on another supercomputer, but it does not work on the new one. I am also not running another CosmoSIS job that uses the same filename. I tried setting lock=F and it works, so what does lock=F do? Will I still get the output with it?

joezuntz commented 11 months ago

Hi @audita-nimas

File locking is an operating system function to stop two programs using the same file at once. You can check if it is supported on your system with:

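import fcntl

# Try to take an exclusive, non-blocking lock on a scratch file
f = open("test_file_name_lock.txt", "w")
fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)  # raises OSError if the lock cannot be taken
f.close()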

If it produces an error message, then your system does not support locking, and you can set lock=F in the [output] section, as the message says, to fix the issue.
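Because lock support can depend on the file system, it may be worth running the check in the directory where the output file is written. The sketch below wraps the same fcntl.lockf call in a function that returns True or False and cleans up its scratch file. The name locking_supported and its directory argument are hypothetical and are not part of CosmoSIS.

```python
import fcntl
import os
import tempfile


def locking_supported(directory="."):
    """Return True if an exclusive, non-blocking lock can be taken on a
    scratch file created in `directory`.

    Hypothetical helper for illustration; not part of CosmoSIS.
    """
    # Create the scratch file in the directory under test, so the check
    # exercises the same file system that would hold the output file.
    fd, path = tempfile.mkstemp(prefix="lock_test_", dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            try:
                fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except OSError:
                # The lock could not be taken on this file system.
                return False
            fcntl.lockf(f, fcntl.LOCK_UN)  # release before closing
            return True
    finally:
        os.remove(path)  # always remove the scratch file


if __name__ == "__main__":
    print("locking supported:", locking_supported("."))
```

If this prints False for the output directory but True elsewhere (for example a local scratch directory), that points to the file system rather than to the MPI setup.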

If that's not the problem, and you're sure that options 1 and 2 don't apply, then you may have a problem with your MPI setup. You can check with:

XXX python -c "from mpi4py.MPI import COMM_WORLD as c;print(c.rank, c.size)"

replacing XXX with whatever command you normally use to launch MPI jobs, e.g. mpirun -n <number of tasks> or srun -n <number of tasks> depending on your system.
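As a rough guide to reading the result: with N tasks, a working setup should print N lines, with ranks 0 to N-1 and size N on every line. If every line reads "0 1", each process believes it is the only one, which would plausibly lead to several processes opening the same output file. The sketch below encodes that reading. The function name diagnose_mpi_check and its return labels are hypothetical, chosen only for illustration.

```python
def diagnose_mpi_check(lines, expected_tasks):
    """Interpret the "rank size" lines printed by the one-line mpi4py check.

    Hypothetical helper for illustration. Returns one of:
      "ok"          - ranks 0..N-1 all present, every size equals N
      "independent" - several processes each report size 1
      "unexpected"  - anything else (missing ranks, mixed sizes, ...)
    """
    # Each non-empty line looks like "2 4": the rank, then the total size.
    pairs = [tuple(int(x) for x in line.split()) for line in lines if line.strip()]
    ranks = sorted(rank for rank, _ in pairs)
    sizes = {size for _, size in pairs}

    if sizes == {expected_tasks} and ranks == list(range(expected_tasks)):
        return "ok"
    if sizes == {1} and len(pairs) > 1:
        # Every process thinks it is alone: the launcher and mpi4py
        # are not talking to each other.
        return "independent"
    return "unexpected"


if __name__ == "__main__":
    # Lines may arrive in any order, since the tasks print concurrently.
    print(diagnose_mpi_check(["0 4", "2 4", "1 4", "3 4"], expected_tasks=4))
    print(diagnose_mpi_check(["0 1", "0 1", "0 1", "0 1"], expected_tasks=4))
```

In the "independent" case, a common cause is an mpi4py build that uses a different MPI library from the one behind the launch command.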

joezuntz commented 10 months ago

Closing this now but please re-open if needed.