RDCEP / wth_gen

Fortran code for generating lots of weather files for crop models from NetCDF files.

diagnostic output in $out_dir collides when running more than one period #1

Closed by nbest937 12 years ago

nbest937 commented 12 years ago

I have make set up to run multiple nc_wth_gen processes simultaneously. These processes seem to be trying to write diagnostic output to the same location, and the program throws an error when a file (or directory?) that it wants to create is already there. We should consider how to avoid this failure mode.

make -k -j5 -l6 wth_gen
make --directory=wth_gen all
make[1]: Entering directory `/scratch/local/isi-mip-input/wth_gen'
./nc_wth_gen 1950 1980 /scratch/local/isi-mip-input/wth_gen_input/HadGEM2-ES /scratch/local/isi-mip-input/grid/HadGEM2-ES GENERIC1.WTH 2 1 > log/GENERIC1.LOG
./nc_wth_gen 1980 2010 /scratch/local/isi-mip-input/wth_gen_input/HadGEM2-ES /scratch/local/isi-mip-input/grid/HadGEM2-ES GENERIC2.WTH 2 2 > log/GENERIC2.LOG
./nc_wth_gen 2010 2040 /scratch/local/isi-mip-input/wth_gen_input/HadGEM2-ES /scratch/local/isi-mip-input/grid/HadGEM2-ES GENERIC3.WTH 2 3 > log/GENERIC3.LOG
./nc_wth_gen 2040 2070 /scratch/local/isi-mip-input/wth_gen_input/HadGEM2-ES /scratch/local/isi-mip-input/grid/HadGEM2-ES GENERIC4.WTH 2 4 > log/GENERIC4.LOG
./nc_wth_gen 2070 2100 /scratch/local/isi-mip-input/wth_gen_input/HadGEM2-ES /scratch/local/isi-mip-input/grid/HadGEM2-ES GENERIC5.WTH 2 5 > log/GENERIC5.LOG
mkdir: cannot create directory `/scratch/local/isi-mip-input/grid/HadGEM2-ES/precip_diag.1950_1980.p1.txt': File exists
mkdir: cannot create directory `/scratch/local/isi-mip-input/grid/HadGEM2-ES/precip_diag.1980_2010.p2.txt': File exists
mkdir: cannot create directory `/scratch/local/isi-mip-input/grid/HadGEM2-ES/precip_diag.2010_2040.p3.txt': File exists
mkdir: cannot create directory `/scratch/local/isi-mip-input/grid/HadGEM2-ES/precip_diag.2040_2070.p4.txt': File exists
mkdir: cannot create directory `/scratch/local/isi-mip-input/grid/HadGEM2-ES/precip_diag.2070_2100.p5.txt': File exists
mkdir: cannot create directory `/scratch/local/isi-mip-input/grid/HadGEM2-ES/temp_diag.1950_1980.p1.txt': File exists
mkdir: cannot create directory `/scratch/local/isi-mip-input/grid/HadGEM2-ES/temp_diag.1980_2010.p2.txt': File exists
mkdir: cannot create directory `/scratch/local/isi-mip-input/grid/HadGEM2-ES/temp_diag.2010_2040.p3.txt': File exists
mkdir: cannot create directory `/scratch/local/isi-mip-input/grid/HadGEM2-ES/temp_diag.2040_2070.p4.txt': File exists
mkdir: cannot create directory `/scratch/local/isi-mip-input/grid/HadGEM2-ES/temp_diag.2070_2100.p5.txt': File exists
At line 274 of file nc_wth_gen.f90
Fortran runtime error: No such file or directory
make[1]: *** [GENERIC4.WTH] Error 2
STOP 1
make[1]: *** [GENERIC5.WTH] Error 1
mkdir: cannot create directory `/scratch/local/isi-mip-input/grid/HadGEM2-ES/precip_diag.1950_1980.p1.txt': File exists
mkdir: cannot create directory `/scratch/local/isi-mip-input/grid/HadGEM2-ES/precip_diag.1980_2010.p2.txt': File exists
mkdir: cannot create directory `/scratch/local/isi-mip-input/grid/HadGEM2-ES/precip_diag.2010_2040.p3.txt': File exists
mkdir: cannot create directory `/scratch/local/isi-mip-input/grid/HadGEM2-ES/precip_diag.2040_2070.p4.txt': File exists
mkdir: cannot create directory `/scratch/local/isi-mip-input/grid/HadGEM2-ES/precip_diag.2070_2100.p5.txt': File exists
mkdir: cannot create directory `/scratch/local/isi-mip-input/grid/HadGEM2-ES/temp_diag.1950_1980.p1.txt': File exists
mkdir: cannot create directory `/scratch/local/isi-mip-input/grid/HadGEM2-ES/temp_diag.1980_2010.p2.txt': File exists
mkdir: cannot create directory `/scratch/local/isi-mip-input/grid/HadGEM2-ES/temp_diag.2010_2040.p3.txt': File exists
mkdir: cannot create directory `/scratch/local/isi-mip-input/grid/HadGEM2-ES/temp_diag.2040_2070.p4.txt': File exists
mkdir: cannot create directory `/scratch/local/isi-mip-input/grid/HadGEM2-ES/temp_diag.2070_2100.p5.txt': File exists
At line 274 of file nc_wth_gen.f90
Fortran runtime error: No such file or directory
make[1]: *** [GENERIC3.WTH] Error 2

Somehow two nc_wth_gen processes are still running. I would have thought that only the first one to create these diagnostic files would survive. Maybe the attempt to create them is triggered only by some specific condition that did not occur in the second process. I will update this when more output or insight is available.
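The "mkdir: cannot create directory ... File exists" lines in the log are what a plain mkdir prints when its target is already present. As a minimal sketch of that behavior (not a claim about how nc_wth_gen creates its diagnostic output), the shell snippet below compares plain mkdir with mkdir -p in a temporary scratch area. The names work and diag_dir are hypothetical. Note that -p only tolerates an existing directory; if the path already exists as a regular file, as the .txt names in the log might suggest, mkdir -p fails too.

```shell
#!/bin/sh
# Sketch: how mkdir behaves when the target already exists.
work=$(mktemp -d)                 # scratch area so nothing real is touched
diag_dir="$work/diag"             # hypothetical diagnostics directory

mkdir "$diag_dir"                 # the "first process" creates the directory

plain_status=0
mkdir "$diag_dir" 2>/dev/null || plain_status=$?   # plain mkdir fails: path exists

p_status=0
mkdir -p "$diag_dir" || p_status=$?                # -p ignores an existing directory

touch "$work/precip_diag.txt"     # a regular file occupying the target name
file_status=0
mkdir -p "$work/precip_diag.txt" 2>/dev/null || file_status=$?  # still fails

echo "plain=$plain_status p=$p_status file=$file_status"
rm -rf "$work"
```

If the collisions come from several processes racing to create the same directory, an idempotent create of this kind is one possible mitigation; creating the directory once before the parallel jobs start is another.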

joshuaelliott commented 12 years ago

The wth_gen script is already set up to do a run in 1-8 processes simultaneously as is, right? Does it make sense to expand it further to more processes with make? How are you dividing these processes?



nbest937 commented 12 years ago

Thanks, Joshua. When I read your comment something clicked in my head. The wth_gen/run script uses qsub to parallelize the job on the cluster, but I am using make to parallelize it on a single node. I had interpreted the $n_procs parameter as the number of cores that each invocation of nc_wth_gen would use, but it is actually the total number of jobs that will be running. Because we have 5 time periods, that number should apparently be 5. This explains why 2 calls were successful (well, more successful; see issue 2). I'm trying it now. If GENERIC[345].WTH files start appearing then I can close this issue.
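To illustrate the point about $n_procs, here is a rough sketch that derives the job count from the list of periods rather than hard-coding it, and prints (without running) one nc_wth_gen command per period. It assumes that the second-to-last argument in the logged commands is $n_procs and the last is the process index; that reading is inferred from the log, where every command carried "2" followed by 1 through 5, and is not confirmed by the source. The periods variable and the colon-separated format are hypothetical.

```shell
#!/bin/sh
# Sketch: derive n_procs from the period list instead of hard-coding it.
# The argument order is inferred from the logged commands (an assumption).
in_dir=/scratch/local/isi-mip-input/wth_gen_input/HadGEM2-ES
out_dir=/scratch/local/isi-mip-input/grid/HadGEM2-ES
periods="1950:1980 1980:2010 2010:2040 2040:2070 2070:2100"

set -- $periods
n_procs=$#                        # total number of jobs, one per period

i=0
cmds=""
for p in $periods; do
  i=$((i + 1))
  start=${p%:*}                   # text before the colon
  end=${p#*:}                     # text after the colon
  line="./nc_wth_gen $start $end $in_dir $out_dir GENERIC$i.WTH $n_procs $i > log/GENERIC$i.LOG"
  cmds="$cmds$line
"
done

printf '%s' "$cmds"               # only prints the commands; nothing is executed
```

Adding or removing a period then changes the job count and the per-job index together, so the two cannot drift apart.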