I usually submit lots of jobs in parallel, particularly with hiresclim and ECmean. With the latter, I end up with concurrent processes writing to the same files (globtable.txt, globtablecs.txt, and gregory.txt) at the same time. It just happened while processing only 10 experiments:
for k in {1..5}; do ./ecm.sh k_0${k} 1950 1969; done
for k in {1..5}; do ./ecm.sh l_0${k} 1950 1969; done
In the resulting globtable.txt, the line for experiment l_05 is truncated and (not shown) the line for l_01 ends with some garbage. I was aware of this issue (it is not the first time) and had put a TODO in the EC-mean.sh code. So far, my solution is to repeat the calls to tab2lin.sh on the command line:
for k in {1..5}; do $PIDIR/tab2lin.sh k_0${k} 1950 1969 >> ./globtable.txt; done
for k in {1..5}; do $PIDIR/tab2lin.sh l_0${k} 1950 1969 >> ./globtable.txt; done
Then I have a clean global table, and the experiments appear in the same order as the calls.
There are at least two ways to address this issue:
- flock (but there may be some issues over NFS file systems, which some of us use)
- [...] tab2lin.sh from EC-mean.sh [...]
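To make the flock option concrete, here is a minimal sketch of how appends to a shared table could be serialized. It assumes the `flock(1)` utility from util-linux is available and that the file system honors the lock (the NFS caveat above still applies). The helper name `append_locked`, the `.lock` file next to the table, and the demo rows are all hypothetical; this is not what EC-mean.sh currently does.

```shell
#!/usr/bin/env bash
# Sketch only: serialize appends to a shared table with flock(1).
# append_locked and the "<table>.lock" naming are hypothetical.

append_locked() {
    local table=$1 lock="$1.lock" line
    # Read the complete row from stdin first, so the lock is held
    # only for the duration of the write itself.
    line=$(cat)
    (
        # Block until this process holds an exclusive lock on fd 9.
        flock -x 9 || exit 1
        printf '%s\n' "$line" >> "$table"
    ) 9>"$lock"   # fd 9 is opened on the lock file; closing it releases the lock
}

# Demo with placeholder rows: ten concurrent writers, one table.
demo_table=$(mktemp)
for k in {1..10}; do
    printf 'exp_%02d 1950 1969 <values>\n' "$k" | append_locked "$demo_table" &
done
wait

# Every writer should have produced exactly one intact line.
wc -l < "$demo_table"
```

In EC-mean.sh the call would then look something like `$PIDIR/tab2lin.sh l_05 1950 1969 | append_locked ./globtable.txt`. Note that the lock only prevents interleaved lines: rows still land in the order the jobs finish, not in the order they were submitted.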