geodesymiami / rsmas_insar

RSMAS InSAR code
https://rsmas-insar.readthedocs.io/
GNU General Public License v3.0

Weird errors: step_io_load_list: bad array subscript in submit_jobs. #477

Closed: falkamelung closed 3 years ago

falkamelung commented 3 years ago

After running run_09_merge_burst_igram:

KokoxiliBigChunk34SenAT143, , 44 jobs: 41 COMPLETED, 3 RUNNING , 0 PENDING , 0 WAITING   .
KokoxiliBigChunk34SenAT143, , 44 jobs: 42 COMPLETED, 2 RUNNING , 0 PENDING , 0 WAITING   .
/home1/05861/tg851601/code/rsmas_insar/minsar/submit_jobs.bash: line 238: step_io_load_list: bad array subscript
(standard_in) 2: syntax error
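
For context, bash prints "bad array subscript" when, among other cases, an associative array is indexed with an empty key. Whether `step_io_load_list` is an associative array, and whether an empty key is what happens at line 238, is an assumption; the script's source is not shown here. A minimal sketch of a guarded lookup under that assumption, where `io_load`, `lookup_load`, and the values are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a per-step table; names and values are made up
# for illustration and are not taken from submit_jobs.bash.
declare -A io_load=( [run_09_merge_burst_igram]=4 )

# Return the load for a step, or a default when the key is empty or unknown.
# The empty-key test runs first, so the array is never indexed with "".
lookup_load() {
    local step="$1" default="${2:-1}"
    if [[ -z "$step" || -z "${io_load[$step]+set}" ]]; then
        echo "$default"
        return
    fi
    echo "${io_load[$step]}"
}

lookup_load run_09_merge_burst_igram
lookup_load ""
```

The guard only avoids the error message; if the key is empty because the step name was parsed incorrectly, that parsing would still need to be fixed.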
falkamelung commented 3 years ago

Another one after frequent timeouts, here for KokoxiliBigChunk36SenAT114:

Resubmitting file (/scratch/05861/tg851601/KokoxiliBigChunk36SenAT114/run_files/run_09_merge_burst_igram_23.job) with new walltime of 00:07:04
Resubmitted as jobumber: 7713109.
Timedout with walltime of 00:07:04.
Resubmitting file (/scratch/05861/tg851601/KokoxiliBigChunk36SenAT114/run_files/run_09_merge_burst_igram_2.job) with new walltime of 00:08:28
Resubmitted as jobumber: 7713117 7713182 7713184 7713185 7713186 7713187.
Timedout with walltime of 00:07:04.
Resubmitting file (/scratch/05861/tg851601/KokoxiliBigChunk36SenAT114/run_files/run_09_merge_burst_igram_22.job) with new walltime of 00:08:28
Resubmitted as jobumber: 7713188.
Timedout with walltime of 00:08:28.
Resubmitting file (/scratch/05861/tg851601/KokoxiliBigChunk36SenAT114/run_files/run_09_merge_burst_igram_2.job) with new walltime of 00:10:09
Resubmitted as jobumber: 7713189 7713192 7713195 7713197 7713198 7713199.
Timedout with walltime of 00:10:09.
Resubmitting file (/scratch/05861/tg851601/KokoxiliBigChunk36SenAT114/run_files/run_09_merge_burst_igram_2.job) with new walltime of 00:12:10
Resubmitted as jobumber: 7713200 7713202 7713248 7713350 7713374 7713382.
Timedout with walltime of 00:12:10.
Resubmitting file (/scratch/05861/tg851601/KokoxiliBigChunk36SenAT114/run_files/run_09_merge_burst_igram_2.job) with new walltime of 00:14:36
Resubmitted as jobumber: 7713385 7713396 7713399 7713401 7713403 7713408.
Timedout with walltime of 00:14:36.
Resubmitting file (/scratch/05861/tg851601/KokoxiliBigChunk36SenAT114/run_files/run_09_merge_burst_igram_2.job) with new walltime of 00:17:31
Resubmitted as jobumber: 7713418 7713420 7713422 7713424 7713428 7713430.
Timedout with walltime of 00:17:31.
Resubmitting file (/scratch/05861/tg851601/KokoxiliBigChunk36SenAT114/run_files/run_09_merge_burst_igram_2.job) with new walltime of 00:21:01
Resubmitted as jobumber: 7713432 7713435 7713457 7713458 7713459 7713460.
/home1/05861/tg851601/code/rsmas_insar/minsar/submit_jobs.bash: line 238: step_io_load_list: bad array subscript
(standard_in) 2: syntax error
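
The walltimes in the log above are consistent with each resubmission multiplying the previous walltime by 1.2 and truncating to whole seconds: 00:07:04 is 424 s, 424 × 1.2 = 508.8, and 508 s is 00:08:28. The factor is inferred from the log only, not read from submit_jobs.bash, and `next_walltime` is a hypothetical helper name. A minimal sketch of that arithmetic:

```shell
#!/usr/bin/env bash
# Hypothetical helper: scale an HH:MM:SS walltime by 1.2, rounding down to
# whole seconds. The factor 1.2 is inferred from the log, not taken from
# submit_jobs.bash. Assumes two-digit fields, as in the log.
next_walltime() {
    local h m s rest total
    h=${1%%:*}; rest=${1#*:}
    m=${rest%%:*}; s=${rest#*:}
    # Drop one leading zero so "08" and "09" are not rejected as invalid octal.
    h=${h#0}; m=${m#0}; s=${s#0}
    total=$(( h * 3600 + m * 60 + s ))
    total=$(( total * 12 / 10 ))   # integer arithmetic: x1.2, rounded down
    printf '%02d:%02d:%02d\n' $(( total / 3600 )) $(( total % 3600 / 60 )) $(( total % 60 ))
}

next_walltime 00:07:04   # prints 00:08:28
next_walltime 00:17:31   # prints 00:21:01
```

Applied repeatedly from 00:07:04, this reproduces every walltime in the log up to 00:21:01.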
falkamelung commented 3 years ago

KokoxiliChunk32SenAT172: Got a segmentation fault, but re-running worked fine. The run_08_generate_burst_igram_21_*.o output looks fine. Might the segmentation fault have occurred after the process had completed? Not according to the *.e file.

cat run_08_generate_burst_igram_21_7804797.e
/tmp/rsmas_insar/3rdparty/launcher/launcher: line 93: 451944 Segmentation fault      SentinelWrapper.py -c /scratch/05861/tg851601/KokoxiliChunk32SenAT172/configs/config_generate_igram_20170826_20170919 > /scratch/05861/tg851601/KokoxiliChunk32SenAT172/run_files/run_08_generate_burst_igram_21_20170826_20170919_$LAUNCHER_JID.o 2> /scratch/05861/tg851601/KokoxiliChunk32SenAT172/run_files/run_08_generate_burst_igram_21_20170826_20170919_$LAUNCHER_JID.e
Ovec8hkin commented 3 years ago

I think I fixed this.