Hi @Ovec8hkin
When check_job_outputs.py detects an error, run_workflow.bash exits properly, but it does not show any error message. It shows:
run_workflow.bash /scratch/05861/tg851601/unittestGalapagosSenDT128 --start 5
Started at: 2021-09-26 13:18:00
Jobfiles to run:
/scratch/05861/tg851601/unittestGalapagosSenDT128/run_files_tmp/run_05_fullBurst_resample_0.job
---------------------------------------------------------------------------------------------------------------------------------------------------------
| | | Step | Total | Step | | |
| | Extra | active | active | processed | Active | |
| File Name | tasks | tasks | tasks | jobs | jobs | Message |
---------------------------------------------------------------------------------------------------------------------------------------------------------
| run_05_ful...e_0.job | 6 | 0/400 | 0/500 | 1/1 | 0/1 | Submitted: 8509633 |
---------------------------------------------------------------------------------------------------------------------------------------------------------
Jobs submitted: 8509633
unittestGalapagosSenDT128, run_05_fullBurst_resample, 1 jobs : 1 COMPLETED , 0 RUNNING , 0 PENDING , 0 WAITING .
check_job_outputs.py /scratch/05861/tg851601/unittestGalapagosSenDT128/run_files_tmp/run_05_fullBurst_resample_0.job
This is the Open Source version of ISCE.
Some of the workflows depend on a separate licensed package.
To obtain the licensed package, please make a request for ISCE
through the website: https://download.jpl.nasa.gov/ops/request/index.cfm.
Alternatively, if you are a member, or can become a member of WinSAR
you may be able to obtain access to a version of the licensed sofware at
https://winsar.unavco.org/software/isce
checking *.e, *.o from /scratch/05861/tg851601/unittestGalapagosSenDT128/run_files_tmp/run_05_fullBurst_resample_0.job
no error found
Jobfiles to run:
/scratch/05861/tg851601/unittestGalapagosSenDT128/run_files_tmp/run_06_extract_stack_valid_region_0.job
---------------------------------------------------------------------------------------------------------------------------------------------------------
| | | Step | Total | Step | | |
| | Extra | active | active | processed | Active | |
| File Name | tasks | tasks | tasks | jobs | jobs | Message |
---------------------------------------------------------------------------------------------------------------------------------------------------------
| run_06_ext...n_0.job | 1 | 0/400 | 6/500 | 1/1 | 0/1 | Submitted: 8509649 |
---------------------------------------------------------------------------------------------------------------------------------------------------------
Jobs submitted: 8509649
unittestGalapagosSenDT128, run_06_extract_stack_valid_region, 1 jobs : 1 COMPLETED , 0 RUNNING , 0 PENDING , 0 WAITING .
check_job_outputs.py /scratch/05861/tg851601/unittestGalapagosSenDT128/run_files_tmp/run_06_extract_stack_valid_region_0.job
On the other hand, the workflow*.log shows all the error information:
cat workflow.2.log
Started at: 2021-09-26 13:18:00
Jobfiles to run:
/scratch/05861/tg851601/unittestGalapagosSenDT128/run_files_tmp/run_05_fullBurst_resample_0.job
---------------------------------------------------------------------------------------------------------------------------------------------------------
| | | Step | Total | Step | | |
| | Extra | active | active | processed | Active | |
| File Name | tasks | tasks | tasks | jobs | jobs | Message |
---------------------------------------------------------------------------------------------------------------------------------------------------------
| run_05_ful...e_0.job | 6 | 0/400 | 0/500 | 1/1 | 0/1 | Submitted: 8509633 |
---------------------------------------------------------------------------------------------------------------------------------------------------------
Jobs submitted: 8509633
unittestGalapagosSenDT128, run_05_fullBurst_resample, 1 jobs : 1 COMPLETED , 0 RUNNING , 0 PENDING , 0 WAITING .
check_job_outputs.py /scratch/05861/tg851601/unittestGalapagosSenDT128/run_files_tmp/run_05_fullBurst_resample_0.job
This is the Open Source version of ISCE.
Some of the workflows depend on a separate licensed package.
To obtain the licensed package, please make a request for ISCE
through the website: https://download.jpl.nasa.gov/ops/request/index.cfm.
Alternatively, if you are a member, or can become a member of WinSAR
you may be able to obtain access to a version of the licensed sofware at
https://winsar.unavco.org/software/isce
checking *.e, *.o from /scratch/05861/tg851601/unittestGalapagosSenDT128/run_files_tmp/run_05_fullBurst_resample_0.job
no error found
Jobfiles to run:
/scratch/05861/tg851601/unittestGalapagosSenDT128/run_files_tmp/run_06_extract_stack_valid_region_0.job
---------------------------------------------------------------------------------------------------------------------------------------------------------
| | | Step | Total | Step | | |
| | Extra | active | active | processed | Active | |
| File Name | tasks | tasks | tasks | jobs | jobs | Message |
---------------------------------------------------------------------------------------------------------------------------------------------------------
| run_06_ext...n_0.job | 1 | 0/400 | 6/500 | 1/1 | 0/1 | Submitted: 8509649 |
---------------------------------------------------------------------------------------------------------------------------------------------------------
Jobs submitted: 8509649
unittestGalapagosSenDT128, run_06_extract_stack_valid_region, 1 jobs : 1 COMPLETED , 0 RUNNING , 0 PENDING , 0 WAITING .
check_job_outputs.py /scratch/05861/tg851601/unittestGalapagosSenDT128/run_files_tmp/run_06_extract_stack_valid_region_0.job
This is the Open Source version of ISCE.
Some of the workflows depend on a separate licensed package.
To obtain the licensed package, please make a request for ISCE
through the website: https://download.jpl.nasa.gov/ops/request/index.cfm.
Alternatively, if you are a member, or can become a member of WinSAR
you may be able to obtain access to a version of the licensed sofware at
https://winsar.unavco.org/software/isce
checking *.e, *.o from /scratch/05861/tg851601/unittestGalapagosSenDT128/run_files_tmp/run_06_extract_stack_valid_region_0.job
Error: "Error" found in /scratch/05861/tg851601/unittestGalapagosSenDT128/run_files_tmp/run_06_extract_stack_valid_region_0__1.e
Error: "FileNotFoundError" found in /scratch/05861/tg851601/unittestGalapagosSenDT128/run_files_tmp/run_06_extract_stack_valid_region_0__1.e
Error: "Traceback" found in /scratch/05861/tg851601/unittestGalapagosSenDT128/run_files_tmp/run_06_extract_stack_valid_region_0__1.e
For known issues see https://github.com/geodesymiami/rsmas_insar/tree/master/docs/known_issues.md
Traceback (most recent call last):
File "/work2/05861/tg851601/stampede2/code/rsmas_insar/minsar/check_job_outputs.py", line 224, in <module>
main()
File "/work2/05861/tg851601/stampede2/code/rsmas_insar/minsar/check_job_outputs.py", line 194, in main
raise RuntimeError('Error in run_file: ' + run_file_base)
RuntimeError: Error in run_file: /scratch/05861/tg851601/unittestGalapagosSenDT128/run_files_tmp/run_06_extract_stack_valid_region
Error in run_workflow.bash: check_job_outputs.py exited with code (1).
/work2/05861/tg851601/stampede2/code/rsmas_insar/minsar/run_workflow.bash: line 1: 288278 Terminated tail -f $logfile_name
Can you modify it so that errors are also displayed on screen?
To create an error, just run the example data, then comment out copy_to_tmp in the run_05_*.job and run starting with step 5 (run_workflow.bash $PWD --start 5).
I've actually known about this for a while. It has something to do with how the script terminates and how that termination handles buffer flushing. It might not be easily fixable.
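One possible workaround, sketched under assumptions: the "Terminated tail -f $logfile_name" line in the log suggests that run_workflow.bash mirrors the log to the screen through a background `tail -f`, and that this process is killed before it has printed the last lines. If that is the case, pausing briefly before the kill may let the final error messages reach the screen. The function names `stop_tail_after_flush` and `demo` and the 2-second delay are hypothetical and are not taken from run_workflow.bash.

```bash
#!/usr/bin/env bash

# Hypothetical helper: stop a background `tail -f` only after it has had
# time to print the lines most recently appended to the log.
stop_tail_after_flush() {
    local tail_pid=$1
    sleep 2                        # assumed delay; lets tail notice the last writes
    kill "$tail_pid" 2>/dev/null
    wait "$tail_pid" 2>/dev/null   # reap the process so bash prints no "Terminated" notice
    return 0
}

# Small self-contained demonstration using a temporary log file.
demo() {
    local logfile_name
    logfile_name=$(mktemp)
    tail -f "$logfile_name" &      # mirror the log to the screen
    local tail_pid=$!
    echo "Error: simulated failure" >> "$logfile_name"
    stop_tail_after_flush "$tail_pid"
    rm -f "$logfile_name"
}
```

Calling `demo` should print the simulated error line before returning. A fixed delay is only a heuristic, so this may reduce the problem rather than remove it.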