Closed: falkamelung closed this issue 5 years ago
Hi @falkamelung,
please update first, because I fixed some minor bugs that your error from execute_runfiles.py may have come from.
process_rsmas.py runs with only 2 steps (download, process). If you don't specify a step, it runs both of them:
process_rsmas.py $TE/template
process_rsmas.py $TE/template --dostep download
process_rsmas.py $TE/template --dostep process
In the download step, it creates a run file named run_0_download_data_and_dem and then executes it with: execute_runfiles.py $TE/template 0 0
where the two 0s are the start and end indices of the run files to be executed; here only run_0_* is run. Both download_rsmas.py and dem_rsmas.py are called in this run.
In the process step, it creates all the other run files (for ISCE, PySAR, insarmaps, ...) as listed below and then calls execute_runfiles.py to run them one by one. For development, one only needs to add their script to these run files, which I can show you how to do if you want (it's easy).
/run_files/run_1_unpack_slc_topo_master
/run_files/run_2_average_baseline
/run_files/run_3_geo2rdr_resample
/run_files/run_4_extract_stack_valid_region
/run_files/run_5_merge_burst_igram
/run_files/run_6_filter_coherence
/run_files/run_7_merge_master_slave_slc
/run_files/run_8_unwrap
/run_files/run_9_pysar_small_baseline
/run_files/run_10_amplitude_ortho_geo
/run_files/run_11_email_pysar
/run_files/run_12_ingest_insarmaps
/run_files/run_13_email_insarmaps
To run these files you only need to call: execute_runfiles.py $TE/template
This runs all of them starting from run_1_*. But if you want to run one separately, you have to give the file number. For example, to run run_7_merge_master_slave_slc, you need to call: execute_runfiles.py $TE/template 7 7
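The numbered-run-file dispatch described above can be sketched roughly like this (a hypothetical simplification: `select_run_files` is an illustrative name, not rinsar's actual API, and the real execute_runfiles.py submits each selected file as batch jobs rather than returning a list):

```python
import glob
import os

def select_run_files(run_dir, start=1, end=None):
    """Return run_<start>_* through run_<end>_* in numeric order.

    Illustrative helper, not rinsar's actual code: the sort key must be
    numeric so that run_10_* comes after run_2_*.
    """
    run_files = sorted(glob.glob(os.path.join(run_dir, 'run_*')),
                       key=lambda f: int(os.path.basename(f).split('_')[1]))
    selected = []
    for run_file in run_files:
        number = int(os.path.basename(run_file).split('_')[1])
        if number >= start and (end is None or number <= end):
            selected.append(run_file)
    return selected

# select_run_files('run_files', 7, 7) -> only run_7_merge_master_slave_slc
```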
Both create_runfiles.py and process_rsmas.py have similar options, except that one uses --step and the other --dostep, which we can unify if you say so:
create_runfiles.py $TE/template --step download
process_rsmas.py $TE/template --dostep download
I have put all job default values in rinsar/defaults/job_defaults.cfg. I don't see any advantage to the short/long/... style, because then you need to define which category each job belongs to, and saying short instead of giving a walltime value makes no real difference: we can directly give the walltime value in rinsar/defaults/job_defaults.cfg.
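For illustration, a direct-walltime layout of job_defaults.cfg could look like the snippet below (the section and key names are hypothetical, not the file's actual contents; it is read here with Python's stdlib ConfigParser):

```python
from configparser import ConfigParser

# Hypothetical layout for rinsar/defaults/job_defaults.cfg: one section per
# run step, with a walltime given directly instead of a short/long/vlong label.
EXAMPLE_CFG = """
[run_1_unpack_slc_topo_master]
walltime = 4:00
memory = 3600
adjust = True

[run_6_filter_coherence]
walltime = 1:00
memory = 3600
adjust = False
"""

cfg = ConfigParser()
cfg.read_string(EXAMPLE_CFG)
print(cfg.get('run_1_unpack_slc_topo_master', 'walltime'))  # 4:00
```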
I have added run_*_amplitude_ortho_geo to create both ortho- and georectified images; both are now supported. If you want to run the script separately, you need to call: export_ortho_geo.py $TE/template
The output is all backscatters with prefix Ortho_ or Geo_, which you can find in the folder GeoOrtho_tiff. This script creates another folder named geom_master_noDEM that contains lat and lon without DEM correction; they are used for the georectified products. For the orthorectified products, it uses the lat and lon in merged/geom_master.
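The geometry choice described above can be sketched as follows (a hypothetical helper: the folder names follow the text, but the function name and layout are illustrative, not export_ortho_geo.py's actual code):

```python
import os

def geometry_dir(work_dir, product):
    """Return the lat/lon folder for a 'Geo' or 'Ortho' product.

    Illustrative sketch: georectified products use the lat/lon without
    DEM correction (geom_master_noDEM), orthorectified products use
    merged/geom_master.
    """
    if product == 'Geo':
        return os.path.join(work_dir, 'geom_master_noDEM')
    elif product == 'Ortho':
        return os.path.join(work_dir, 'merged', 'geom_master')
    raise ValueError('product must be Geo or Ortho')

print(geometry_dir('.', 'Geo'))
```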
Since yesterday, something has happened to ssara_federated_query* and it does not work!
@mirzaees Thank you. It seems to work now. I have to look at it in more detail. Can we now specify the walltime for a specific job in the template file? How would I specify a longer walltime for run_unpack_slc_topo_master?
What I tried to achieve with vlong, long, short is scaling of the walltime with the job size. Instead of specifying 12 different walltimes when we go from 5 to 20 bursts, we just give one parameter for the size of the area processed (number of bursts, or number of pixels for squeezar), and the walltimes are calculated accordingly. If the default applies for 5 bursts, then for 20 bursts we would just multiply all walltimes by a factor of 4.
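That proportional scaling could be sketched like this (a minimal sketch; the function name, the H:MM walltime format, and the 5-burst baseline are assumptions for illustration, not the actual rsmas_insar code):

```python
DEFAULT_BURSTS = 5  # assume the walltime defaults are calibrated for 5 bursts

def scale_walltime(walltime, number_of_bursts, default_bursts=DEFAULT_BURSTS):
    """Scale an 'H:MM' walltime linearly with the number of bursts."""
    hours, minutes = (int(x) for x in walltime.split(':'))
    total_minutes = hours * 60 + minutes
    scaled = int(round(total_minutes * number_of_bursts / default_bursts))
    return '{}:{:02d}'.format(scaled // 60, scaled % 60)

# 20 bursts = 4x the 5-burst default
print(scale_walltime('1:00', 20))  # 4:00
```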
Ok, I see your point. It is not able to read this from the template right now, but I will add it: a template parameter computation_time that accepts short, long and vlong to adjust the walltimes.
But do it smarter than I did. What you write above does not sound right. I just allowed for 3 different walltimes, which seems enough. We can have a function job_submission_defaults = scale_job_submission_defaults(number_of_bursts=10), and we give the number of bursts with the template (default 5). Alternatively we could say job_submission_defaults = scale_job_submission_defaults(area_processed='2000*2000') (default '1000*1000'). If you are short on time, let's for now just do the quick-and-dirty giving-the-walltime-in-the-template.
/login3/projects/scratch/insarlab/famelung/TESTBENCH1/unittestGalapagosSenDT128/run_files[1056] cat ./run_8_pysar_small_baseline_0_20500916.e
Traceback (most recent call last):
File "/nethome/famelung/test/development/rsmas_insar/sources/PySAR/pysar/prep_isce.py", line 462, in <module>
main()
File "/nethome/famelung/test/development/rsmas_insar/sources/PySAR/pysar/prep_isce.py", line 454, in main
update_mode=inps.update_mode)
File "/nethome/famelung/test/development/rsmas_insar/sources/PySAR/pysar/prep_isce.py", line 402, in prepare_stack
raise FileNotFoundError('no file found in pattern: {}'.format(filePattern))
FileNotFoundError: no file found in pattern: filt_*.unw
Traceback (most recent call last):
File "/nethome/famelung/test/development/rsmas_insar/sources/PySAR/pysar/pysarApp.py", line 1061, in <module>
main()
File "/nethome/famelung/test/development/rsmas_insar/sources/PySAR/pysar/pysarApp.py", line 1051, in main
app.run(steps=inps.runSteps, plot=inps.plot)
File "/nethome/famelung/test/development/rsmas_insar/sources/PySAR/pysar/pysarApp.py", line 980, in run
self.run_load_data(sname)
File "/nethome/famelung/test/development/rsmas_insar/sources/PySAR/pysar/pysarApp.py", line 338, in run_load_data
load_complete, stack_file, geom_file = ut.check_loaded_dataset(self.workDir, print_msg=True)[0:3]
File "/nethome/famelung/test/development/rsmas_insar/sources/PySAR/pysar/utils/utils.py", line 59, in check_loaded_dataset
raise FileNotFoundError(errno.ENOENT, os.strerror(errno.ENOENT), './INPUTS/ifgramStack.h5')
FileNotFoundError: [Errno 2] No such file or directory: './INPUTS/ifgramStack.h5'
Hi Falk,
Good idea, I will fix it today
About the error from pysar, I think it is a pysar bug. I ran all steps with execute_runfiles.py and test data, and it worked fine.
Sara
Hi @falkamelung, here some update:
Regarding what you asked about scaling job walltimes, I made a function in process_utilities called walltime_adjust. Now we don't need to add an option to the template file; it counts the number of bursts based on the boundingBox and multiplies the default walltimes accordingly. I set some short values for the walltimes, and those which need to be scaled are flagged with adjust=True in rinsar/defaults/job_defaults.cfg. You can change them if they are not appropriate.
I saw your count_burst.py. That is good, but we cannot use it until after the first run, when the geom_master folder is created. So I thought calculating based on the boundingBox and the default Sentinel burst size would be more general (that is how walltime_adjust works).
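A rough sketch of that boundingBox-based estimate (hypothetical: the ~20 km along-track length of a Sentinel-1 IW burst, the 'S N W E' bounding-box format, and the function signatures are assumptions for illustration, not the actual walltime_adjust code):

```python
BURST_LENGTH_KM = 20.0   # approximate along-track extent of one S1 IW burst
KM_PER_DEG_LAT = 111.0   # rough km per degree of latitude

def bursts_from_bbox(bounding_box):
    """Estimate bursts covered by an 'S N W E' bounding box (degrees)."""
    south, north = (float(v) for v in bounding_box.split()[:2])
    along_track_km = abs(north - south) * KM_PER_DEG_LAT
    return max(1, int(round(along_track_km / BURST_LENGTH_KM)))

def walltime_adjust(default_walltime_minutes, bounding_box, default_bursts=5):
    """Scale a default walltime (in minutes) by the estimated burst count."""
    factor = bursts_from_bbox(bounding_box) / default_bursts
    return int(round(default_walltime_minutes * factor))
```

The appeal of this approach, as described above, is that it needs nothing but the template: no geom_master folder has to exist yet.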
create_batch was slightly modified to be able to call it like below for submitting a batch_file:
jobs = cb.submit_batch_jobs(batch_file=item,
                            out_dir=os.path.join(inps.work_dir, 'run_files'),
                            memory=memorymax, walltime=walltimelimit, queue=queuename)
I am working on implementing Dask (same as for ifgram inversion) for some scripts in rinsar and pysqsar, and also on export_ortho_geo.py, to shorten its main and make the script clearer.
Thank you for the update. Sounds good. I am looking forward to trying it. Two comments:
The number of bursts is below. Maybe we can start the script, kill it, and set wall times with the number of bursts. There may be a simple function to get it.
grep Bursts run_1_unpack_slc_topo_master_0_20582539.o
Number of Bursts before cropping: 9
Number of Bursts after cropping: 3
Number of Bursts before cropping: 9
Number of Bursts after cropping: 9
Number of Bursts before cropping: 6
Number of Bursts after cropping: 3
Number of Bursts before cropping: 9
Number of Bursts after cropping: 1
Number of Bursts before cropping: 9
Number of Bursts after cropping: 9
Number of Bursts before cropping: 7
Number of Bursts after cropping: 5
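The "simple function to get it" could be as little as a regex over the .o file (a hypothetical helper; the log excerpt embedded below is the grep output above):

```python
import re

# Excerpt of the run_1_* .o log shown above.
LOG_EXCERPT = """\
Number of Bursts before cropping: 9
Number of Bursts after cropping: 3
Number of Bursts before cropping: 9
Number of Bursts after cropping: 9
Number of Bursts before cropping: 6
Number of Bursts after cropping: 3
Number of Bursts before cropping: 9
Number of Bursts after cropping: 1
Number of Bursts before cropping: 9
Number of Bursts after cropping: 9
Number of Bursts before cropping: 7
Number of Bursts after cropping: 5
"""

def bursts_after_cropping(log_text):
    """Return the per-subswath burst counts remaining after cropping."""
    return [int(n) for n in
            re.findall(r'Number of Bursts after cropping:\s*(\d+)', log_text)]

counts = bursts_after_cropping(LOG_EXCERPT)
print(counts, sum(counts))  # [3, 9, 3, 1, 9, 5] 30
```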
Following your suggestion, I modified the walltime_adjust function based on ISCE tools. Now the number of bursts can be extracted and then used for the walltime scaling.
I don't get why you want to add a --walltime option to the scripts. Is it because of the --submit option (to submit the script itself as a job)? Because all run_files have defaults and are adjusted automatically.
Cool re 1! Re 2, yes, for --submit. I sometimes give --submit --walltime 0:10 just to start something.
Hi @mirzaees, here are a few items:
I don't see the following code in execute_runfiles.py. Do you have this elsewhere? Using my execute_runfiles may just work.
Specifying vlong, long, short through the template file is the next item I was considering doing. If you could add this (or add your thoughts on how to do it), that would be great. Once these issues are resolved, let's switch. I wish we had had this before I went crazy on the codebase!