geodesymiami / rsmas_insar

RSMAS InSAR code
https://rsmas-insar.readthedocs.io/
GNU General Public License v3.0
62 stars 23 forks source link

`sys.exit(0)` in `create_batch.submit_script()` prematurely terminates calling subprocess call in `run_operations.py` #103

Closed Ovec8hkin closed 5 years ago

Ovec8hkin commented 5 years ago

sys.exit(0), as used at the end of create_batch.submit_script() terminates the calling process upon execution. This is good for scripts called from the command line that should be terminated in certain cases, but is bad in complex routines like process_rsmas.py which are in turn called from other scripts.

In run_operations.py, process_rsmas.py is called via a subprocess command, but, upon reaching the above sys.exit(0) command above, the entire subprocess command terminates -- prematurely in this case.

However, removing the sys.exit(0) call completely results in some odd behavior in process_rsmas, so we need to come up with a different way to replicate the functionality of the sys.exit(0) call, without terminating the process.

falkamelung commented 5 years ago

Hmmmmm...check how createBatch.pl exited. That worked well with run_operations. There is an rsmas_old* repo on github which should have it.

falkamelung commented 5 years ago

So, you think run_operations was broken by the introduction of create_batch.py? I am not certain about the timing as we did not use run_operations during the government shutdown, but I believe this is what happened.

Here is the old script that was used before. It seems to have an exit(0) at the end. https://github.com/geodesymiami/roipac/blob/master/INT_SCR/createBatch_LSF.pl

Ovec8hkin commented 5 years ago

create_batch.py did nothing wrong. Just its current setup means it can't really be used within subprocess calls. I may be able to modify run_operations.py to deal with it, but in general, @2gotgrossman and I don't think we should be using the sys.exit(0) pattern in the first place. So it might be a good idea to consider how to get rid of that call.

gravelcycles commented 5 years ago

So I think the issue with removing the sys.exit(0) call is that the current working directory is changed in create_batch.py: submit_script() calls write_single_job_file() which changes the cwd (current working directory). Relevant lines here.

A quick refactor would be to change those lines to the following:

job_file_name = "{0}.job".format(job_file_name)
with open(os.path.join(work_dir, job_file_name), "w+") as job_file:
    job_file.writelines(job_file_lines)
gravelcycles commented 5 years ago

The bigger issue though is that changing the working directory is a bad practice for us, since it can have hard to spot consequences). For example, as used here, a call to create_batch didn't just submit a job but it also changed the working directory. Someone using create_batch should not expect that to happen.

This is a good problem to think about. I think a good way to work around this is to avoid changing directory all together. Still having the current_directory as a variable is probably a good idea, but having it at the module level, not at the global level, is probably also a good idea. Let's move that discussion to a separate issue.

gravelcycles commented 5 years ago

I am wrong. os.chdir() is part of the issue, but isn't the complete story. I think @Ovec8hkin have a path to refactor process_rsmas.py so that this isn't an issue, but we should still deal with os.chdir().