Open stsievert opened 4 years ago
I would imagine that HTMap has a submit file it uses internally. Could that submit file be used to generate a debugging/development submit file? I think I'd like something like this:
import htmap
options = {...}
future = htmap.map(..., map_options=htmap.MapOptions(**options))
future.debugging_submit_file
future.debugging_submit_file would specify my Docker image and transfer all the Python files HTMap needs. It'd hopefully have a comment detailing how to submit with condor_submit.
This would enable debugging on HTCondor without needing to rent an EC2 instance to debug (my current solution). Would that be possible?
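To make the request concrete, here is a minimal sketch of what generating such a debugging submit file could look like. It does not use HTMap's internals: render_submit_description is a hypothetical helper, and the image name and file list are placeholders. The submit commands themselves (universe, docker_image, request_gpus, should_transfer_files, transfer_input_files, queue) are standard HTCondor ones.

```python
def render_submit_description(options, comment=None):
    """Render a dict of submit commands as HTCondor submit-file text.

    Hypothetical helper for illustration; not part of HTMap.
    """
    lines = []
    if comment:
        # Submit files use '#' for comments, so the usage hint goes on top.
        lines.extend(f"# {line}" for line in comment.splitlines())
    for key, value in options.items():
        # Lists (e.g. input files) become comma-separated values.
        if isinstance(value, (list, tuple)):
            value = ", ".join(str(v) for v in value)
        lines.append(f"{key} = {value}")
    lines.append("queue 1")
    return "\n".join(lines) + "\n"


# Placeholder values: swap in the real image and files.
options = {
    "universe": "docker",
    "docker_image": "example/image:latest",
    "request_gpus": 1,
    "should_transfer_files": "YES",
    "transfer_input_files": ["train.py", "launch.py"],
}
text = render_submit_description(
    options, comment="Submit with: condor_submit debug.sub"
)
print(text)
```

Saving the printed text as debug.sub would give a file that can be handed to condor_submit directly.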
I really like this idea! But, I want to be careful about how we implement it. It would be possible to generate the submit description for a single component, but I'd prefer a solution that "keeps you inside Python", since the intent is to wrap up the low-level HTCondor operations behind Python(ic) APIs.
I'm thinking of something like...
htmap.interactive(func, args, kwargs, map_options=...)
which would then connect you to the job (i.e., put you in a shell) once it starts running. I'll ask around about interactive submits and condor_ssh_to_job and see what's possible.
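As a rough sketch of that idea, and not an existing HTMap API, a wrapper could write a submit description to a temporary file and hand it to condor_submit -interactive, which requests an interactive shell on an execute machine. The interactive function, its dry_run parameter, and the file name below are all assumptions made for illustration. Whether this works together with a Docker image would depend on the HTCondor version and pool configuration.

```python
import subprocess
import tempfile
from pathlib import Path


def interactive_command(submit_path):
    """Build the condor_submit invocation for an interactive job."""
    return ["condor_submit", "-interactive", str(submit_path)]


def interactive(submit_text, dry_run=False):
    """Hypothetical helper: launch an interactive job from Python.

    With dry_run=True, only return the command that would be run.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "interactive.sub"
        path.write_text(submit_text)
        cmd = interactive_command(path)
        if dry_run:
            return cmd
        # Blocks until the interactive shell exits.
        return subprocess.run(cmd, check=True).returncode


# Dry run, so this works on a machine without HTCondor installed.
cmd = interactive("request_gpus = 1\nqueue 1\n", dry_run=True)
print(cmd)
```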
I'd prefer a solution that "keeps you inside Python", ... I'm thinking of something like..
It'd be great to launch the interactive job from Python! That'd remove a lot of the HTCondor details.
I primarily use these interactive jobs for developing a single script, and would use bash on this remote machine to run the script over and over. I'd probably use it like this:
submit2:~ $ ls
launch.py finished.py train.py
submit2:~ $ python
>>> import htmap
>>> htmap.interactive(map_options=...)
# hangs while job launches
remote-machine:~ $ ls
launch.py finished.py train.py
remote-machine:~ $ python train.py
# make edits to train.py
remote-machine:~ $ python train.py
remote-machine:~ $ exit
>>> # back on submit2
Is your feature request related to a problem? Please describe.
I have a job that requires a GPU and specifies a Docker image. I need to debug on this image.
Currently, my solution is to launch an EC2 machine, copy my files over, and then start developing/debugging.
Describe the solution you'd like
A method to land an interactive job on this image. Something like https://htcondor.readthedocs.io/en/latest/users-manual/submitting-a-job.html#interactive-jobs
Describe alternatives you've considered
condor_submit. However, that's inconvenient: I don't really want to write that file and then keep my submit file and MapOptions in sync.