htcondor / htmap

High-Throughput Computing in Python, powered by HTCondor
https://htmap.readthedocs.io
Apache License 2.0

CLI documentation is low information and has jargon #208

Closed stsievert closed 4 years ago

stsievert commented 4 years ago

Is your feature request related to a problem? Please describe.

I'd like to stop all my submitted jobs and re-submit. I'm having some difficulty figuring out which command line option does that. Various options are:

$ htmap clean --help
Usage: htmap clean [OPTIONS]

  Clean up maps.

Options:
  --all       Remove non-transient maps as well.
$ htmap release
Usage: htmap release [OPTIONS] [TAGS]...

  Release maps.
  # ...

I find this documentation to be very low-information and not very clear. The --help flag doesn't add much information: in these two examples, the --help flag really only adds the word "map".

Additionally, what exactly do "release" and "clean" mean? I'm not familiar with HTCondor's jargon; to me, "release" and "clean" could both mean "remove all existing tags".

Describe the solution you'd like

Some better documentation, describing the output the command has without using any HTCondor jargon. These two pieces of documentation are good examples:

$ htmap logs --help
  Echo the path to HTMap's current log file.

  The log file rotates, so if you need to go further back in time, look at
  the rotated log files (stored next to the current log file).
$ htmap vacate
Usage: htmap vacate [OPTIONS] [TAGS]...

  Force maps to give up their claimed resources.

To me, this is clearer. It describes the output of the command without using any HTCondor jargon. I can clearly see that htmap logs will print a path to a log file.

However, htmap vacate could still use some improvement; it still has some jargon. What happens after the resources are freed? Do the maps claim new resources?

stsievert commented 4 years ago

This low-information and jargon-filled documentation is common among the other CLI commands. Here are some cases I found:

$ # output trimmed to show command description
$ htmap clean --help
Clean up maps.
$ htmap hold --help
Hold maps.
$ htmap pause --help
Pause maps
$ htmap reasons --help
Print the hold reasons for maps.
$ htmap release --help
Release maps.
$ htmap rerun --help
Rerun (part of) a map.
$ htmap resume --help
Resume maps.
$ htmap retag --help
Retag a map.
$ htmap tags --help
Print tags.

In most of these examples, --help really only adds one word. Typically htmap foo --help takes the form "Foo maps" or "Foo a map".
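
One way to make "only adds one word" concrete is a small check over the descriptions listed above. This is only a sketch and not part of HTMap: the function names, the filler-word list, and the one-word threshold are all assumptions chosen for illustration.

```python
import re

# Words treated as carrying no information in a one-line description.
# This list is an assumption made for this sketch.
FILLER = {"a", "an", "the", "of", "up", "for", "to"}


def added_words(command, description):
    """Return the words in `description` that are neither filler
    nor a form of the command name itself."""
    # Crude stemming: strip trailing "s" so that "tags" matches "tag".
    stem = command.rstrip("s")
    words = re.findall(r"[a-z']+", description.lower())
    return [w for w in words if w not in FILLER and not w.startswith(stem)]


def is_low_information(command, description, max_added=1):
    """Flag a description that adds at most `max_added` new words."""
    return len(added_words(command, description)) <= max_added


# Descriptions copied from the --help output quoted in this issue.
descriptions = {
    "clean": "Clean up maps.",
    "hold": "Hold maps.",
    "release": "Release maps.",
    "retag": "Retag a map.",
    "logs": "Echo the path to HTMap's current log file.",
}

for command, text in descriptions.items():
    if is_low_information(command, text):
        print(f"htmap {command}: description adds little beyond the command name")
```

A check like this says nothing about jargon, so it would only catch the "Foo maps" pattern, not descriptions such as the one for htmap vacate.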

JoshKarpel commented 4 years ago

Very good point!

stsievert commented 4 years ago

How do I completely clean my submit node? I want to immediately stop all jobs and start completely fresh. Currently, I'm doing this:

htmap vacate --all
htmap release --all
htmap pause --all
htmap remove --all
htmap clean --all

remove seems most relevant, but it's at the end because it takes a long time to complete (waiting for jobs to finish?). I need to enter a keyboard interrupt, then find the job ID 123f with condor_q and manually enter each job ID (it can't find all the jobs for 123, so I need to enter condor_rm 123.4).

JoshKarpel commented 4 years ago

htmap clean --all or htmap remove --all should end up with the same result. You don't need the preceding commands. Both commands will wait for the jobs to exit "cleanly", meaning they wait until the jobs have actually left the queue. I've noticed that this sometimes doesn't work (probably we're somehow missing events in the event log, for reasons I haven't had time to dig into). In that case, run htmap remove --all --force to remove the local data storage without confirming that the jobs have actually left the queue.
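
As a rough sketch of that fallback, the two commands can be wrapped so that the forced removal only runs when the clean one does not finish in time. This wrapper is not part of HTMap: the function name, the five-minute default timeout, and the injectable `run` parameter are assumptions for illustration. Only the two htmap invocations come from this thread.

```python
import subprocess


def remove_all_maps(timeout_s=300.0, run=subprocess.run):
    """Try `htmap remove --all`; if it does not finish within `timeout_s`
    seconds, fall back to `htmap remove --all --force`.

    Returns the command that completed. `run` is injectable so the logic
    can be exercised without HTMap installed (an assumption of this sketch).
    """
    clean_cmd = ["htmap", "remove", "--all"]
    force_cmd = ["htmap", "remove", "--all", "--force"]
    try:
        # Waits for the jobs to leave the queue; raises TimeoutExpired
        # if that takes longer than timeout_s.
        run(clean_cmd, check=True, timeout=timeout_s)
        return clean_cmd
    except subprocess.TimeoutExpired:
        # Removes local data without confirming the jobs left the queue.
        run(force_cmd, check=True)
        return force_cmd


if __name__ == "__main__":
    finished = remove_all_maps()
    print("completed:", " ".join(finished))
```

Because the forced removal does not confirm that the jobs are gone, it is still worth checking condor_q afterwards.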