AgnostiqHQ / covalent-slurm-plugin

Executor plugin interfacing Covalent with Slurm
https://covalent.xyz
Apache License 2.0
27 stars 6 forks source link

adding sshproxy integration #52

Closed wjcunningham7 closed 1 year ago

wjcunningham7 commented 1 year ago
Andrew-S-Rosen commented 1 year ago

https://github.com/AgnostiqHQ/covalent-slurm-plugin/blob/32a88d959d911aaac9102e91fd78e8a85112ab20/tests/slurm_test.py#L86 should be executor.poll_freq == 60. And if you wouldn't mind changing https://github.com/AgnostiqHQ/covalent-slurm-plugin/blob/32a88d959d911aaac9102e91fd78e8a85112ab20/tests/slurm_test.py#L100 to something > 60 like 90 that'd be great!

Andrew-S-Rosen commented 1 year ago

Summarizing my comments from Slack:

  1. I'd recommend linking to here in the instructions so that the user knows where to make the OTP.

  2. I'd change "inspect the URL provided for a secret query string" to "inspect the provided URL for a string labeled secret="

  3. For the line "In order to enable this behavior in this plugin, add the following block to your Covalent configuration while the server is stopped", I'd recommend hyperlinking "Covalent configuration" to the docs here in case the user hasn't done this before.

  4. Something to think about is that if a user opens an issue on GitHub and shares their ct.get_config(), they might inadvertently share the password and secret if they aren't knowledgeable or careful. This isn't restricted to the SLURM plugin, I should note.

  5. The user has to have the sshproxy tool either in their PATH or in the current working directory. This could be easily overlooked and should be added to the docs.

  6. It might be worthwhile briefly mentioning that the user can include the ssh key and certificate from manually running sshproxy without relying on the sshproxy interface if they don't want to.

  7. The default for the "poll_freq" kwarg in SlurmExecutor() is 30 s (see here) but should probably be raised to 60 s so you don't get the warning every time

  8. The code is looking for sshproxy, but the file currently shipped by NERSC is sshproxy.sh (or sshproxy.exe for Windows). I had sshproxy.sh in my PATH and so the sshproxy command was not recognized. Perhaps sshproxy was what was shipped at an earlier time, because I vaguely remember using it without the .sh extension before.

  9. The line here will cause a TypeError if self.cert_file is None. This is a bug that needs to be fixed before release.