Closed mristin closed 6 years ago
I'd be interested in seeing the implementation. I'd hope that those use-cases were fairly easy to implement without any additional functions e.g.
def read_bytes(shell, path):
with shell.open(path, "rb") as fileobj:
return fileobj.read()
so my inclination is to avoid putting extra complexity into the API.
P.S. if you don't want this functionality in spur, would you mind if I created a pypi package spurplus and added a spurplus.SshShell which would inherit from spur.SshShell?
Go ahead!
It's not much more complex than that; mostly some wrapping and more informative error messages.
Let me create the module spurplus, so you can keep the spur module as simple as possible, while all the extra complexity and wrapping goes to spurplus. Once it's up, you can have a look and decide if you want to merge some of the functions back to spur and I'd remove them from spurplus.
Hi Michael,
I wrote the first draft of spur+. Could you please have a look and tell me what you think: https://bitbucket.org/parqueryopen/spurplus/pull-requests/1/initial-version/diff ?
I suppose the main benefit of spur+, the way I see it now, is typing and pathlib.Path support (which we both use extensively in our code base) & some minor helpers like md5 and mkdir.
The reason why I introduced check_call and check_output was also to ensure type safety (since allow_error=True
introduces type ambiguity which would be a bit bothersome to resolve every time we call a remote command, and we do that often).
I have a few open questions:
what would be the best way to approach documentation in your opinion? Should I duplicate spur documentation or reference it somehow (if so, do you know what would be the most intuitive way to reference it)?
I open a sftp client and close it after every operation -- do you know if sftp client was meant to be used that way or should I use some kind of pool?
I couldn't figure out the type of shell_type
from SshShell constructor -- do you know what it is?
Could you please check if I got the return types of run() and spawn() correctly?
Thanks a lot for your help in advance!
The reason why I introduced check_call and check_output was also to ensure type safety (since
allow_error=True
introduces type ambiguity which would be a bit bothersome to resolve every time we call a remote command, and we do that often).
I haven't played around with static typing in python -- how does allow_error=True
introduce type ambiguity?
- what would be the best way to approach documentation in your opinion? Should I duplicate spur documentation or reference it somehow (if so, do you know what would be the most intuitive way to reference it)?
If you want to reference the docs for Spur, the easiest way would probably be a link with a relevant fragment (if you're referencing a specific part). It's probably best to make sure you're pointing at a specific version.
- I open a sftp client and close it after every operation -- do you know if sftp client was meant to be used that way or should I use some kind of pool?
As in a Paramiko SFTP client? Their docs are the best place to check, I don't know off the top of my head.
- I couldn't figure out the type of
shell_type
from SshShell constructor -- do you know what it is?
It needs a Boolean attribute supports_which
, and a method generate_run_command()
-- looking at the specific implementations should help you work out what the exact argument and return types are.
- Could you please check if I got the return types of run() and spawn() correctly?
I'm not familiar with the typing rules, but I'd expect the return type of run()
to be spur.results.ExecutionResult
, since spur.results.RunProcessError
gets raised, not returned.
Thanks a lot for the answers, I appreciate them!
I haven't played around with static typing in python -- how does allow_error=True introduce type ambiguity?
Sorry, I was tired and misread the code. It is actually raise
, not return
in results.result()
function. There is no ambiguity, the type is always spur.results.ExecutionResult
.
Regarding the paramiko's SFTP client: I have read http://docs.paramiko.org/en/2.4/api/sftp.html (and some blogs on internet), but I was not able to find out about the expected longevity of the sftp client.
In your code, you create a SFTP client whenever you call SshShell.open()
in spur/ssh.py
, line 251:
class SshShell(object)
# line 251
def open(self, name, mode="r"):
sftp = self._open_sftp_client()
sftp_file = SftpFile(sftp, sftp.open(name, mode), mode)
# ...
return sftp_file
# line 299
def _open_sftp_client(self):
return self._get_ssh_transport().open_sftp_client()
Since we never observed any problems with the performance, I suppose it's OK to always create a new SFTP client on every operation instead of keeping one connected all the time tying down the resources. Do you happen to remember -- was there any particular reason why you opened it on every open()
and did not create it at the initialization of SshClient
?
Regarding the documentation: I'll fix the version of spur dependency to spur==0.3.20
and copy the documentation starting by stating a reference to the original spur function.
Since we never observed any problems with the performance, I suppose it's OK to always create a new SFTP client on every operation instead of keeping one connected all the time tying down the resources. Do you happen to remember -- was there any particular reason why you opened it on every
open()
and did not create it at the initialization ofSshClient
?
I have no idea, I'm afraid -- I wrote it quite a long time ago.
Hi Michael,
Thanks again for all the help. If you want, you can have a look at the final implementation: https://bitbucket.org/parqueryopen/spurplus/pull-requests/1/initial-version/diff
Please let me know if you have any comments, ideas etc. I would be very grateful if you could review the README and see if that's OK with you what I wrote there. Is it OK if I use MIT license or should I use BSD-2?
I did a small benchmark to measure the overhead of opening/closing SFTP client on each operation. It's substantial (25x when using SSH server which runs on the same machine).
As soon as you give me a go, I'd publish it to pypi.
I assume you have spur as a dependency rather than copying the code? If so, then you can use whatever licence you like.
Yes, it is only a dependency (apart from copy/pasting parts of the method documentation with clear references to spur.py).
Is there nothing to be added to the readme? Should we then publish it to pypi?
Assuming you have no comments regarding the readme, I published the library to Pypi: https://pypi.org/project/spurplus/ https://bitbucket.org/parqueryopen/spurplus/
I would be very thankful if you could put a link to spurplus somewhere in your docs.
I implemented the functions get and put (file upload/download) and read_text, read_bytes, write_text, write_bytes and mkdir to mimic pathlib.Path wrapped around
spur.SshShell
to allow simpler manipulation of remote files.Would you be interested if I integrated them into spur and made a pull request?
For input arguments, I always used a union of
str
andpathlib.Path
.P.S. if you don't want this functionality in spur, would you mind if I created a pypi package spurplus and added a spurplus.SshShell which would inherit from spur.SshShell?