Open jcohen02 opened 8 years ago
Just to update the above issue, we have a workaround for this problem: first create a saga.filesystem.Directory object, then call is_file('filename') on it to check whether the target file that we wish to copy exists on the remote platform.
However, I also note that if the file exists and I then try to call open('filename') on the Directory object, this fails. Where other functions such as is_file, is_dir and make_dir on the Directory object seem to accept a relative path, open appears not to operate within the context of the Directory object. For example, given /tmp on a remote node my.remote.host containing a file test.txt:
# The following two statements succeed
remote_dir = Directory('sftp://my.remote.host/tmp/', session=s)
remote_dir.is_file('test.txt') # returns True
# This fails
remote_dir.open('test.txt')
Looking at the internal state where saga-python requests a lease for the SSH connection when remote_dir.open('test.txt') is called, it is passing the URL file://localhost/test.txt.
If the full remote URL of the file is passed to the open call then the file can be opened successfully but it would be good if this could be done using a relative path.
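A minimal sketch of that workaround, assuming only what is described above: is_file accepts a relative name, and open succeeds when given the full remote URL. The helper names absolute_remote_url and open_by_full_url are hypothetical, and FakeDirectory is a stand-in so the sketch runs without a remote host; none of these are part of saga-python.

```python
def absolute_remote_url(dir_url, relative_name):
    """Join a directory URL and a relative file name into one absolute URL."""
    return dir_url.rstrip("/") + "/" + relative_name.lstrip("/")


def open_by_full_url(remote_dir, dir_url, relative_name):
    """Open relative_name via its absolute URL, or return None if it is missing.

    remote_dir only needs is_file(name) and open(url), the two calls
    used on the Directory object in this report.
    """
    if not remote_dir.is_file(relative_name):
        return None
    return remote_dir.open(absolute_remote_url(dir_url, relative_name))


class FakeDirectory(object):
    """Hypothetical stand-in for a remote directory (not saga-python code)."""

    def __init__(self, names):
        self._names = set(names)

    def is_file(self, name):
        return name in self._names

    def open(self, url):
        # A real directory object would return a file handle here.
        return "opened " + url


fake = FakeDirectory(["test.txt"])
result = open_by_full_url(fake, "sftp://my.remote.host/tmp/", "test.txt")
missing = open_by_full_url(fake, "sftp://my.remote.host/tmp/", "nope.txt")
print(result)
```

With a real Directory object in place of FakeDirectory, the same helper would be called with the URL that was used to construct the directory.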
Hey Jeremy,
thanks for the ticket! It looks like we don't release shells correctly on the failing ops, or something.
The second problem you reported (rel path interpreted as absolute) deserves a second ticket, as it seems quite unrelated. I'll open one if you don't mind.
Hey Andre, there doesn't seem to be a commit or PR associated with this ticket. Is it fair to assume the bug with incorrect shell release still exists?
Yes, I also assume that this is not fixed. The lease manager has seen some updates wrt. garbage collection, but I doubt that this case is covered. We may want to confirm though.
I'll leave this open in that case. I think jeremy has already provided info to reproduce the issue and a workaround.
I'm experiencing the creation of stale SSH connections when attempting file transfers via the saga-python API, which ultimately result in my application either blocking when trying to make an SSH connection or throwing a "RuntimeError: LeaseObject is already leased:" exception. I'm unsure whether this is a bug or something I'm doing wrong in the way I'm using the API.
My application is using gevent and I originally thought the LeaseObject exception may be the result of some attempt at concurrent access to a shared session object. However, I've now got a simple single-threaded standalone test case based on file copying that demonstrates the problem.
I wonder if someone can see if this issue can be reproduced or whether this is something specific to the platforms that I'm using. I'm using a Mac OS X client running Python 2.7.10 and using saga-python 0.40.1. I can reproduce the problem copying files from localhost or from a remote Linux server running Ubuntu 14.04.
The test case is as follows:
1) Create a group of files containing random data:
2) First test copying of data with valid filenames:
This copy works correctly and I observe a consistent group of 3 ssh and 1 sftp processes created by the script.
3) Now retry the copy using invalid filenames - in my use case, it is sometimes the case that a file copy will be attempted using a filename that does not exist:
Replace the for loop in lines 13-14 above with:
Now run the script again with the revised loop. When the line file_obj = File(source_file_url, session=s) is run, a new SSH process is created but an exception is raised, so execution jumps to the line except DoesNotExist as e:. At this point, file_obj does not exist so close() cannot be called, but the SSH connection remains. I think this is now a stale SSH connection?
After attempting around 10 copies of files, further SSH connections cannot be made and the code hangs.
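To make the suspected failure mode concrete, here is a toy model that does not use saga-python at all. ToyPool, ToyFile and copy_all are hypothetical names, and this DoesNotExist is a local stand-in for the exception named above. The model assumes, as I suspect but have not confirmed, that a connection is acquired before the existence check fails inside the constructor. It also shows how checking for existence first (as in the is_file workaround) would sidestep the leak.

```python
class DoesNotExist(Exception):
    """Local stand-in for the exception named in the report."""


class ToyPool(object):
    """Counts open connections; a stand-in for a pool of SSH connections."""

    def __init__(self):
        self.open_connections = 0

    def acquire(self):
        self.open_connections += 1

    def release(self):
        self.open_connections -= 1


class ToyFile(object):
    """Acquires a connection in __init__, then fails if the file is missing."""

    def __init__(self, name, existing, pool):
        self._pool = pool
        pool.acquire()
        if name not in existing:
            # The connection is already held, but the caller never
            # receives an object on which close() could be called.
            raise DoesNotExist(name)

    def close(self):
        self._pool.release()


def copy_all(names, existing, pool, precheck):
    """Pretend to copy each file; return the names that were handled."""
    copied = []
    for name in names:
        if precheck and name not in existing:
            continue  # skip before any connection is acquired
        try:
            file_obj = ToyFile(name, existing, pool)
        except DoesNotExist:
            continue  # nothing to close here: the connection is leaked
        try:
            copied.append(name)
        finally:
            file_obj.close()
    return copied


existing = {"a.dat", "b.dat"}
names = ["a.dat", "missing1.dat", "b.dat", "missing2.dat"]

leaky = ToyPool()
copy_all(names, existing, leaky, precheck=False)

guarded = ToyPool()
copy_all(names, existing, guarded, precheck=True)

print(leaky.open_connections, guarded.open_connections)
```

In this model each missing file leaves one connection open when no pre-check is done, which matches the growing number of SSH processes I observe, though the real cause inside saga-python may differ.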
4) Now trying a scenario similar to my own code, where some copy tasks succeed and some fail:
Replace the for loop in lines 13-14 with:
Again, I see a rapidly growing number of SSH connections when running the script, but with this example the script always fails after some number of file copy tasks with:
I'm using saga-python in a service environment and caching session/service objects so over time, the stale connections build up and I am eventually experiencing one of the above errors.
Any help or suggestions you can give in resolving or working around these issues would be much appreciated.
Thanks, Jeremy