Closed leonnis12 closed 3 months ago
This would be very helpful for my use case as well
Thanks for the PR.
I am somewhat weary about exposing the retry mechanism timeout as a public API as I feel like the implementation is a bit of a bandaid over the real problem rather than a good solution. I know I've been dragging my heels on trying to rework the TCP transport side where my aim was to improve this situation but I'm on the fence about accepting this type of change.
The proper solution in my mind is to figure out how to detect when the socket has been "dropped" but potentially in your situation the client still thinks its connected and thus will wait for the full timeout.
I understand your concern... And yes the best solution is to know exactly when the connection has been dropped but a logic on the client side to cut off the connection after timeout is not bad either. I mean you can have this as backup if you cannot detect the dropped connection If you do not want to expose the timeout as a public api, which is a low level configuration that can mess up things, what do you say about a environment variable marked as experimental until you have a better solution?
I think I can justify an environment variable here, especially if it's helping in some situations. I would have to stress it wouldn't be covered under the public API and could change/be removed in the future.
The socket stuff can be complicated, if you know the reason why the socket was closed but never reported back to the smbprotocol then I'll love to see a reproducer for it. That'll help me test out new scenarios in the future and hopefully avoid the need for the timeout altogether.
I agree that the proper solution would be to detect when the socket has been "dropped", but also that the environment variable "fix" should be enough to help in the situations where this happens, until a proper solution can be found at least.
I've updated the pull request to not expose the timeout through the public API, and instead be available to be overridden through an environment variable marked as experimental from the Connection class.
Regarding reproducing the issue, simulating a network that drops packets from time to time, or has random response time spikes should be enough to trigger this issue, for further testing in the future. The following is an example from the network where I encountered the issue constantly.
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 99.04%. Comparing base (
2ce49ef
) to head (8e16e2b
). Report is 1 commits behind head on master.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
I'll need to sort out the CI problems on macOS separately, I believe the image version now uses the arm64 builds so the x86_64 ones are failing without further work needed. That's not your problem to deal with so thanks for the changes and agreeing to the env var solution for now!
Would it be possible to make a minor release which includes this fix, until the changes for 1.14 are ready? It would be really helpful.
v1.14.0 has been published on PyPI with this change https://pypi.org/project/smbprotocol/1.14.0/.
This PR extends the configurability of the SMB Connection class and register_session methods to allow modifying the transport.recv timeout used for both keep-alive checks that keep Windows from closing a connection at ~16 minutes, and for checking the health of the connection.
In the usecase of my application, the SMB connection is used over an unstable network, which sometimes loses the connections between the machines. This triggers Connection timed out errors from the health check implemented in _process_message_thread, but only after 10+ minutes (the hardcoded timeout) where the application hangs. I would like to be able to configure this timeout to be smaller for use cases like this, to perform the health check more often, and be able to handle the timeout error in a reasonable ammount of time.
Current behaviour:
Expected Behavior
This might also be a fix for https://github.com/jborean93/smbprotocol/issues/117 (mentions a similar problem), by allowing configuring a smaller timeout value, which does not keep the application stuck for long and allows the error to be handled in my application in a timely manner.
The PR extends the health checks added in https://github.com/jborean93/smbprotocol/pull/135 with the configurable health check interval value.