Open Abraxos opened 2 years ago
I was also able to reproduce the broken pipe issue with scp. I also see a few broken pipe messages in the ansible logs when running ansible with -vvvv. I have uploaded the ansible logs here: https://github.com/ntimo/warpgate-issue-459/tree/master/results/ansible_log_broken_pipe
Originally posted by @ntimo in https://github.com/warp-tech/warpgate/issues/459#issuecomment-1313936825
I think I might have solved it! At least, scp is not freezing up on my side anymore - wondering if this is going to help with #459 too!
@Eugeny it seems the build https://github.com/warp-tech/warpgate/actions/runs/3465220847/jobs/5787693867 failed :( so I can't test whether the fix works. Could you maybe check the build? Thanks a lot, and also thanks for the quick response / fix.
I can confirm that the fix works and the scp copy of a 10GB file now works flawlessly. Thank you @Eugeny
Awesome! Thank you so much.
Not trying to hurry, but just for my own time estimates, when do we believe that version 0.6.5 will be available?
@Abraxos I've just pushed 0.6.5: https://github.com/warp-tech/warpgate/releases/tag/v0.6.5 :v:
I am sorry to report that I tested the new version 0.6.5 and I am still getting the same issue. SCP still crashes (though it lasts maybe a few seconds longer), and for some reason the WebUI still lists the version as 0.6.4.
Perhaps there was some kind of issue in version creation?
(and yes, I made doubly sure that I am downloading the right binary, it's this one: https://github.com/warp-tech/warpgate/releases/download/v0.6.5/warpgate-v0.6.5-x86_64-linux)
... still getting the same issue. SCP still crashes
Eugeny reopened this issue
@Abraxos is the same issue occurring in combination with mc? [Yes/No]
Yes: State explicitly that mc is in play.
No: Make a fresh issue which documents that plain scp through warpgate fails.
Regards Geert Stappers
P.S. My github profile has documented how to contact me outside of github.
Yes, the same issue still happens with mc. My apologies, I neglected to clarify.
I re-installed the newest binary, which appears different from the one I got yesterday, and the issue persists with mc.
I happened to run mc this time over an SSH connection managed by the same warpgate instance that the transfer was going through. At the moment when the mc transfer failed, I also lost the SSH connection. It looked like something happened on the warpgate end causing it to disconnect all sessions, but I don't see anything in the log:
The same thing (with all sessions apparently getting dropped) is happening with SCP as well. To replicate this issue (at least for me) it's sufficient to SSH into a machine using the warpgate host, then start an scp/mc transfer of a large file from another machine, also through the warpgate host. After 10-15s you will get disconnected. If you use tmux to keep the shell alive, you will discover that the mc/scp transfer also failed at the same time.
The good news though is that the version appears correct in the WebUI.
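The reproduction steps described above could be scripted roughly as follows. This is only a sketch, not something taken from the thread: it assumes an OpenSSH scp client on the PATH, and the helper names, the file name, and the destination in the trailing comment are hypothetical placeholders to replace with your own values.

```python
import os
import subprocess
import time

CHUNK = 1024 * 1024  # write 1 MiB at a time so the generator itself uses little memory


def make_random_file(path: str, size_bytes: int) -> int:
    """Write size_bytes of random data to path and return the number of bytes written."""
    written = 0
    with open(path, "wb") as fh:
        while written < size_bytes:
            n = min(CHUNK, size_bytes - written)
            fh.write(os.urandom(n))
            written += n
    return written


def timed_scp(src: str, dest: str) -> tuple[int, float]:
    """Run scp on src -> dest and return (exit code, seconds elapsed)."""
    start = time.monotonic()
    proc = subprocess.run(["scp", src, dest])
    return proc.returncode, time.monotonic() - start


# Hypothetical usage (destination is a placeholder for a target reached through warpgate):
#   make_random_file("sample.bin", 10 * 1024**3)
#   code, secs = timed_scp("sample.bin", "user@target-via-warpgate:/tmp/")
#   print(f"scp exited with {code} after {secs:.1f}s")
```

Running the transfer while a second SSH session through the same warpgate host is open (for example inside tmux) should show whether both drop at the same moment.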
Thanks for checking - that binary update was just for the version number. No news on a fix yet.
That's OK, thank you for keeping me informed
I tried version 0.7.0 and unfortunately see the same issue with both scp and mc so far. No pressure, just updating for consistency.
I also just noticed that when you copy large files using scp, warpgate creates huge recordings for this. Not sure if this is related.
No, that's not a factor in this situation. The original message explicitly states that recording is disabled.
Just checking in, has there been any progress on this issue? This is basically the only thing keeping me from using warpgate for ALL my SSH
I think I have this issue when using warpgate for port-forwarding, e.g., forwarding a port for a proxy. Usually the connection becomes more and more belaboured, until the terminal is unresponsive and the forwarded port no longer functions.
At that point, I have to exit the session using the SSH escape sequence (enter, ~, .), as ctrl+D interrupts no longer work.
Once I reconnect, I get a less languished connection, for a while, and then the slowness re-emerges.
Marginally related - I wish it were possible to disable recordings on some role-basis, so I could disable them for more "service" accounts, rather than shell accounts.
I have this same issue and logged my debug results in this discussion: https://github.com/warp-tech/warpgate/discussions/415
I still have this problem daily and have to reconnect to make the port forwards work often.
If I don't use port forwards, all is fine...
I've fixed one particularly egregious padding calculation bug in russh and bumped it here - could you give the latest main branch a try?
Oh man that sounds like exactly the kind of thing that causes the symptoms I originally experienced. Lemme see if I can test it tonight P:
Is it by any chance in the nightly build yet?
I tried the nightly version and the effect is still the same. It cannot transfer files (or potentially SSH session contents) greater than about 1.4GiB
❯ scp sample.txt syncserver.external:/home/.../Sync/
sample.txt 6% 1365MB 10.3MB/s 30:48 ETA
client_loop: send disconnect: Broken pipe
lost connection
❯ scp sample.txt syncserver.external:/home/.../Sync/
sample.txt 7% 1437MB 6.4MB/s 49:45 ETA
client_loop: send disconnect: Broken pipe
lost connection
Looking through the logs of my instrumentation system, it says that the process gets killed with a SIGKILL, suggesting that the OS killed the process after running out of memory. So I spun up htop while running the transfer one more time and watched as warpgate consumed all the available memory on the machine running it (1GB) and then all the available swap (512M) and then promptly got killed. The RAM consumption perfectly mirrors the amount of data that has been transferred. So that's where the 1.4GB thing above is coming from.
It seems like there is still something that is saving the contents of the SSH file transfer into memory even though recording is disabled. If we fix that, we will fix this issue.
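For anyone who wants to log the same observation instead of watching htop, here is a minimal sketch that reads a process's resident memory from /proc on Linux. It is an illustration only: the function names are made up for this example, it assumes a Linux host, and you would need to look up the warpgate PID yourself (for instance with pgrep).

```python
import re
from typing import Optional


def parse_rss_kib(status_text: str) -> Optional[int]:
    """Extract the VmRSS value (in KiB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kib(pid: int) -> Optional[int]:
    """Read the current resident set size of a process (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_rss_kib(fh.read())


# Hypothetical polling loop; replace 12345 with the real warpgate PID:
#   import time
#   while True:
#       print(time.strftime("%H:%M:%S"), read_rss_kib(12345), "KiB")
#       time.sleep(1)
```

If the logged value climbs in step with the bytes transferred and the process dies near RAM plus swap (about 1.5GB on the machine described above), that matches the failures at 1365MB and 1437MB.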
So uh... weird note, I recently re-installed the OS on the server that was running warpgate and upgraded the OS to Ubuntu server 24.04. I decided to run a test, and the issue just kinda went away. Like I was able to first transfer a 21GiB file without observing RAM consumption going up (the server had 24GiB of RAM) and then a 33GiB file transferred just fine.
I have no clue what could've changed, especially since I use configuration management to back up and copy the configuration database, so the configuration for warpgate is exactly the same.
Either way, so far as I can tell, this issue is not happening on Ubuntu 24.04.
Hello,
I originally imagined that I had found a speed issue, but it turned out that speed does not appear to have been the core problem. I am running the most recent release version 0.6.4. I attempted to provide example data by comparing direct SSH connections for file transfers with a proxyjump configuration, and a warpgate configuration. I first tried it with rsync and everything worked perfectly; the speed was basically the same across all three.
I then tried to transfer files with mc and while I got speed readings for direct and jumphost, the warpgate connection would always error out, pretty quickly I might add:
(yes, my name is Eugene, just like the author of this software =D)
I then figured that maybe it's something wrong with the way that mc connects, and attempted to use scp to transfer the file instead, except I got the same kind of behavior:
and this happens pretty reliably. I was not able to transfer a 10GB random file across a Warpgate host. Please advise. Thank you.
P.S. Recording is disabled on my Warpgate instance.
P.P.S. As always, all my bug reports are done exclusively to improve this software. I really like it, and I appreciate all the devs' work. Nothing I say should be construed as ungrateful, insulting, or demeaning. Thank you for writing Warpgate, I really appreciate its existence and continued development.