gridcf / gct

Grid Community Toolkit
Apache License 2.0

HPN-SSH not present in EPEL builds #108

Open asdorsey opened 4 years ago

asdorsey commented 4 years ago

Apologies in advance if you guys don't do the builds for EPEL. I have a question that I hope you can answer.

We recently deployed some data transfer nodes on CentOS 7.6 using the (at the time) latest available GCT packages in EPEL. It appears the gsi-openssh-server package that was installed (gsi-openssh-7.4p1-4.el7.x86_64) doesn't include the HPN patch - attempting to enable the feature in /etc/gsissh/sshd_config results in an error from gsisshd and the service refusing to start.

Oct 24 19:09:03 hdtn1 systemd: Starting Cluster Controlled gsisshd...
Oct 24 19:09:03 hdtn1 gsisshd: /etc/gsissh/sshd_config line 47: Deprecated option RSAAuthentication
Oct 24 19:09:03 hdtn1 gsisshd: /etc/gsissh/sshd_config line 56: Deprecated option RhostsRSAAuthentication
Oct 24 19:09:03 hdtn1 gsisshd: /etc/gsissh/sshd_config line 149: Unsupported option DisableUsageStats
Oct 24 19:09:03 hdtn1 gsisshd: /etc/gsissh/sshd_config: line 189: Bad configuration option: HPNDisabled
Oct 24 19:09:03 hdtn1 gsisshd: /etc/gsissh/sshd_config: line 196: Bad configuration option: HPNBufferSize
Oct 24 19:09:03 hdtn1 gsisshd: /etc/gsissh/sshd_config: terminating, 2 bad configuration options
Oct 24 19:09:03 hdtn1 systemd: gsisshd.service: main process exited, code=exited, status=255/n/a
Oct 24 19:09:03 hdtn1 systemd: Failed to start Cluster Controlled gsisshd.
Oct 24 19:09:03 hdtn1 systemd: Unit gsisshd.service entered failed state.
Oct 24 19:09:03 hdtn1 systemd: gsisshd.service failed.
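For anyone triaging a similar startup failure, the log above can be read mechanically: the "Deprecated option" and "Unsupported option" lines appear to be warnings only, while the two "Bad configuration option" lines are the ones gsisshd counts before terminating. Below is a minimal Python sketch of that classification. The regular expression and the function name `classify` are illustrative assumptions derived only from the message formats visible in this log, not from any gsisshd tooling.

```python
import re

# Matches the three message kinds seen in the log above. The colon after
# "sshd_config" and after the message kind is optional because the log
# shows both spellings.
LINE_RE = re.compile(
    r"sshd_config:? line (?P<line>\d+): "
    r"(?P<kind>Deprecated option|Unsupported option|Bad configuration option):? "
    r"(?P<option>\S+)"
)

SAMPLE_LOG = """\
Oct 24 19:09:03 hdtn1 gsisshd: /etc/gsissh/sshd_config line 47: Deprecated option RSAAuthentication
Oct 24 19:09:03 hdtn1 gsisshd: /etc/gsissh/sshd_config line 56: Deprecated option RhostsRSAAuthentication
Oct 24 19:09:03 hdtn1 gsisshd: /etc/gsissh/sshd_config line 149: Unsupported option DisableUsageStats
Oct 24 19:09:03 hdtn1 gsisshd: /etc/gsissh/sshd_config: line 189: Bad configuration option: HPNDisabled
Oct 24 19:09:03 hdtn1 gsisshd: /etc/gsissh/sshd_config: line 196: Bad configuration option: HPNBufferSize
"""


def classify(log_text):
    """Split config complaints into (fatal, warnings) lists of (line, option)."""
    fatal, warnings = [], []
    for match in LINE_RE.finditer(log_text):
        entry = (int(match["line"]), match["option"])
        if match["kind"] == "Bad configuration option":
            fatal.append(entry)      # these stop the daemon from starting
        else:
            warnings.append(entry)   # reported, but not counted as fatal
    return fatal, warnings


if __name__ == "__main__":
    fatal, warnings = classify(SAMPLE_LOG)
    print("fatal:", fatal)
    print("warnings:", warnings)
```

Run against this log, the only fatal entries are the two HPN options, which is consistent with the "2 bad configuration options" count reported by gsisshd.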

I found #72 that references a discussion about dropping the HPN patch, but I can't find anything stating that this change was definitely made, and on what date that change was implemented.

Do you have any information on when HPN was removed from the EPEL package builds, and/or what was the last version that included HPN support?

matyasselmeci commented 4 years ago

@ellert, do you know? I took a brief look through the Fedora Koji but didn't find anything obvious in the changelog or the build logs.

msalle commented 4 years ago

Looking back at the thread starting with https://mailman.egi.eu/pipermail/discuss/2017-November/000100.html (and the later one starting https://mailman.egi.eu/pipermail/discuss/2018-September/000172.html), it looks like the EPEL version never had the HPN patch, only the globus-toolkit version did. Following the thread: when we started with the gridcf we decided to go with the EPEL versions and therefore also drop the HPN patch for the (now) gct. I have no idea how much work it would be to adapt it to work with the EPEL version.

asdorsey commented 4 years ago

Thanks for the updates.

I've made a half-baked attempt at getting the HPN patch into the GCT gsi-openssh package. I added the HPN-SSH patch for OpenSSH 7.4p1 as the last patch applied and modified the patch to work with the other patches in the source package. It compiles but I get a segfault in cipher-ctr-mt.c when attempting data transfers, so something is broken.

I'm not very experienced in C, so if someone else wants to give it a try I would be grateful. The modified patch is attached.

openssh-7_4_P1-hpn-14.12.modified.diff.txt

rapier1 commented 4 years ago

Hi, I am the developer of HPN-SSH. One of my old colleagues who is now at NOAA just contacted me today about gsi-openssh. First I want to start by saying that I had let the hpn-ssh patches fall way behind for various budget and life related issues. However, I've since ported everything up to OpenSSH 8.1p1. I've also fixed some problems with the multithreaded aes-ctr cipher, server logging, and a few formatting issues. I've backported that fix to 7.6p1 - 8.1p1 inclusive.

I'm very interested in ensuring that hpn-ssh remains a part of gsi-openssh and would like to help in the process if anyone would like.

I have grabbed the package files for gsissh from the fedora sources and applied my patches. Of course, it's not building because of some issue with LDAP. That said, not even the unpatched version is building because of the same LDAP issues. Probably my environment.

Anyway, I'm happy to answer questions, take feature requests, and deal with bugs. Just let me know what I can do to help out.

rapier1 commented 4 years ago

@adorsey-NOAA, @msalle So it turns out that I was getting the wrong set of package files for Fedora. I grabbed the right source RPM (8.1p1 from https://koji.fedoraproject.org/koji/buildinfo?buildID=1403143) this time and was able to apply my patch. It builds and passes all of the regression and unit tests. I haven't tested it for full functionality at this point but I'll hand it over to the people at work who understand globus better than I do to test that out shortly. If you want the patch, I've attached it below. I've also included the spec file. This will only build against openssl 1.1 due to requirements inherited from libglobus. If you need it for an older version of openssl let me know and I'll do what I can.

openssh-8.1p1-hpnssh.patch.txt gsi-openssh.spec.txt

fscheiner commented 4 years ago

Hi, I am the developer of HPN-SSH. One of my old colleagues who is now at NOAA just contacted me today about gsi-openssh. First I want to start by saying that I had let the hpn-ssh patches fall way behind for various budget and life related issues. However, I've since ported everything up to OpenSSH 8.1p1. I've also fixed some problems with the multithreaded aes-ctr cipher, server logging, and a few formatting issues. I've backported that fix to 7.6p1 - 8.1p1 inclusive.

I'm very interested in ensuring that hpn-ssh remains a part of gsi-openssh and would like to help in the process if anyone would like.

That's great news @rapier1 and very welcome!

@adorsey-NOAA, @msalle So it turns out that I was getting the wrong set of package files for Fedora. I grabbed the right source RPM (8.1p1 from https://koji.fedoraproject.org/koji/buildinfo?buildID=1403143) this time and was able to apply my patch. It builds and passes all of the regression and unit tests. I haven't tested it for full functionality at this point but I'll hand it over to the people at work who understand globus better than I do to test that out shortly. If you want the patch, I've attached it below. I've also included the spec file. This will only build against openssl 1.1 due to requirements inherited from libglobus. If you need it for an older version of openssl let me know and I'll do what I can.

CentOS 6 and 7 (and I assume RHEL and Scientific Linux 6 and 7) have OpenSSL 1.0.1[...] and 1.0.2[...] and (GSI-)OpenSSH 5.3p1 and 7.4p1 respectively.

~Will the older HPN patches from SourceForge work with these? If yes, they'll most likely lack the fixes for the problems you mentioned above, right? So support for these versions would be very useful, too, until these OSes are EOL. What do you think, would that be possible?~

Ok, just again had a look on SourceForge and the files there have been updated recently. Reading this:

Important News: Versions 14v15 for OpenSSH 7.6 through version 14v18 for OpenSSH 7.8 had bug in the multithreaded AES-CTR code that would cause occasional hangs. We believe we've identified and fixed this problem. If you run into any issues please contact at hpn-ssh@psc.edu. We can't fix problems we don't know about so we are counting on you.

...on SourceForge I conclude that the above mentioned problems were not affecting older versions of the HPN-Patches (meaning specifically the patches for OpenSSH 5.3p1 and 7.4p1)?

BTW, I'm in the process of creating GSI-OpenSSH packages for SUSE. I started with packages for OpenSUSE Leap 15.0 which uses OpenSSH 7.6p1 and will also try to integrate the HPN-Patches for OpenSSH 7.6p1 there now. Much obliged for providing these.

fscheiner commented 4 years ago

@rapier1

BTW, I'm in the process of creating GSI-OpenSSH packages for SUSE. I started with packages for OpenSUSE Leap 15.0 which uses OpenSSH 7.6p1 and will also try to integrate the HPN-Patches for OpenSSH 7.6p1 there now. Much obliged for providing these.

Hm, OpenSUSE Leap 15.0 has OpenSSL 1.1.0[...]. Should the patches from https://sourceforge.net/projects/hpnssh/files/Patches/HPN-SSH%2014v15%207.6p1/ then work at all there? The summary on SourceForge says:

Native OpenSSL 1.1 compatibility is included with OpenSSH 7.9 an on. HPN-SSH 14v18 and on are also compatible with OpenSSL 1.0.1.

...so maybe not?
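To make the compatibility question concrete, the SourceForge summary quoted above can be written as a small rule of thumb: OpenSSH releases before 7.9 have no native OpenSSL 1.1 support, so pairing them with OpenSSL 1.1 requires extra porting work. The Python sketch below encodes only that statement. The function name `needs_openssl11_porting` and the version tuples are hypothetical illustrations, and the sketch says nothing about whether a particular HPN patch revision applies cleanly.

```python
def needs_openssl11_porting(openssh_version, openssl_version):
    """Return True if this pairing needs OpenSSL 1.1 porting work.

    Based solely on the SourceForge note quoted above: native OpenSSL 1.1
    compatibility starts with OpenSSH 7.9. Versions are tuples of ints,
    for example (7, 6) for OpenSSH 7.6p1 and (1, 1, 0) for OpenSSL 1.1.0.
    """
    return openssl_version >= (1, 1) and openssh_version < (7, 9)


if __name__ == "__main__":
    # Combinations mentioned in this thread.
    combos = {
        "Leap 15.0 (OpenSSH 7.6p1, OpenSSL 1.1.0)": ((7, 6), (1, 1, 0)),
        "Leap 15.1 (OpenSSH 7.9p1, OpenSSL 1.1.x)": ((7, 9), (1, 1, 0)),
        "CentOS 7 (OpenSSH 7.4p1, OpenSSL 1.0.2)": ((7, 4), (1, 0, 2)),
    }
    for name, (openssh, openssl) in combos.items():
        print(name, "->", needs_openssl11_porting(openssh, openssl))
```

Under this reading, only the Leap 15.0 pairing is flagged, which matches the suspicion that the 7.6p1 patches may not build there as-is.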

rapier1 commented 4 years ago

@fscheiner Unfortunately getting versions of OpenSSH before 7.9 to build with OpenSSL 1.1 is a bit of a pain in the ass. I did it for 7.7p1 and 7.6p1 as an exercise but it's a tangled mess of ifdefs. I wouldn't suggest it unless it's an absolute necessity as maintenance is going to be an issue.

Also, it turns out that the SRPM I grabbed from https://kojipkgs.fedoraproject.org//packages/gsi-openssh for 8.1p1 fails when you try to do a globus auth. Note: this is after I applied the hpn-ssh patch so there might be a weird interaction that I'm not understanding. That said, the hpn-ssh patch doesn't touch the buf. Anyway, it crashes in sshbuf.c at

Program terminated with signal 11, Segmentation fault.
#0  sshbuf_reset (buf=buf@entry=0x0) at sshbuf.c:176
176             if (buf->readonly || buf->refcount > 1) {
(gdb) bt
#0  sshbuf_reset (buf=buf@entry=0x0) at sshbuf.c:176
#1  0x0000556dae6c140b in ssh_gssapi_buildmic (b=b@entry=0x0, user=user@entry=0x556dae6e64d4 "", service=0x556dafb63af0 "ssh-connection",
    context=context@entry=0x556dae6e057a "gssapi-keyex") at gss-genr.c:503
#2  0x0000556dae689182 in userauth_gsskeyex (ssh=<optimized out>) at auth2-gss.c:90
#3  0x0000556dae675c0a in input_userauth_request (type=<optimized out>, seq=<optimized out>, ssh=0x556dafb717c0) at auth2.c:408
#4  0x0000556dae6b97e9 in ssh_dispatch_run (ssh=ssh@entry=0x556dafb717c0, mode=mode@entry=0, done=done@entry=0x556dafb634a0) at dispatch.c:113
#5  0x0000556dae6b9839 in ssh_dispatch_run_fatal (ssh=ssh@entry=0x556dafb717c0, mode=mode@entry=0, done=done@entry=0x556dafb634a0) at dispatch.c:133
#6  0x0000556dae67469d in do_authentication2 (ssh=ssh@entry=0x556dafb717c0) at auth2.c:184
#7  0x0000556dae6640db in main (ac=<optimized out>, av=<optimized out>) at sshd.c:2262

rapier1 commented 4 years ago

@fscheiner

Also, is there a canonical set of patches, source code, srpms, etc that I should focus on? I'm largely focused on providing support to CentOS because that's what my community generally uses. However, I'm game for helping out on any of these but I need a clue as to where to start.

Thanks!

fscheiner commented 4 years ago

@rapier1

@fscheiner Unfortunately getting versions of OpenSSH before 7.9 to build with OpenSSL 1.1 is a bit of a pain in the ass. I did it for 7.7p1 and 7.6p1 as an exercise but it's a tangled mess of ifdefs. I wouldn't suggest it unless it's an absolute necessity as maintenance is going to be an issue.

I understand. Seems like I have to switch to Leap 15.1 (which has OpenSSH 7.9p1) anyhow, according to their lifetime page that I just checked. It was convenient to start building a GSI-OpenSSH package on Leap 15.0 as I already had a VM running it. I have to look into how to upgrade this installation to Leap 15.1. One question: As the HPN patches for OpenSSH 7.9p1 are shipped from SourceForge as multiple patches, is there some ordering needed when applying them, or can I just concatenate and apply them as a single patch?

Also, it turns out that the SRPM I grabbed from https://kojipkgs.fedoraproject.org//packages/gsi-openssh for 8.1p1 fails when you try to do a globus auth.

Just to be sure, with "globus auth" you mean with GSI proxy credential, because there's also a service from Globus called Globus Auth? And what failed, the gsisshd or gsissh? And does the authentication work for the untouched package at all - as it's pretty new?

@fscheiner

Also, is there a canonical set of patches, source code, srpms, etc that I should focus on? I'm largely focused on providing support to CentOS because that's what my community generally uses. However, I'm game for helping out on any of these but I need a clue as to where to start.

Adding @ellert here, as he should know best. AFAICT for EPEL/Fedora there's only the GSI enabling patch that's shipped with the GSI-OpenSSH source RPMS (available from the above mentioned URL for example). I also use this patch for the SUSE packages (with some reordering). I assume with HPN patches available for the currently maintained RHEL 6, 7 and 8 compatible OSes, for the future you only need to follow up on the most current Fedora versions of GSI-OpenSSH which seems to be always based on the most recent OpenSSH version. This should also provide us with patches for the future RHEL compatible OSes, as we should be able to just reuse the HPN patch(es) from Fedora's GSI-OpenSSH that will match the (GSI-)OpenSSH used in future RHEL compatible OSes.

@matyasselmeci @msalle: What's your opinion on that?

asdorsey commented 4 years ago

Just to be sure, with "globus auth" you mean with GSI proxy credential, because there's also a service from Globus called Globus Auth? And what failed, the gsisshd or gsissh? And does the authentication work for the untouched package at all - as it's pretty new?

I'm helping @rapier1 with testing. The following is just for clarification, as he is working on another approach that I hope to test later today.

Setup: CentOS 7.6 RPMs based on the source RPM from https://kojipkgs.fedoraproject.org//packages/gsi-openssh with the HPN-SSH patch added. When using GSI proxy credentials to authenticate, gsisshd segfaults. Stack trace follows.

Core was generated by `gsisshd: Adam.Dorsey [ne'.
Program terminated with signal 11, Segmentation fault.
#0  sshbuf_reset (buf=buf@entry=0x0) at sshbuf.c:176
176             if (buf->readonly || buf->refcount > 1) {
(gdb) bt
#0  sshbuf_reset (buf=buf@entry=0x0) at sshbuf.c:176
#1  0x0000556dae6c140b in ssh_gssapi_buildmic (b=b@entry=0x0, user=user@entry=0x556dae6e64d4 "", service=0x556dafb63af0 "ssh-connection",
    context=context@entry=0x556dae6e057a "gssapi-keyex") at gss-genr.c:503
#2  0x0000556dae689182 in userauth_gsskeyex (ssh=<optimized out>) at auth2-gss.c:90
#3  0x0000556dae675c0a in input_userauth_request (type=<optimized out>, seq=<optimized out>, ssh=0x556dafb717c0) at auth2.c:408
#4  0x0000556dae6b97e9 in ssh_dispatch_run (ssh=ssh@entry=0x556dafb717c0, mode=mode@entry=0, done=done@entry=0x556dafb634a0) at dispatch.c:113
#5  0x0000556dae6b9839 in ssh_dispatch_run_fatal (ssh=ssh@entry=0x556dafb717c0, mode=mode@entry=0, done=done@entry=0x556dafb634a0) at dispatch.c:133
#6  0x0000556dae67469d in do_authentication2 (ssh=ssh@entry=0x556dafb717c0) at auth2.c:184
#7  0x0000556dae6640db in main (ac=<optimized out>, av=<optimized out>) at sshd.c:2262

@rapier1 is working on an HPN-SSH patch for a gsi-openssh source RPM from https://kojipkgs.fedoraproject.org//packages/gsi-openssh/7.6p1/5.fc28.1/src/ . I've already built this package on CentOS 7.6 and successfully tested GSI authentication.

rapier1 commented 4 years ago

@fscheiner @ellert

I'll write more soon but @adorsey-NOAA just told me that he was able to successfully build, deploy, and use the hpn-ssh patches for OpenSSH 7.6p1 under CentOS 7. I'm including a link to the rpms and srpm for this. This was built under OpenSSL 1.0.2k. You'll find openssh-7.6p1-hpnssh.patch in the SOURCES directory of the srpm. I apply this patch after all of the other patches as it seemed easiest to do it that way. I'll start working on 7.7p1 and 7.8p1 shortly. As for 7.9p1 and later, I'll need to update my globus environment and start poking at what's going on in sshbuf.c. I think that's some sort of weird interaction with the openssh-8.0p1-gssapi-keyex.patch.

Also, I really don't know anything about globus so I am sorry if I use the wrong terms for things at times. As I move forward I expect I'll be picking a lot of this up.

https://www.dropbox.com/sh/odv0rv58x8tgeou/AADyZMqHW77O3ZopdSv96MRca?dl=0

asdorsey commented 4 years ago

We deployed the packages that @rapier1 built for us onto our data transfer nodes earlier this week. This solved a large part of the data transfer performance issues that our users were complaining about.

I (and the users of our data transfer systems) would love to see this work end up in the EPEL packages if that's at all possible. I'm not great at C, but please let me know if I can help in other ways like testing packages on our test environment.

fscheiner commented 4 years ago

@adorsey-NOAA @rapier1

We deployed the packages that @rapier1 built for us onto our data transfer nodes earlier this week. This solved a large part of the data transfer performance issues that our users were complaining about.

So you and your users were able to successfully test both HPN enabled GSI-OpenSSH server and client. That's good to hear. If at all possible, I'd recommend to use the (GSI-)OpenSSH version of the respective EPEL version for your OS - EPEL7, right? - as that version is still maintained in EPEL and you'll benefit from any (security) updates that are issued - after rebuilding the package.

I (and the users of our data transfer systems) would love to see this work end up in the EPEL packages if that's at all possible. I'm not great at C, but please let me know if I can help in other ways like testing packages on our test environment.

Testing new packages will surely be helpful. I think for EPEL we would need the HPN patches for the respective (GSI-)OpenSSH versions:

I think EPEL7 would be most important ATM.

@rapier1 I'll give those 7.6p1 HPN patches a try in my GSI-OpenSSH package for OpenSUSE Leap 15.0. I'm not yet finished with the GSI-OpenSSH package for Leap 15.1 as it was a lot of work to adapt the patches from Fedora to the OpenSUSE version of OpenSSH - though a manually built gsissh client from the adapted OpenSSH source from Leap 15.1 already works.

rapier1 commented 4 years ago

@fscheiner I'll work on EPEL7. Mostly I picked 7.6p1 out of a hat (and because I don't think it included the OpenSSL 1.1 patch). As I develop new ones I'll put them up at https://sourceforge.net/projects/hpnssh/. I'm out all next week but I'll try to get to 7.4 tomorrow.

rapier1 commented 4 years ago

@fscheiner Turns out that building for EPEL7 (7.4p1) really didn't take much time at all. I was able to get it to patch and compile but I don't have the place to do the functional testing right now. As soon as I have it tested I can pass the srpm to whoever you think best.

I may want to back port some of the changes I made to the multithreaded cipher as well but this should be a good start.

fscheiner commented 4 years ago

@rapier1 Small update from my side - don't feel pressed to answer before you're back :-):

I patched the openSUSE Leap 15.1 OpenSSH 7.9p1 package with GSI and HPN patches. The resulting (gsi)ssh binary works to connect to a GSI enabled sshd.

For HPN I used https://sourceforge.net/projects/hpnssh/files/Patches/HPN-SSH%2014v18%207.9p1/openssh-7_9_P1-hpn-14.18.diff/download. From looking into the RPM source package on https://www.dropbox.com/sh/odv0rv58x8tgeou/AADyZMqHW77O3ZopdSv96MRca?dl=0 this patch came close to the one you used. But as there are other patches as well and all together are much bigger than the one I used, I'd like to ask if that was the correct patch to use?

And is there a quick way to determine that the HPN features are working? From:

[...]
The HPN-SSH team (Ben Bennet and Mike Tasota) also developed a multi-threaded variant of the
AES-CTR cipher so as to allow multicored systems to distribute the burden of computing the 
keystream over multiple cores. This enhancement produces a cipher stream that is 
indistinguishable from the default AES-CTR cipher stream. The upshot of this being that it is 
backwards compliant with all existing AES-CTR implementations - no need to have the 
multithreaded variant on both sides of the connection. [...]

...I assume it's sufficient to have an HPN enabled client to at least test the multithreaded AES-CTR cipher.
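The reason the quoted text can claim backwards compatibility is a general property of counter (CTR) mode: each keystream block depends only on the key and the counter value, so it does not matter which worker computes which block, as long as the blocks are concatenated in counter order. The Python sketch below illustrates that property. It is a toy under loud assumptions: `toy_block` uses SHA-256 as a stand-in for the AES block function and is NOT a real cipher, and none of the function names come from the HPN-SSH code.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

BLOCK = 16  # bytes per keystream block, same size as an AES block


def toy_block(key: bytes, counter: int) -> bytes:
    """Stand-in for encrypting one counter block. NOT AES, illustration only."""
    return hashlib.sha256(key + counter.to_bytes(16, "big")).digest()[:BLOCK]


def keystream_sequential(key: bytes, start: int, nblocks: int) -> bytes:
    """Single-threaded keystream, one block after another."""
    return b"".join(toy_block(key, start + i) for i in range(nblocks))


def keystream_threaded(key: bytes, start: int, nblocks: int, workers: int = 4) -> bytes:
    """Same keystream, with blocks computed by a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so counter order is preserved
        blocks = pool.map(lambda i: toy_block(key, start + i), range(nblocks))
        return b"".join(blocks)


def xor(data: bytes, stream: bytes) -> bytes:
    """CTR encryption and decryption are both a plain XOR with the keystream."""
    return bytes(a ^ b for a, b in zip(data, stream))


if __name__ == "__main__":
    key = b"example key"
    message = b"payload sent over the channel" * 3
    nblocks = -(-len(message) // BLOCK)  # ceiling division
    ciphertext = xor(message, keystream_threaded(key, 0, nblocks))
    # A peer that only has the sequential implementation can still decrypt.
    print(xor(ciphertext, keystream_sequential(key, 0, nblocks)) == message)
```

Because both functions produce byte-identical keystreams, a peer cannot tell which variant generated the ciphertext, which is why testing from the client side alone is plausible.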

rapier1 commented 4 years ago

@fscheiner Hey, I'm sorry it's taken so long to get back to you. Things have been hectic (and infected by my kids with creeping crud). As for which patch to use - I'd use the one from sourceforge. The one in dropbox may or may not be somewhat out of date as I'm not monitoring that one.

There are multiple patches on sourceforge so I should probably explain that each of them implements a different set of features. So the AES-CTR patch only provides the multithreaded cipher, the Server Logging patch only includes the extended server logging, etc. The one you downloaded incorporates all available features into a single patch, so that includes the none cipher switch, the dynamic window scaling, server logging, multithreaded aes-ctr, and a patch to the scp progressmeter to show the 1 sec throughput as well as the averaged throughput. So if you want to include everything just use the patch you used.

As for functional testing - I keep meaning to build a script for that. Anyway, you can test the multithreaded cipher from the client side since the outbound encrypted data is identical to that of the nonthreaded aes-ctr cipher. If you increase the verbosity you should get a quick rundown on the number of hits and waits in the keystream threads. Likewise if the verbosity is increased even more you'll see receive buffer adjustments - that increases pretty quickly though, so it will help to be on a high BDP path. To test the none-switch you do need to have that enabled on both ends of the connection. You'll get a warning once the none cipher is engaged but the only way to really test it is to do a raw packet capture and see if you can read the payload.
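For readers unfamiliar with the term, BDP is the bandwidth-delay product: the amount of data that must be in flight to keep a path full, computed as bandwidth times round-trip time. A fixed receive window smaller than the BDP caps throughput at roughly window divided by RTT, which is the limit that dynamic window scaling is meant to lift. The Python sketch below works through that arithmetic. The link speed, RTT and 2 MiB window are illustrative values chosen for the example, not measurements from this thread, and the function names are hypothetical.

```python
def bdp_bytes(bandwidth_bits_per_s: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes: bits/s * seconds / 8."""
    return bandwidth_bits_per_s * rtt_ms // 8000


def window_limited_bits_per_s(window_bytes: int, rtt_ms: int) -> int:
    """Upper bound on throughput when at most one window is sent per RTT."""
    return window_bytes * 8 * 1000 // rtt_ms


if __name__ == "__main__":
    link = 10_000_000_000      # assumed 10 Gbit/s path
    rtt = 50                   # assumed 50 ms round-trip time
    window = 2 * 1024 * 1024   # assumed fixed 2 MiB window, for illustration

    print("BDP (bytes):", bdp_bytes(link, rtt))
    print("Ceiling with fixed window (bit/s):", window_limited_bits_per_s(window, rtt))
```

With these assumed numbers the BDP is 62,500,000 bytes, while the fixed window caps the transfer at about 335 Mbit/s. On a low-latency LAN the gap mostly disappears, which is why buffer adjustments are easier to observe on a high BDP path.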

Also, I should have a version of 7.4 for CentOS 7 up shortly. I am having some problems with building 7.8 for CentOS 7 but I'm not going to spend a lot of time on that. Once I have a functional CentOS 8 environment I'll start working on 7.8 there.

A lot of the delay is just getting the environments set up. I have a coworker who generally does that but between SC19 and TechEx and Thanksgiving and etc he's been a little oversubscribed.

Lastly, is there anyone else I should be talking to about hpn-ssh? I'd like to do whatever is necessary to maintain its value for the community.

fscheiner commented 4 years ago

@rapier1

@fscheiner Hey, I'm sorry it's taken so long to get back to you. Things have been hectic (and infected by my kids with creeping crud)

No issue at all. I'm just happy that you found the time to continue the work on the HPN patches. That's greatly appreciated. :-)

There are multiple patches on sourceforge so I should probably explain that each of them implements a different set of features. So the AES-CTR patch only provides the multithreaded cipher, the Server Logging patch only includes the extended server logging, etc. The one you downloaded incorporates all available features into a single patch, so that includes the none cipher switch, the dynamic window scaling, server logging, multithreaded aes-ctr, and a patch to the scp progressmeter to show the 1 sec throughput as well as the averaged throughput. So if you want to include everything just use the patch you used.

Thank you, that's useful information. I also made progress with packages for SUSE: Lightly tested packages including the HPN and GSI patches for OpenSSH 7.9p1 are now available for SLES15 (SP1) and OpenSUSE Leap 15.1 (https://build.opensuse.org/package/show/home:frank_scheiner:gct/gsi-openssh).

As for functional testing - I keep meaning to build a script for that. Anyway, you can test the multithreaded cipher from the client side since the outbound encrypted data is identical to that of the nonthreaded aes-ctr cipher. If you increase the verbosity you should get a quick rundown on the number of hits and waits in the keystream threads. Likewise if the verbosity is increased even more you'll see receive buffer adjustments - that increases pretty quickly though, so it will help to be on a high BDP path. To test the none-switch you do need to have that enabled on both ends of the connection. You'll get a warning once the none cipher is engaged but the only way to really test it is to do a raw packet capture and see if you can read the payload.

Ok, I'll have a look into this.

Also, I should have a version of 7.4 for CentOS 7 up shortly. I am having some problems with building 7.8 for CentOS 7 but I'm not going to spend a lot of time on that. Once I have a functional CentOS 8 environment I'll start working on 7.8 there.

Yeah, supporting a 7.8 on CentOS 7 would anyhow require constant "backporting" of patches from CentOS 8 instead of just using the patches for the 7.4 maintained in CentOS 7.

Lastly, is there anyone else I should be talking to about hpn-ssh? I'd like to do whatever is necessary to maintain its value for the community.

I don't know of any specific party, but a mail to our discuss@gridcf.org mailing list would probably reach a good portion of interested people. Posts to the list are moderated until you subscribe (details on gridcf.org, archive on https://mailman.egi.eu/pipermail/discuss/). I can approve both. Because of the upcoming holidays, it may be better to wait until after the holidays before posting. BTW, have a nice holiday time and - if we don't hear from each other earlier - until next year. :-)

asdorsey commented 4 years ago

I hope everyone is doing well currently with the ongoing chaos.

I was wondering if there had been any progress on this issue, and if there was anything I could do to help.

EDIT: @rapier1 We've just noticed an issue with the test set of packages that you built for us a while ago. We're seeing messages like the following:

Apr  6 10:03:20 hdtn1 gsisshd[375932]: Disconnected from P\211I{\376\177

It looks like the IP that's supposed to be in that message is malformed somehow. Please let me know if I can provide more information.
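To gauge how often this happens, one option is to scan the syslog for "Disconnected from" lines and check whether the text that follows contains anything that parses as an IP address. The Python sketch below does that with the standard `ipaddress` module. The function name `peer_is_valid` is hypothetical, and the assumption that a healthy line carries the address as a whitespace-separated token after the marker is a guess about the message layout, not something confirmed in this thread.

```python
import ipaddress

MARKER = "Disconnected from "


def peer_is_valid(line: str):
    """Check the peer field of a 'Disconnected from' log line.

    Returns None if the line is not a disconnect message, True if some
    token after the marker parses as an IPv4 or IPv6 address, and False
    otherwise (for example when the field is garbled).
    """
    _, found, rest = line.partition(MARKER)
    if not found:
        return None
    for token in rest.split():
        try:
            ipaddress.ip_address(token)
            return True
        except ValueError:
            continue
    return False


if __name__ == "__main__":
    garbled = r"Apr  6 10:03:20 hdtn1 gsisshd[375932]: Disconnected from P\211I{\376\177"
    # Hypothetical healthy line using a documentation-range address.
    healthy = "Apr  6 10:03:20 hdtn1 gsisshd[375932]: Disconnected from 192.0.2.10 port 51234"
    for sample in (garbled, healthy):
        print(peer_is_valid(sample))
```

Counting the `False` results over a day of logs would show whether every disconnect is affected or only some of them.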

rapier1 commented 4 years ago

I don't think that's specifically from the hpn-ssh code but I can give it a look and see what might be happening. My guess is that there is a malformed printf somewhere.

msalle commented 4 years ago

Hi, this might be tricky to debug. I just had a look and "Disconnected from" probably comes from https://github.com/openssh/openssh-portable/blob/master/packet.c#L1867. In older versions that was part of sshpkt_fatal() itself, e.g. in 7.5, packet.c#L2124. I tried tracing back to where it goes wrong, but I can't see an obvious problem: remote_id traces via fmt_connection_id(), probably ssh_remote_ipaddr(), get_peer_ipaddr() to get_socket_address() in canohost.c#L68. It's all part of upstream openssh, but that doesn't mean that the corrupted data is also caused by upstream.

rapier1 commented 4 years ago

Sorry I didn't get back to you sooner. I let this slide off of my plate because I wasn't able to replicate the problem on my end - which doesn't mean it's not real. How often do you run into this issue? Is it every time? Frequently? Occasionally? Does it happen with a specific remote host or have you seen it on multiple hosts? Lastly, do you know if this is via IPv4 or IPv6?

As an aside - I got new funding to develop hpn-ssh so I will have more time to focus on these issues and roll out improvements.

asdorsey commented 4 years ago

The issue occurs every time a user disconnects from a DTN, so every time that "Disconnected from..." message is printed, the output is garbled. The DTNs are IPv4 only.

This is happening with the gsi-openssh-7.6p1-5 RPMs you gave me in November 2019. If you have a different or newer set of RPMs, I could test those on our test system and see if they have this issue as well. You mentioned in an earlier comment on this issue that you had packages based on OpenSSH 7.4 as well.

rapier1 commented 4 years ago

Let me go back and check something. I was talking to the gsi people and I think they picked up the hpn-ssh stuff again but let me see what I can find. Is this for CentOS?

asdorsey commented 4 years ago

Yes, we're running CentOS 7.7 on the DTNs.

fscheiner commented 4 years ago

Let me go back and check something. I was talking to the gsi people and I think they picked up the hpn-ssh stuff again but let me see what I can find. Is this for CentOS?

Sorry @rapier1, the HPN patches weren't yet included in the GSI-OpenSSH packages for EPEL and Fedora AFAICT. I am currently re-enabling them for SUSE packages and need to test them afterwards. An issue with the GSI patch and missing time kept me from continuing this earlier. I'll then see into how the HPN patches can be included in the GSI-OpenSSH packages for EPEL and Fedora - if possible.

@ellert, @matyasselmeci, @msalle: Back in December 2019 I successfully tested sending and receiving file data with multithreaded AES-CTR enabled by using a gsiscp from a GSI-OpenSSH 7.9p1 package w/HPN patches for openSUSE Leap 15.1 against an HPN enabled gsisshd at a remote site. So the client already worked properly for me. I haven't yet tested the server - I don't remember exactly why, but most likely because of the - since then solved - issue with the GSI patch which prevented connections to gsisshd's from 7.8p1 and up. So far I didn't see any issues during inclusion of the HPN patches into GSI-OpenSSH 7.6p1 and 7.9p1 for SUSE, just a few reorderings were needed. I included the HPN patch on top of the GSI patch and all the other SUSE patches for the SUSE packages.

So what do you think about starting to include the HPN patches in the EPEL/Fedora packages?

msalle commented 4 years ago

If we (you (-; ) can test them and show them to work, then I certainly think it would be a very valuable addition. I personally probably won't have many cycles for it the coming month.

asdorsey commented 4 years ago

I would be very happy to test any packages that you guys can build for me. I have test DTNs that I can modify as needed.

fscheiner commented 4 years ago

@msalle

If we (you (-; ) can test them and show them to work, then I certainly think it would be a very valuable addition. I personally probably won't have many cycles for it the coming month.

Sure :-), but it will take me some time.

@adorsey-NOAA

I would be very happy to test any packages that you guys can build for me. I have test DTNs that I can modify as needed.

Ok, great, then I could spread the load of testing.

fscheiner commented 4 years ago

Geez, spoken too soon. :-/ I have a compile error for GSI-OpenSSH 7.6p1 w/HPN for SLES 15:

from the build log:

[...]
[   80s] gcc -fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -fpie -fstack-protector -pipe -Wall -Wpointer-arith -Wuninitialized -Wsign-compare -Wformat-security -Wsizeof-pointer-memaccess -Wno-pointer-sign -Wno-unused-result -fno-strict-aliasing -D_FORTIFY_SOURCE=2 -ftrapv -fno-builtin-memset -fstack-protector-strong -fPIE   -I. -I.. -I. -I./..  -D_XOPEN_SOURCE=600 -D_BSD_SOURCE -D_DEFAULT_SOURCE -I/usr/include/editline -DLDAP_DEPRECATED -DOPENSSL_LOAD_CONF -I/usr/include/globus  -DHAVE_CONFIG_H -c bsd-snprintf.c
[   80s] cipher-ctr-mt.c: In function 'ssh_aes_ctr':
[   80s] cipher-ctr-mt.c:442:18: error: dereferencing pointer to incomplete type 'EVP_CIPHER_CTX {aka struct evp_cipher_ctx_st}'
[   80s]    ssh_ctr_inc(ctx->iv, AES_BLOCK_SIZE);
[   80s]                   ^~
[   80s] cipher-ctr-mt.c: In function 'evp_aes_ctr_mt':
[   80s] cipher-ctr-mt.c:652:20: error: storage size of 'aes_ctr' isn't known
[   80s]   static EVP_CIPHER aes_ctr;
[   80s]                     ^~~~~~~
[   80s] cipher-ctr-mt.c:654:29: error: invalid application of 'sizeof' to incomplete type 'EVP_CIPHER {aka struct evp_cipher_st}'
[   80s]   memset(&aes_ctr, 0, sizeof(EVP_CIPHER));
[   80s]                              ^~~~~~~~~~
[   80s] cipher-ctr-mt.c:652:20: warning: unused variable 'aes_ctr' [-Wunused-variable]
[   80s]   static EVP_CIPHER aes_ctr;
[   80s]                     ^~~~~~~
[   80s] cipher-ctr-mt.c:667:1: warning: control reaches end of non-void function [-Wreturn-type]
[   80s]  }
[   80s]  ^
[   80s] make: *** [Makefile:171: cipher-ctr-mt.o] Error 1
[   80s] make: *** Waiting for unfinished jobs....
[...]

Maybe an include is missing, but it could also be due to the build using OpenSSL 1.1.0, where these structs appear to be incomplete types from the caller's point of view. I see conditionals for different OpenSSL versions in later versions of the HPN patch.

@rapier1 The HPN patch 14v15 for OpenSSH 7.6p1 doesn't work with OpenSSL 1.1.0, right?

I used this RPM spec file and this adapted HPN patch which is based on https://sourceforge.net/projects/hpnssh/files/Patches/HPN-SSH%2014v15%207.6p1/openssh-7_6_P1-hpn-14.15.diff/download.

asdorsey commented 4 years ago

Is there any progress on adding HPN-SSH to the GCT SSH server packages in EPEL for CentOS/RHEL? Is there anything I can test or help with?

fscheiner commented 4 years ago

Is there any progress on adding HPN-SSH to the GCT SSH server packages in EPEL for CentOS/RHEL? Is there anything I can test or help with?

I have created GSISSH packages with HPN functionality for EPEL7 based on the GSISSH source package from EPEL7. This is based on OpenSSH 7.4p1, so the HPN 14.13 patch was used. Oh, I just noticed that @rapier1 already created a source RPM for this version with GSI and HPN support. I had not noticed that it was already available on SourceForge. Bummer, that could have saved me some time. Anyhow, functionality should be identical, and both are based on the same source package from EPEL7.

I needed to make a change to the server version string as provided by the original HPN 14.13 patch. This does not seem to hinder detection of an HPN-enabled gsisshd.
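To illustrate why a modified version string can still be recognised: the sketch below assumes that HPN awareness is inferred from the peer's version banner by a plain substring match on "hpn". That is an assumption about the general idea, not a copy of the check in the HPN patch, and the function name `is_hpn_banner` is made up for this example.

```python
def is_hpn_banner(banner: str) -> bool:
    """Return True if an SSH version banner looks HPN-enabled.

    Assumption: an OpenSSH peer is treated as HPN-aware when its banner
    contains the substring "hpn". This mirrors the idea only and is not
    necessarily the exact logic used by the HPN patch.
    """
    return "OpenSSH" in banner and "hpn" in banner.lower()


# Banner of the form logged for an HPN-enabled GSI client.
hpn_client = "OpenSSH_7.5p1b-GSI NMOD_3.19-hpn14v13 GSI"
# Stock OpenSSH banner without any HPN marker.
stock_client = "OpenSSH_7.4"

print(is_hpn_banner(hpn_client))    # True
print(is_hpn_banner(stock_client))  # False
```

Under that assumption, changes to other parts of the version string would not affect detection as long as the "hpn" marker is kept.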

The only problem I noticed during my testing is that the gsisshd from this package does not seem to work correctly when using the multithreaded AES-CTR cipher. This happens with the 7.4p1 gsiscp from the corresponding clients package and also with the 7.9p1 gsiscp from openSUSE Leap 15.1 (also with HPN patches). I don't see an obvious issue in the debug output, but the client exits with status 1 without transferring data, while the server continues to run unaffected. Running gsisshd under gdb didn't give me any hints either. @rapier1: if you like, we can try to debug this. In general, though, HPN functionality seems to work in this version: I see window scaling happening when using a cipher other than the multithreaded AES cipher, which stock SSH does not do, IIUC.

The 7.4p1 client worked correctly for me, as I could successfully transfer and receive data from a 7.9p1 gsisshd with HPN enabled when using the multithreaded AES-CTR cipher:

$ gsiscp -c aes256-ctr -P 2222 server.domain.tld:100M /dev/null
[...]

$ gsiscp -c aes256-ctr -P 2222 ./100M server.domain.tld:/dev/null
[...]
debug1: REQUESTED ENC.NAME is 'aes256-ctr'
debug1: kex: server->client cipher: aes256-ctr MAC: umac-64-etm@openssh.com compression: none
debug1: REQUESTED ENC.NAME is 'aes256-ctr'
debug1: kex: client->server cipher: aes256-ctr MAC: umac-64-etm@openssh.com compression: none
[...]
debug2: set_newkeys: mode 1
debug1: ssh_set_newkeys: rekeying after 493 output blocks (15640 bytes total)
debug1: spawned a thread
debug1: spawned a thread
debug1: rekey after 4294967296 blocks
debug1: dequeue packet: 90
debug3: send packet: type 90
debug1: dequeue packet: 80
debug3: send packet: type 80
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug3: receive packet: type 21
debug1: SSH2_MSG_NEWKEYS received
debug2: set_newkeys: mode 0
debug1: ssh_set_newkeys: rekeying after 704 input blocks (21960 bytes total)
debug1: spawned a thread
debug1: spawned a thread
debug1: rekey after 4294967296 blocks
[...]

...and the debug output shows that multiple threads were spawned.
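A quick way to check this in a longer debug log is to count the "spawned a thread" messages that follow each `set_newkeys` call. The sketch below is only an illustration: the helper name `threads_per_mode` is made up, the sample lines are an excerpt of the output above, and it assumes the usual OpenSSH convention that mode 1 is the outgoing direction and mode 0 the incoming one.

```python
from collections import Counter


def threads_per_mode(debug_lines):
    """Count "spawned a thread" messages seen after each set_newkeys call."""
    counts = Counter()
    mode = None
    for line in debug_lines:
        if "set_newkeys: mode" in line:
            # e.g. "debug2: set_newkeys: mode 1" -> 1
            mode = int(line.rsplit("mode", 1)[1])
        elif "spawned a thread" in line and mode is not None:
            counts[mode] += 1
    return dict(counts)


sample = [
    "debug2: set_newkeys: mode 1",
    "debug1: ssh_set_newkeys: rekeying after 493 output blocks (15640 bytes total)",
    "debug1: spawned a thread",
    "debug1: spawned a thread",
    "debug1: rekey after 4294967296 blocks",
    "debug2: set_newkeys: mode 0",
    "debug1: ssh_set_newkeys: rekeying after 704 input blocks (21960 bytes total)",
    "debug1: spawned a thread",
    "debug1: spawned a thread",
    "debug1: rekey after 4294967296 blocks",
]

result = threads_per_mode(sample)
print(result)  # {1: 2, 0: 2}
```

For the excerpt above this gives two threads per direction. A run that never prints "spawned a thread" would presumably not be using the multithreaded cipher.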

@adorsey-NOAA: The GSISSH w/HPN packages for CentOS 7 can be installed using the instructions on https://software.opensuse.org//download.html?project=home%3Afrank_scheiner%3Agct-epel&package=gsi-openssh. Just select "CentOS " and then click on "Add repository and install manually" to reveal the instructions. Maybe you could give these a try and test if your usual use cases work with that version. Would you please report back your results?

dsimmel commented 4 years ago

I've forwarded your post to Chris Rapier here at PSC to call his attention to it. - Derek

On Aug 14, 2020, at 2:11 PM, fscheiner notifications@github.com wrote:

[...]

Derek Simmel Pittsburgh Supercomputing Center dsimmel@psc.edu +1 (412) 268-1035

asdorsey commented 4 years ago

@fscheiner thanks for the link to the packages. I'll get those onto our test DTN and do some testing this week.

asdorsey commented 4 years ago

@fscheiner Unfortunately, it seems that the gsi-openssh-server from these packages segfaults during data transfer.

Aug 19 15:10:13 jdtn1 gsisshd[165870]: error: Could not load host key: /etc/gsissh/ssh_host_dsa_key
Aug 19 15:10:14 jdtn1 systemd: Created slice User Slice of Adam.Dorsey.
Aug 19 15:10:14 jdtn1 systemd-logind: New session 143581 of user Adam.Dorsey.
Aug 19 15:10:14 jdtn1 systemd: Started Session 143581 of user Adam.Dorsey.
Aug 19 15:10:14 jdtn1 kernel: gsisshd[165886]: segfault at 7ff321ee89d0 ip 00007ff32a714b41 sp 00007ffd8d8aa7e0 error 4 in libpthread-2.17.so[7ff32a708000+17000]
Aug 19 15:10:14 jdtn1 systemd-logind: Removed session 143581.
Aug 19 15:10:14 jdtn1 systemd: Removed slice User Slice of Adam.Dorsey.

I can try to get a memory dump if that will help.

fscheiner commented 4 years ago

@fscheiner Unfortunately, it seems that the gsi-openssh-server from these packages segfaults during data transfer.

Strange, I didn't notice that during testing locally. What was the exact client command you used when testing?

fscheiner commented 4 years ago

@adorsey-NOAA UPDATE: Ok, I see it segfault, too. Presumably it is the child that crashes, as the parent continues to exist. But it still only happens when using the multithreaded AES-CTR cipher. As mentioned, the 7.4p1 client works with a 7.9p1 server on another OS (see https://github.com/gridcf/gct/issues/108#issuecomment-674199057), so the problem seems to lie in the server (implementation). I'll try again to examine the situation with gdb.

asdorsey commented 4 years ago

@fscheiner I was using gsissh for testing from one of our remote sites in Boulder, CO to Fairmont, WV where the test DTN is located. The test file, bigfile, is a ~2.2GB file created from /dev/zero.

[Adam.Dorsey@fe2 Adam.Dorsey]$ pwd
/lfs4/SYSADMIN/jetmgmt/Adam.Dorsey
[Adam.Dorsey@fe2 Adam.Dorsey]$ ls -hl biggerfile
-rw-r--r-- 1 Adam.Dorsey jetmgmt 2.2G Nov 21  2019 biggerfile
[Adam.Dorsey@fe2 Adam.Dorsey]$ gsiscp bigfile Adam.Dorsey@jdtn2.fairmont.rdhpcs.noaa.gov:/tds_scratch1/SYSADMIN/nesccmgmt/Adam.Dorsey/bigfile.jet
lost connection
[Adam.Dorsey@fe2 Adam.Dorsey]$

The following message is displayed in /var/log/messages on the target host:

Aug 20 15:34:36 jdtn2 gsisshd[4241]: rexec line 47: Deprecated option RSAAuthentication
Aug 20 15:34:36 jdtn2 gsisshd[4241]: rexec line 56: Deprecated option RhostsRSAAuthentication
Aug 20 15:34:36 jdtn2 gsisshd[4241]: rexec line 149: Unsupported option DisableUsageStats
Aug 20 15:34:36 jdtn2 gsisshd[4241]: error: Could not load host key: /etc/gsissh/ssh_host_dsa_key
Aug 20 15:34:37 jdtn2 systemd: Created slice User Slice of Adam.Dorsey.
Aug 20 15:34:37 jdtn2 systemd-logind: New session 143674 of user Adam.Dorsey.
Aug 20 15:34:37 jdtn2 systemd: Started Session 143674 of user Adam.Dorsey.
Aug 20 15:34:37 jdtn2 kernel: gsisshd[4248]: segfault at 7f65596f79d0 ip 00007f6561f23b41 sp 00007fff87c14be0 error 4 in libpthread-2.17.so[7f6561f17000+17000]
Aug 20 15:34:37 jdtn2 systemd-logind: Removed session 143674.
Aug 20 15:34:37 jdtn2 systemd: Removed slice User Slice of Adam.Dorsey.

And in /var/log/secure:

Aug 20 15:34:36 jdtn2 gsisshd[4241]: SSH: Server;Ltype: Version;Remote: 140.208.160.2-53644;Protocol: 2.0;Client: OpenSSH_7.5p1b-GSI NMOD_3.19-hpn14v13 GSI
Aug 20 15:34:36 jdtn2 gsisshd[4241]: SSH: Server;Ltype: Kex;Remote: 140.208.160.2-53644;Enc: aes128-ctr;MAC: umac-64@openssh.com;Comp: none [preauth]
Aug 20 15:34:37 jdtn2 gsisshd[4241]: SSH: Server;Ltype: Authname;Remote: 140.208.160.2-53644;Name: Adam.Dorsey [preauth]
Aug 20 15:34:37 jdtn2 gsisshd[4241]: reprocess config line 47: Deprecated option RSAAuthentication
Aug 20 15:34:37 jdtn2 gsisshd[4241]: reprocess config line 56: Deprecated option RhostsRSAAuthentication
Aug 20 15:34:37 jdtn2 gsisshd[4241]: GSI user <REDACTED> is authorized as target user Adam.Dorsey
Aug 20 15:34:37 jdtn2 gsisshd[4241]: Accepted gssapi-keyex for Adam.Dorsey from 140.208.160.2 port 53644 ssh2
Aug 20 15:34:37 jdtn2 gsisshd[4241]: pam_unix(gsisshd:session): session opened for user Adam.Dorsey by (uid=0)
Aug 20 15:34:37 jdtn2 gsisshd[4243]: SSH: Server;Ltype: Kex;Remote: 140.208.160.2-53644;Enc: aes128-ctr;MAC: umac-64@openssh.com;Comp: none
Aug 20 15:34:37 jdtn2 gsisshd[4243]: Received disconnect from 140.208.160.2 port 53644:11: disconnected by user
Aug 20 15:34:37 jdtn2 gsisshd[4243]: Disconnected from 140.208.160.2 port 53644
Aug 20 15:34:37 jdtn2 gsisshd[4241]: pam_unix(gsisshd:session): session closed for user Adam.Dorsey

I just saw your update, I'm glad that you were able to replicate the issue. Please let me know if you want me to collect a crash dump as well.
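A small sketch for comparing the two kernel segfault lines reported so far (jdtn1 and jdtn2): subtracting the library's load address from the instruction pointer gives the offset of the faulting instruction inside libpthread, which does not depend on where the library happened to be mapped. The regular expression and the helper name `fault_offset` are illustrative only; the log lines are taken from the reports in this thread.

```python
import re

# Pattern for the kernel's "segfault at ... ip ... in lib[base+size]" message.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(kernel_line):
    """Return (object name, offset of the faulting instruction within it)."""
    m = SEGFAULT_RE.search(kernel_line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    return m["obj"], int(m["ip"], 16) - int(m["base"], 16)


lines = [
    "gsisshd[165886]: segfault at 7ff321ee89d0 ip 00007ff32a714b41 "
    "sp 00007ffd8d8aa7e0 error 4 in libpthread-2.17.so[7ff32a708000+17000]",
    "gsisshd[4248]: segfault at 7f65596f79d0 ip 00007f6561f23b41 "
    "sp 00007fff87c14be0 error 4 in libpthread-2.17.so[7f6561f17000+17000]",
]

offsets = [fault_offset(line) for line in lines]
for obj, off in offsets:
    print(obj, hex(off))
```

Both lines resolve to offset 0xcb41 in libpthread-2.17.so, which suggests the same crash site on both hosts. With matching debuginfo installed, such an offset can usually be mapped to a function, for example with addr2line.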

fscheiner commented 4 years ago

@adorsey-NOAA

I just saw your update, I'm glad that you were able to replicate the issue. Please let me know if you want me to collect a crash dump as well.

Maybe we wait for a comment by @rapier1. He could tell us exactly what is needed for debugging.

fscheiner commented 4 years ago

Ok, I got a backtrace. The segfault doesn't happen when running under gdb, so I examined a core dump:

(gdb) bt
#0  pthread_cancel (th=140019418441472) at pthread_cancel.c:33
#1  0x000055a603d234a0 in stop_and_join_pregen_threads (c=c@entry=0x7f58db177010) at cipher-ctr-mt.c:234
#2  0x000055a603d2351e in ssh_aes_ctr_cleanup (ctx=0x55a60476b650) at cipher-ctr-mt.c:572
#3  0x00007f58d9fea597 in EVP_CIPHER_CTX_cleanup (c=c@entry=0x55a60476b650) at evp_enc.c:621
#4  0x00007f58d9fea69e in EVP_CIPHER_CTX_free (ctx=0x55a60476b650) at evp_enc.c:613
#5  0x000055a603d22ee9 in cipher_free (cc=0x55a60476d2c0) at cipher.c:561
#6  0x000055a603d2acfe in packet_destroy_state (state=0x55a60476f330) at packet.c:2507
#7  packet_destroy_all (audit_it=audit_it@entry=0, privsep=privsep@entry=1) at packet.c:2536
#8  0x000055a603cf1e12 in child_destory_sensitive_data () at session.c:1570
#9  0x000055a603cf372e in do_exec_no_pty (s=0x55a60476daf0, command=0x55a60476e380 "scp -f 100M") at session.c:430
#10 0x000055a603cf48dd in do_exec (s=s@entry=0x55a60476daf0, command=<optimized out>, command@entry=0x55a60476e380 "scp -f 100M") at session.c:734
#11 0x000055a603cf4dd4 in session_exec_req (s=0x55a60476daf0) at session.c:2127
#12 session_input_channel_req (c=c@entry=0x55a6047ee8e0, rtype=<optimized out>, rtype@entry=0x55a60476b7a0 "exec") at session.c:2217
#13 0x000055a603ceb7bb in server_input_channel_req (type=<optimized out>, seq=<optimized out>, ctxt=<optimized out>) at serverloop.c:855
#14 0x000055a603d31f09 in ssh_dispatch_run (ssh=ssh@entry=0x55a60476ead0, mode=mode@entry=1, done=done@entry=0x0, ctxt=0x55a60476ead0) at dispatch.c:119
#15 0x000055a603d31f59 in ssh_dispatch_run_fatal (ssh=0x55a60476ead0, mode=mode@entry=1, done=done@entry=0x0, ctxt=<optimized out>) at dispatch.c:140
#16 0x000055a603cecc12 in process_buffered_input_packets () at serverloop.c:345
#17 server_loop2 (authctxt=authctxt@entry=0x55a60476e8c0) at serverloop.c:401
#18 0x000055a603cf40e2 in do_authenticated2 (authctxt=0x55a60476e8c0) at session.c:2670
#19 do_authenticated (authctxt=authctxt@entry=0x55a60476e8c0) at session.c:274
#20 0x000055a603ce05d9 in main (ac=<optimized out>, av=<optimized out>) at sshd.c:2293

The client command used was:

gsiscp -c aes256-ctr -P 2222 gridftp-5.[...]:100M /dev/null

No data was transferred.

@rapier1: HTH

fscheiner commented 4 years ago

@adorsey-NOAA Should we also go ahead and create packages with HPN features for CentOS 8's GSISSH version?

asdorsey commented 4 years ago

@fscheiner That sounds like a good idea. I don't have any CentOS 8 systems currently, so I can't help you test there, but having those available would be a nice thing to have.

fscheiner commented 4 years ago

@fscheiner That sounds like a good idea. I don't have any CentOS 8 systems currently, so I can't help you test there, but having those available would be a nice thing to have.

OK, GSI-OpenSSH w/HPN (based on gsi-openssh-8.0p1-3) packages should soon be available from https://software.opensuse.org//download.html?project=home%3Afrank_scheiner%3Agct-epel-8&package=gsi-openssh. Use gsi-openssh-[{clients-|server-}]8.0p1-4.1 when installing.

I successfully tested client (gsiscp) and server (gsisshd) from these packages on a CentOS 8 VM, using the MT AES cipher against/with a GSI-OpenSSH w/HPN 7.9p1 server/client on openSUSE Leap 15.1.

asdorsey commented 4 years ago

@fscheiner Thanks, if I can get some test nodes/VMs with CentOS 8 spun up I'll run some tests in between sites.

@rapier1 Any ideas or progress on the segfault issues with the CentOS 7 packages?

fscheiner commented 3 years ago

@rapier1 fixed https://github.com/rapier1/openssh-portable/issues/23 and GSI-OpenSSH packages w/HPN for EPEL 8 (gsi-openssh-8.0p1-5.1) and Fedora 32 (gsi-openssh-8.3p1-5.1) were already updated. More to come in the future.

asdorsey commented 3 years ago

@rapier1 anything on the CentOS 7 server issues @fscheiner and I previously reported?

rapier1 commented 3 years ago

I'm getting back to those now. Sorry for the delay, but these past few months have been all hands on deck to prep for standing up Bridges2. I'm doing a lot of the prep work for the data movement to the new file system, and that basically killed 3 months of my work life. :)

Anyway, I'm reviewing old hpn-ssh things now. I hope to get to this issue by this afternoon.

Chris

On 9/28/20 11:08 AM, adorsey-NOAA wrote:

@rapier1 anything on the CentOS 7 server issues @fscheiner and I previously reported?


asdorsey commented 3 years ago

@rapier1 Thanks for the update. I completely understand the priority project scramble; been there, done that. Thanks for reviewing the issues we're seeing, and let me know if I can provide any additional information.

asdorsey commented 3 years ago

@rapier1 another issue to report, though this may be related to the existing issues I've already mentioned.

I've been running down a performance problem on our DTNs; specifically, when users try to transfer files using rsync, performance is awful (~20MB/s). With scp, we can get 250MB/s from the same remote site. This is accompanied by 100% CPU usage of the gsisshd child process for that rsync process on the DTN.

I moved our DTNs to the gsi-openssh-server packages provided by XSEDE (https://software.xsede.org/production/gsi-openssh-server/7.5p1b-1/XSEDE-GSI-OpenSSH-install.html) and that fixed the performance issue for us, so it appears to be related to the gsi-openssh-7.6p1 packages we had been previously running.

Sorry, I know I don't have any more information other than "it's slow". I still have the test DTNs, and I can use those to test any updated packages for CentOS 7 that you or the GCT team can provide.