visit-dav / visit

VisIt - Visualization and Data Analysis for Mesh-based Scientific Data
https://visit.llnl.gov
BSD 3-Clause "New" or "Revised" License

client/server connections to RZ systems have broken #1403

Open aowen87 opened 5 years ago

aowen87 commented 5 years ago

Reported by Jay Javedani. Everything was working fine until sometime earlier today, March 26, 2013. Even after installing VisIt 2.6.2, the behavior is the same. Jay is running VisIt on his Windows machine and attempting to connect client/server to RZ systems. After going to File->Open and selecting RZ Merl, VisIt just hangs. The password prompt never appears. I moved windows around to see if the prompt dialog was popping up under other windows, and it wasn't. I checked his host profiles and they did have one issue: the username specified for Jay on RZ systems was set to 'javedani1', and there is no '1' at the end of his username there. However, changing it, saving settings, exiting, and restarting did not fix the problem. I have attached debug logs.

-----------------------REDMINE MIGRATION-----------------------
This ticket was migrated from Redmine. As such, not all information was able to be captured in the transition. Below is a complete record of the original Redmine ticket.

Ticket number: 1395
Status: Pending
Project: VisIt
Tracker: Bug
Priority: Normal
Subject: client/server connections to RZ systems have broken
Assigned to: -
Category: -
Target version: -
Author: Mark Miller
Start: 03/26/2013
Due date:
% Done: 0%
Estimated time:
Created: 03/26/2013 07:13 pm
Updated: 04/15/2013 01:47 pm
Likelihood: 3 - Occasional
Severity: 2 - Minor Irritation
Found in version: 2.6.1
Impact:
Expected Use:
OS: All
Support Group: Any

Description: Reported by Jay Javedani. Everything was working fine until sometime earlier today, March 26, 2013. Even after installing VisIt 2.6.2, the behavior is the same. Jay is running VisIt on his Windows machine and attempting to connect client/server to RZ systems. After going to File->Open and selecting RZ Merl, VisIt just hangs. The password prompt never appears. I moved windows around to see if the prompt dialog was popping up under other windows, and it wasn't. I checked his host profiles and they did have one issue: the username specified for Jay on RZ systems was set to 'javedani1', and there is no '1' at the end of his username there. However, changing it, saving settings, exiting, and restarting did not fix the problem. I have attached debug logs.

Comments:

It looks like VisIt is running out of ports. It gets to 5624 and then stops searching. Rebooting and then running VisIt first thing after the machine comes up works. I also asked him to install Cygwin so he can use Cygwin's X server as a backup against this port number issue.

The issue is that something (probably VisIt) has snapped up all of the ports from 5600-5623 and is holding on to them. When VisIt runs the next time, it uses port 5624, since that's the next available port. The port forwarding code still starts numbering local ports at 5600 and sets up forwarding for ports 5600-5610. Of course, we're not using ports in that range, so the port forwarding is defeated and the remote component cannot connect back through the tunnel.

One fix is to change the port forwarding code in RemoteProcess::CreatePortNumbers so it starts the local ports at listenPortNum - 2. This is because the gui wants to connect at 5624. The viewer is launching the mdserver and will probably connect on port 5625. In order to do ssh tunneling, the viewer also runs VCL to establish the tunnel, and it will connect back on 5626, which by the time we're generating port numbers is the value of listenPortNum. Since we need the gui and mdserver ports tunneled as well, we need listenPortNum - 2 as the starting port. Of course, this only works if the ports are sequential.

It's probably better to determine which port the gui will want to use and put it into an intVector that gets passed along to all of the launches that occur. By the time we're launching VCL, we have a list of ports that must be forwarded and we can do the right thing.

Ed Kokko is also running into issues connecting. His viewer debug 5 log is attached too. However, he gets further than Jay. He gets to port 5604, which is pretty common I think, but fails for other reasons after authenticating at the rzgw gateway. The mdserver never launches.

Ed Kokko's problems were due to not having passwordless ssh working from rzgw to RZ systems. We have fixed that for him and it is now working.
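
For anyone following the port analysis above, here is a rough Python model of the mismatch and of the two proposed fixes. It is only a sketch: none of these function names exist in VisIt, the forwarded-port count of 11 is inferred from the 5600-5610 range mentioned above, and the real logic lives in C++ in RemoteProcess::CreatePortNumbers, which this does not reproduce.

```python
# Illustrative model only. All names here are hypothetical, not VisIt's API.

def first_free_port(in_use, start=5600):
    """Model of the listen-port search: first port >= start that is not held."""
    port = start
    while port in in_use:
        port += 1
    return port

def forwarded_fixed_base(count=11, base=5600):
    """Behaviour described in the ticket: always forward base..base+count-1."""
    return list(range(base, base + count))

def forwarded_from_listen(listen_port_num, count=11):
    """First proposed fix: start at listen_port_num - 2.
    Only correct if gui, mdserver and VCL got sequential ports."""
    start = listen_port_num - 2
    return list(range(start, start + count))

def forwarded_explicit(required_ports):
    """Second proposed fix: forward exactly the ports collected at launch time."""
    return sorted(set(required_ports))

# Scenario from the ticket: ports 5600-5623 are held by something.
held = set(range(5600, 5624))

gui_port = first_free_port(held)
held.add(gui_port)
mdserver_port = first_free_port(held)
held.add(mdserver_port)
vcl_port = first_free_port(held)  # plays the role of listenPortNum
held.add(vcl_port)

needed = [gui_port, mdserver_port, vcl_port]

# Ports the components need that the fixed-base forwarding does not cover.
missing = [p for p in needed if p not in forwarded_fixed_base()]
print("needed:", needed)
print("not forwarded with fixed base:", missing)
print("fix 1 covers all:", all(p in forwarded_from_listen(vcl_port) for p in needed))
print("fix 2 forwards:", forwarded_explicit(needed))
```

In this model, fix 1 stops covering the needed ports as soon as another process grabs a port between the gui and VCL launches, which is the "only works if the ports are sequential" caveat. Fix 2 does not depend on ordering, since it forwards whatever ports were actually recorded.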

cyrush commented 5 years ago

1403_javedani_viewer.exe.12060.5.vlog.zip 1403_viewer.exe.kokko.5.vlog.zip