USGS-CMG / data-life-cycle-cloud-docker-jupyterhub

https://jupyterhub.wma.usgs.chs.ead/

Issue setting up Globus Connect Personal #18

Open rsignell-usgs opened 7 years ago

rsignell-usgs commented 7 years ago

Setting up Globus Connect Personal takes 2 steps:

  1. Run globusconnectpersonal with the -setup argument and a setup key generated on globusonline.org.
  2. Run globusconnectpersonal with the -start argument.

I had trouble with step 1:

(globus) jovyan@52c8c98b3cd3:~/work/globusconnectpersonal-2.3.3$ ./globusconnectpersonal -setup b24fafee-98d9-48a7-9b34-85bf1db650dc
Traceback (most recent call last):
  File "/home/jovyan/work/globusconnectpersonal-2.3.3/gc-ctrl.py", line 61, in <module>
    os.environ['GCP_USER'] = os.environ['USER']
  File "/usr/lib/python2.7/UserDict.py", line 40, in __getitem__
    raise KeyError(key)
KeyError: 'USER'
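
The traceback shows gc-ctrl.py reading os.environ['USER'] directly, and that variable is not set in this container. As a rough illustration only (resolve_user is a hypothetical helper, not part of Globus Connect Personal, and it is written for Python 3), a more defensive lookup could look something like this:

```python
import getpass
import os


def resolve_user(environ=None):
    """Return a user name even when USER is unset (hypothetical helper)."""
    # Allow a custom mapping to be passed in; default to the real environment.
    env = os.environ if environ is None else environ
    user = env.get("USER") or env.get("LOGNAME")
    if user:
        return user
    # Fall back to getpass.getuser(), which consults the real process
    # environment (LOGNAME, USER, LNAME, USERNAME) and then the password database.
    return getpass.getuser()
```

This does not change gc-ctrl.py itself; exporting USER by hand, as in the next step, is the workaround actually used here.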

So then I specified USER:

(globus) jovyan@52c8c98b3cd3:~/work/globusconnectpersonal-2.3.3$ export USER=rsignell

and tried again. This time I got:

(globus) jovyan@52c8c98b3cd3:~/work/globusconnectpersonal-2.3.3$ ./globusconnectpersonal -setup b24fafee-98d9-48a7-9b34-85bf1db650dc
Configuration directory: /home/jovyan/.globusonline/lta
Contacting relay.globusonline.org:2223
Error: The server returned an error
---
 ERROR: fd 3 was active
ERROR: fd 6 was active
ERROR: fd 8 was active
[ssh stderr] ssh: connect to host relay.globusonline.org port 2223: Connection timed out

@isuftin any ideas?

isuftin commented 7 years ago

@rsignell-usgs This is an issue with the AWS environment we find ourselves in. I am unable to connect to port 2223 of the machine you've described from within AWS.

I am, however, able to do this from the USGS WAN. This means that the AWS environment is actively blocking outgoing TCP connections to that port.

I will have to create a change request to our provider to open that port.

Do you know if there is an alternative method of performing this install?

From the USGS WAN:

> telnet relay.globusonline.org 2223
Trying 184.73.255.160...
Connected to relay.globusonline.org.
Escape character is '^]'.
SSH-2.0-OpenSSH_6.4p1-hpn14v1 GSI_GSSAPI_GPT_5.7 GSI
^]
telnet>

From an EC2 instance that the Docker service is running from:

$ telnet relay.globusonline.org 2223
Trying 184.73.255.160...
telnet: connect to address 184.73.255.160: Connection timed out
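
The same check can be scripted where telnet is not installed. This is only a sketch, assuming Python 3 is available on the host; can_connect is a made-up name, and the 5 second timeout is an arbitrary choice:

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Timeouts, refused connections, and DNS failures are all OSError subclasses.
        return False
```

Calling can_connect("relay.globusonline.org", 2223) from the USGS WAN and again from the EC2 instance should mirror the two telnet results above, as long as the port stays blocked on the AWS side.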

isuftin commented 7 years ago

Note to self: https://docs.globus.org/how-to/globus-connect-personal-linux/#globus-connect-personal-cli

rsignell-usgs commented 7 years ago

I think it needs this connection for more than just the install.

isuftin commented 7 years ago

I will create the request for implementation to our AWS provider to see if they can get this done.

isuftin commented 7 years ago

I've submitted the request for implementation. The voting process for this request should take place on 10/31. The earliest implementation date would be 10/31.

isuftin commented 7 years ago

@rsignell-usgs During the RFC process, a question came up as to whether there are alternative ways to install not only Globus Connect but also the software that Globus will eventually be installing or pulling down. The question centers on the security issues of using a file sharing tool on federal infrastructure.

rsignell-usgs commented 7 years ago

@jtfalgout, sound familiar? Is it time to drag this issue out again? Last time we brought this up, didn't Tim Quinn say he would help us get a Globus Connect Server running at USGS?

@isuftin, we are currently using Globus to move data on and off of Yeti, which is a USGS machine. Does that give us any ammo here?

isuftin commented 7 years ago

@rsignell-usgs I think the pushback will be that while you are using it to move data from an internal USGS machine, it can also be used to move data to and from external resources.

Is this a one-time move or a continuous transfer as Yeti produces data?

rsignell-usgs commented 7 years ago

@isuftin, yes, we want to move data (model output) from USGS and non-USGS external resources (HPC on Yeti, XSEDE, local computing cluster at university) onto USGS resources (CHS) for analysis/visualization in an automated fashion.

Globus is supported and endorsed for secure large file transfer by a number of major organizations, including DOE, Argonne National Lab, and NIH.

@jfalgout, what would be the first steps to get USGS added to that list?

isuftin commented 7 years ago

@rsignell-usgs I've updated our provider with your comments and have cc'd you on the chain.

rsignell-usgs commented 7 years ago

@isuftin, saw that. Thanks. Fingers crossed.

isuftin commented 7 years ago

@rsignell-usgs The RFC resulted in our provider kicking the decision up to management, as the request expanded into a larger question among the group about opening ports for services in general.

Will send you an email with convo bits.