IQSS / dataverse

Open source research data repository software
http://dataverse.org

Single Sign On (SSO) from Dataverse to OpenStack #3765

Closed pdurbin closed 7 years ago

pdurbin commented 7 years ago

In #3747 we are adding a "Compute" button to Dataverse that will take the user from an installation of Dataverse to an installation of OpenStack from which they can run a compute job.

Running a compute job requires authentication. That is to say, you must log into OpenStack to do any computation.

Because running computation requires authentication, on the Dataverse side we will only show the "Compute" button if the user has been authenticated.

In terms of authentication, we would love to have as seamless a user experience as possible when clicking the "Compute" button. The best model we have for this comes from Shibboleth and SAML. With an installation of Dataverse set up for Shibboleth, you can first log into Dataverse using Shibboleth and then go to another site that supports Shibboleth, click "Log In" and get automatically logged into the second site without having to authenticate again.

For example, step 1 is to select "Boston University" from the "Your Institutional" (Shibboleth) dropdown in Dataverse

screen shot 2017-04-11 at 4 13 43 pm

Step 2 is to log in with your "Boston University" credentials:

screen shot 2017-04-11 at 4 14 12 pm

Step 3 is to observe that you are logged in to Dataverse:

screen shot 2017-04-11 at 4 16 22 pm

Step 4 is to log into a different website that supports Shibboleth/SAML such as https://www.hathitrust.org and find "Boston University" in their dropdown:

screen shot 2017-04-11 at 4 16 44 pm

Step 5 is to confirm that you are logged in to the second website (Hathi Trust in this example) without having to enter your Boston University credentials again:

screen shot 2017-04-11 at 4 19 47 pm

The example above involving https://dataverse.harvard.edu and https://www.hathitrust.org works fine for both Boston University and UNC but not for Harvard. Harvard users are required to reauthenticate to HarvardKey when they select Harvard from the dropdown on https://www.hathitrust.org . This is suboptimal but outside the control of anyone but the people who run HarvardKey, we believe. I'm opening this issue to investigate, to see who I can talk to at HarvardKey. The hope is that if we get Single Sign On working for Hathi Trust then it will work for OpenStack as well.

See also the "2017-04-05 Auth for MOC and Dataverse meeting" notes at https://docs.google.com/document/d/1JrK0pSA0eAxOkYYLC68AI_uu-DxJ_R0iEnekZKUMbig/edit?usp=sharing where we formulated the strategy above to use SAML and Shibboleth.

Note that the OpenStack installation by MOC does not currently support Shibboleth/SAML but it's one of the tasks that @knikolla is working on. Currently, that OpenStack installation login interface looks like this:

screen shot 2017-04-11 at 4 28 41 pm

pdurbin commented 7 years ago

"I'm opening this issue to investigate, to see who I can talk to at HarvardKey."

I just opened INC02185529 with Harvard.

pdurbin commented 7 years ago

Oh, I'm hoping the deliverable to the Dataverse code base will be an updated explanation of working with your Identity Provider, probably somewhere around http://guides.dataverse.org/en/4.6.1/installation/shibboleth.html#specific-identity-provider-s

pdurbin commented 7 years ago

I asked @djbrooke to help me follow up on INC02185529. If I don't hear anything, I'll ask at http://shibboleth.net/mailman/listinfo/users

pdurbin commented 7 years ago

Oh, at http://irclog.iq.harvard.edu/dataverse/2017-04-05#i_51269 @donsizemore had suggested looking for "ForceAuthn" in the SAML requests. I tried this yesterday using SAML Tracer ( https://addons.mozilla.org/en-US/firefox/addon/saml-tracer/ ) but didn't see anything.
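For anyone who wants to double-check a captured request outside the browser, here is a minimal sketch of that ForceAuthn check. It assumes the request was sent with the SAML HTTP-Redirect binding, where the `SAMLRequest` query parameter is raw-DEFLATE compressed, then base64 encoded, then URL encoded. The function names and the example IdP URL are made up for illustration; requests sent with the HTTP-POST binding are only base64 encoded and would need the decompression step skipped.

```python
import base64
import zlib
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, urlencode, urlparse

SAMLP_NS = "urn:oasis:names:tc:SAML:2.0:protocol"


def decode_redirect_saml_request(url: str) -> ET.Element:
    """Decode the SAMLRequest parameter of an HTTP-Redirect binding URL."""
    query = parse_qs(urlparse(url).query)      # parse_qs undoes the URL encoding
    encoded = query["SAMLRequest"][0]
    compressed = base64.b64decode(encoded)
    xml_bytes = zlib.decompress(compressed, -15)  # -15 = raw DEFLATE, no header
    return ET.fromstring(xml_bytes)


def forces_reauthentication(authn_request: ET.Element) -> bool:
    """Return True if the AuthnRequest sets ForceAuthn to "true" or "1"."""
    if authn_request.tag != f"{{{SAMLP_NS}}}AuthnRequest":
        raise ValueError("not a SAML 2.0 AuthnRequest")
    return authn_request.get("ForceAuthn", "false") in ("true", "1")


def encode_redirect_saml_request(xml: str, idp_sso_url: str) -> str:
    """Hypothetical helper that builds a redirect URL, used only for the demo."""
    compressor = zlib.compressobj(wbits=-15)
    compressed = compressor.compress(xml.encode("utf-8")) + compressor.flush()
    encoded = base64.b64encode(compressed).decode("ascii")
    return idp_sso_url + "?" + urlencode({"SAMLRequest": encoded})


if __name__ == "__main__":
    # Made-up request and IdP endpoint, not taken from any real capture.
    sample = (
        '<samlp:AuthnRequest xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" '
        'ID="_example" Version="2.0" IssueInstant="2017-04-19T00:00:00Z" '
        'ForceAuthn="true"/>'
    )
    url = encode_redirect_saml_request(sample, "https://idp.example.edu/sso")
    print(forces_reauthentication(decode_redirect_saml_request(url)))
```

If the attribute is absent, as it appeared to be in the SAML Tracer output, the function returns False because ForceAuthn defaults to false.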

pdurbin commented 7 years ago

I'm looking again at what @aivanov100 wrote at https://groups.google.com/d/msg/dataverse-community/Fc0wC4fLyeI/3crtMxIzDAAJ after he made pull request #3762 and it's eerily similar to this issue (emphasis mine):

"The problem is that after the user is authenticated for either Drupal or Dataverse, and then navigates to the other site, the other site does not recognize the user as being logged-in and the user needs to click on Login again to be authenticated for the second site. For our SSO to work seamlessly, I would like to configure it so that the second site can recognize that the user has already been authenticated, and log the user in behind-the-scenes without any interaction from the user."

The puzzling thing is that Single Sign On (SSO) is working for UNC and BU users when they log into https://dataverse.harvard.edu and then visit https://www.hathitrust.org but not for Harvard users. Since the only thing that's different is the Identity Provider (IdP) I assume that this is entirely out of our control since we don't run the IdP. Maybe it is out of our control. I'm not sure. Maybe @aivanov100 has some thoughts for us. 😄

sudoflyy commented 7 years ago

@pdurbin I wish I could be of greater help. My experience lies mainly in configuring our own IdP and SP for two applications, and I do not have much experience with troubleshooting the authentication process for external SPs.

I think that you were onto something when checking for the "forceAuthn" attribute, because the behavior of https://www.hathitrust.org would suggest that their SP is configured to ask for forced reauthentication from your IdP (bypassing SSO). https://wiki.shibboleth.net/confluence/display/SHIB2/NativeSPSessionInitiator
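To illustrate why a forced reauthentication request would bypass SSO, here is a toy model of the IdP's decision. It is only a sketch under simplifying assumptions: the `ToyIdP` class, its fields, and the browser identifiers are invented for this example and do not reflect how Shibboleth is implemented. A real IdP can also prompt again for reasons of its own configuration, which this model does not cover.

```python
from dataclasses import dataclass, field


@dataclass
class ToyIdP:
    """Toy identity provider that remembers which browsers have a session."""

    sessions: dict = field(default_factory=dict)  # browser id -> username
    prompts_shown: int = 0                        # how often a login page appeared

    def handle_authn_request(
        self, browser: str, username: str, force_authn: bool = False
    ) -> str:
        """Return the authenticated username, prompting only when needed."""
        if force_authn or browser not in self.sessions:
            # No reusable session (or the SP demanded a fresh login):
            # the user sees the login page and types credentials.
            self.prompts_shown += 1
            self.sessions[browser] = username
        # Otherwise the existing IdP session is reused silently (SSO).
        return self.sessions[browser]


if __name__ == "__main__":
    idp = ToyIdP()
    idp.handle_authn_request("browser-1", "alice")  # first SP: login page shown
    idp.handle_authn_request("browser-1", "alice")  # second SP: session reused
    idp.handle_authn_request("browser-1", "alice", force_authn=True)  # forced
    print(idp.prompts_shown)
```

In this model the second SP gets the user without a prompt unless it sets the forced flag, which matches the behavior seen for UNC and BU.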

I would double-check with the tech team at Hathi Trust that their SP is configured exactly the same way for your Harvard IdP as it is for UNC and BU. If they confirm that it is, you should examine your IdP configuration. I cannot give you much insight as to what configuration setting for your IdP could be causing this behavior, but I think that it would be a good question to address to the Shibboleth Users mailing list: https://shibboleth.net/mailman/listinfo/users

My Best, Alex

pdurbin commented 7 years ago

@aivanov100 thanks. Again, I didn't see any trace of ForceAuthn. Yeah, I might try the shib users mailing list soon. I see that your pull request #3762 was merged! Great!

@djbrooke and others with access, yesterday I created a doc called "2017-04-?? HUIT IAM HarvardKey MOC" at https://docs.google.com/a/harvard.edu/document/d/15CynAiTKpPUHOBMDTjhPL3t-jz-qrA2XIsZpBWkuMzE/edit?usp=sharing where I've laid out my talking points for this issue as well as #3159 and #3749.

pdurbin commented 7 years ago

I just gave @djbrooke a brain dump after talking to HUIT about INC02185529 today. "Something wrong on our end," they said. They'll be looking into it.

pdurbin commented 7 years ago

That was quick! HUIT just let me know they fixed it and sure enough, this time I get right in without having to authenticate to HarvardKey twice. Here are some screenshots:

screen shot 2017-04-20 at 4 35 17 pm

screen shot 2017-04-20 at 4 35 31 pm

screen shot 2017-04-20 at 4 35 38 pm

screen shot 2017-04-20 at 4 35 59 pm

screen shot 2017-04-20 at 4 36 08 pm

screen shot 2017-04-20 at 4 36 13 pm

screen shot 2017-04-20 at 4 36 21 pm

Thank you, Mahbub!!

@knikolla what do you think? Is it safe to close this issue or do you want to wait until more of the Shibboleth/SAML setup is done on your end?

pdurbin commented 7 years ago

I just spoke to @djbrooke and he said to go ahead and move this issue to QA, which I'll do in https://waffle.io/IQSS/dataverse

pdurbin commented 7 years ago

To QA:

What we're doing is validating the approach of using Shibboleth/SAML for Single Sign On (SSO). Hathi Trust was just the first example of a non-Harvard site that I can log into with HarvardKey.

kcondon commented 7 years ago

It works, though I could not see much of my identity on the hathitrust site.