JiscSD / archivematica

Free and open-source digital preservation system designed to maintain standards-based, long-term access to collections of digital objects.
http://www.archivematica.org
GNU Affero General Public License v3.0

Shibboleth integration #5

Closed helenst closed 7 years ago

helenst commented 7 years ago

This is a WIP: it still needs to be rebased onto the latest changes in qa/jisc and tested against the rebased version of https://github.com/JiscRDSS/rdss-archivematica/pull/6.

Archivematica needs to know little about Shibboleth itself. It relies on Nginx to perform authentication and to take care of the back and forth with the institution's identity providers. Once Nginx has obtained an authenticated user, it passes information about that user back to Archivematica as request headers containing the user's name, email address and permissions.

To deal with those headers, we use a third-party library called django-shibboleth-remoteuser. It uses one of the headers as an identifying field and tries to match it against a username it already knows about. If it can't find that user, it creates one. It uses the rest of the headers to fill in other fields on the user, so we can autofill things like the first and last names. It sets up its own Django session to keep the user logged in.
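To make the header handling concrete, here is a minimal, library-free sketch of the match-or-create behaviour described above. It is not the code of django-shibboleth-remoteuser; the header names, the `ATTRIBUTE_MAP` shape and the in-memory `users` dict are assumptions chosen for illustration, and the real header names depend on how Nginx and Shibboleth are configured.

```python
from typing import Dict, List, Mapping, Tuple

# Hypothetical header names -> (required?, user field). Illustrative only.
ATTRIBUTE_MAP: Dict[str, Tuple[bool, str]] = {
    "HTTP_EPPN": (True, "username"),
    "HTTP_GIVENNAME": (False, "first_name"),
    "HTTP_SN": (False, "last_name"),
    "HTTP_MAIL": (False, "email"),
}


def parse_headers(headers: Mapping[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Return (user field values, names of required headers that were missing)."""
    fields: Dict[str, str] = {}
    missing: List[str] = []
    for header, (required, field) in ATTRIBUTE_MAP.items():
        value = headers.get(header, "")
        if value:
            fields[field] = value
        elif required:
            missing.append(header)
    return fields, missing


def get_or_create_user(
    users: Dict[str, Dict[str, str]], headers: Mapping[str, str]
) -> Dict[str, str]:
    """Match on username, create the user if unknown, then refresh other fields."""
    fields, missing = parse_headers(headers)
    if missing:
        raise ValueError("missing required headers: " + ", ".join(missing))
    # setdefault creates an empty record the first time a username is seen.
    user = users.setdefault(fields["username"], {})
    user.update(fields)
    return user
```

A second request for the same username reuses the existing record and only updates the fields that arrived in the headers.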

There are some implications here in terms of the way Archivematica normally works:

  1. The normal login form does not apply. Shibboleth decides whether or not the request is authenticated, and only authenticated users get as far as Archivematica - so there's no need for the login form.

  2. Letting users edit their username, name and email address, or change their password, is also not applicable. The user edit form has been replaced with a simple display of their details. (The "Regenerate API key" option is still present.)

  3. The welcome screen would normally be triggered by first access when there's no user, and asks the user for details of the initial user and their organisation. It then goes off to do useful stuff like generate the UUID for the dashboard. I took out the user creation part from this, so it will just prompt for the organisation info. It'll still only show it once, and it'll look at whether there's a UUID yet to decide whether to go into the welcome screen.
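The welcome-screen behaviour in point 3 can be sketched roughly as follows. This is an illustration only, not the code in this PR: the `dashboard_settings` dict, the key names and both function names are hypothetical stand-ins for wherever Archivematica actually stores the dashboard UUID and organisation details.

```python
import uuid
from typing import Dict

DASHBOARD_UUID_KEY = "dashboard_uuid"  # hypothetical settings key


def needs_welcome(dashboard_settings: Dict[str, str]) -> bool:
    """Show the welcome screen only while no dashboard UUID has been stored."""
    return not dashboard_settings.get(DASHBOARD_UUID_KEY)


def complete_welcome(
    dashboard_settings: Dict[str, str], org_name: str, org_identifier: str
) -> str:
    """Store the organisation info and generate the UUID; no user is created."""
    if not needs_welcome(dashboard_settings):
        # Already set up: leave everything untouched so the screen shows once.
        return dashboard_settings[DASHBOARD_UUID_KEY]
    dashboard_settings["org_name"] = org_name
    dashboard_settings["org_identifier"] = org_identifier
    dashboard_settings[DASHBOARD_UUID_KEY] = str(uuid.uuid4())
    return dashboard_settings[DASHBOARD_UUID_KEY]
```

Keying the decision on the UUID rather than on the existence of a user is what lets the screen keep working when users are created by the Shibboleth backend instead of by the welcome form.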

sevein commented 7 years ago

Looking really good, @helenst!

Are you planning to pin github.com/Brown-University-Library/django-shibboleth-remoteuser to a particular git tag, or are you relying on unreleased code?

helenst commented 7 years ago

@sevein It seems like the best I can do at present is pin to a commit ID, as it's not been tagged in a long time. I suggested to them that perhaps a PyPI release would be in order. (They have been pretty responsive in the past so I have some hope there). The alternative would be to fork it and provide our own tag.

sevein commented 7 years ago

Looking great, thank you for the extra comments. I have one more question: would it be possible to make this integration optional? Say I'm using the docker-compose.dev.yml environment, which doesn't deploy Shibboleth, or we want to include this in vanilla Archivematica. Could we put all the extra settings in a new settings module, e.g. settings.shibboleth extending settings.common? The urlpattern could be added inside an if clause, etc.
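One way the suggestion above might look, sketched with plain dictionaries so it runs without Django. In a real project this would more likely be a `settings/shibboleth.py` module starting with `from .common import *`. The `SHIBBOLETH_ENABLED` flag, the `COMMON` contents and the URL prefixes are assumptions; the `shibboleth.*` dotted paths follow the library's README as far as I recall and should be checked against the pinned commit.

```python
import copy
from typing import Any, Dict, List

# Stand-in for settings.common (heavily trimmed, illustrative values).
COMMON: Dict[str, Any] = {
    "INSTALLED_APPS": ["django.contrib.auth"],
    "MIDDLEWARE_CLASSES": [
        "django.contrib.sessions.middleware.SessionMiddleware",
        "django.contrib.auth.middleware.AuthenticationMiddleware",
    ],
    "AUTHENTICATION_BACKENDS": ["django.contrib.auth.backends.ModelBackend"],
}


def shibboleth_settings(common: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the common settings with the Shibboleth pieces added."""
    settings = copy.deepcopy(common)  # never mutate the shared base settings
    settings["SHIBBOLETH_ENABLED"] = True  # hypothetical flag name
    settings["INSTALLED_APPS"].append("shibboleth")
    settings["MIDDLEWARE_CLASSES"].append(
        "shibboleth.middleware.ShibbolethRemoteUserMiddleware"
    )
    settings["AUTHENTICATION_BACKENDS"] = [
        "shibboleth.backends.ShibbolethRemoteUserBackend"
    ]
    return settings


def url_prefixes(settings: Dict[str, Any]) -> List[str]:
    """Add the Shibboleth URL prefix only when the integration is enabled."""
    prefixes = ["administration/", "transfer/"]  # illustrative prefixes
    if settings.get("SHIBBOLETH_ENABLED", False):
        prefixes.append("shib/")
    return prefixes
```

With this shape, an environment such as docker-compose.dev.yml would keep pointing at the common settings and never load the Shibboleth middleware or URLs.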

helenst commented 7 years ago

@sevein I think this could be made conditional. Most of it is contained within the back end and middleware, which could be added to the configuration in the Shib-specific config. The rest would mostly be about whether users can be edited or not: displaying a read-only profile page vs. the full user edit form, and which version of the welcome form we show. We could have some kind of generic flag in settings for that - I think the concept of a remote, read-only user that we don't want to edit goes beyond just Shibboleth, so it could be named something more general like ALLOW_USER_EDIT.
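A rough sketch of how such a flag could drive both decisions. Only the name ALLOW_USER_EDIT comes from the comment above; the helper functions, template names and field names are hypothetical, and `SimpleNamespace` stands in for Django's settings object.

```python
from types import SimpleNamespace
from typing import List


def allow_user_edit(settings) -> bool:
    """Default to True so a vanilla install keeps the normal edit behaviour."""
    return bool(getattr(settings, "ALLOW_USER_EDIT", True))


def profile_template(settings) -> str:
    """Pick the full edit form or the read-only profile page."""
    return "user_edit_form" if allow_user_edit(settings) else "user_profile_readonly"


def welcome_fields(settings) -> List[str]:
    """Only ask for initial user details when users are managed locally."""
    fields = ["org_name", "org_identifier"]
    if allow_user_edit(settings):
        fields = ["username", "email", "password"] + fields
    return fields


# Example stand-ins for the two configurations.
vanilla = SimpleNamespace()
remote_users = SimpleNamespace(ALLOW_USER_EDIT=False)
```

Because the flag says nothing about Shibboleth itself, the same switch would cover any other setup where user records are owned by an external system.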

lower29 commented 7 years ago

I have rebased the dev/shib branch for this, following @sevein's recent changes to master that add the installer CLI.

I think I've got the merge resolutions right (tested with rdss-archivematica), but I'd welcome someone with more familiarity confirming this. One outstanding issue I have is a ton of warnings from the FPR update phase. It looks like the FPR update is happening twice, but I'm not sure why. The dashboard works fine, so it's an annoyance more than anything critical.

sevein commented 7 years ago

One outstanding issue I have is a ton of warnings from the FPR update phase. It looks like the FPR update is happening twice, but I'm not sure why. The dashboard works fine so it's an annoyance more than anything critical.

We can ignore that, @lower29. That's the same output I used to see when installing the Dashboard from the web interface. The difference is that since https://github.com/JiscRDSS/rdss-archivematica/pull/7 we're doing that from the command line, which causes the warnings to be sent to standard error. @jhsimpson may understand better what they mean, but I'm sure it's fine.

sevein commented 7 years ago

@sevein I think this could be made conditional [...]

@helenst ok, I think we can deal with that stuff later. It seems more appropriate to get this merged soon and think about ways to improve it as we test it. Does that make sense?

lower29 commented 7 years ago

@sevein I'm happy to ignore these messages if you are!

I agree about merging. We have something that mostly works now, so I think we should get it merged in and fix things like that later. I would guess that making Shibboleth optional would be required before merging into the main archivematica repo.

helenst commented 7 years ago

From my perspective, this is ready to merge alongside https://github.com/JiscRDSS/archivematica-storage-service/pull/3 and https://github.com/JiscRDSS/rdss-archivematica/pull/6