indico / flask-multipass

Multi-backend authentication system for Flask
Other
61 stars 24 forks source link

Mail attribute from IdP is delivered as List of Addresses #56

Open gosogonzales opened 2 years ago

gosogonzales commented 2 years ago

Problem Description During the authentication procedure against our identity provider using the shibboleth method, one gets back a list of attributes including among others like Eppn, Cn, Givenname, Sn the attribute Mail. In our IdP case this is not a single email address, but a list of the primary address followed by an arbitrary list of aliases. This looks e.g. like:

Mail = 'primaryemail@eample.org;alias1@eample.org;alias2@example.org; ...'

Or given in the full context (as example):

{u'data': {u'get': {},
           u'headers': {'Accept': u'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
                        'Accept-Encoding': u'gzip, deflate, br',
                        'Accept-Language': u'de,en;q=0.8,fr;q=0.5,en-US;q=0.3',
...
                        'Host': u'indico-site.example.org',
                        'Mail': u'primaryemail@eample.org;alias1@eample.org;alias2@example.org',
                        'O': u'some-organisation',
                        'Ou': u'',
...                        

It seems, that indico requires a single email address at this point, as the data set shown upon login is the full string of all emails. This does not allow the Mail field to be used in the user data set in indico.

Solution Proposal It seems this problem could be solved by adding some parameters to the indico.conf syntax. One parameter covering the field separator and another one indicating which one of the fields should be taken as email address.

An indico.conf could for example look like the following:

IDENTITY_PROVIDERS = {
...
        'our-location-sso': {
        'type': 'shibboleth',
        'title': 'Shibboleth User Login)',
        'identifier_field': 'eppn',
# proposed new parameters:        
        'mail_array': True,
        'mail_separator': ';',
        'mail_index': 0,
# end of new parameters        
        'mapping': {
            'first_name': 'givenName',
            'uid': 'eppn', 
            'last_name': 'sn',
            'Sn': 'sn',
            'mail' : 'eppn',
...            
        }
}
...

This is my proposal. The example has three new parameters. mail_array defaulting to False, so that current configs will not get broken for backward compatibility, mail_separator containing the field separator of the string and mail_index providing the index in the mail array that provides the email address to be chosen by indico.

ThiefMaster commented 2 years ago

This sounds kind of non-standard. If you can't configure shibboleth to give you "clean" arguments with just a single value, I believe you're better off creating a custom flask-multipass extension (it's very easy, you can subclass the original shibboleth provider and expose it in a custom python package via setuptools entry points).

You can check this one to get a rough example on how to expose a custom one to flask-multipass.

gosogonzales commented 2 years ago

Thank you for your quick answer! Indeed, this method from our IdP is non-standard. Thank you for the link, we will check this alternative approach.

olifre commented 2 years ago

In fact, using an array here is the standard: https://wiki.refeds.org/display/STAN/eduPerson+2020-01#eduPerson202001-mail This has affected me (with my IdP) with many Indico instances "in the wild". Of course, if there's only a single mail address for an identity, it "just works", but the standard defines this variable to be an array.

ThiefMaster commented 2 years ago

Sounds like it's indeed worth fixing in https://github.com/indico/flask-multipass then (both for the shibboleth and the saml providers).

How standardized is the format for multiple values? If it's always ; maybe we could just always split and take the first one? I think the opt-in may not be needed at all since a valid email address won't contain an ; anyway.

In any case I propose moving the discussion over there since it's not Indico-specific.

olifre commented 2 years ago

How standardized is the format for multiple values?

As usual with standards, "it depends". From the REFEDS standard point of view, they use similar logic to LDAP, so there are just multi-valued attributes and no separator in a string. The conversion to string is done by shibd, on the service provider end. The default is ; and it was actually hardcoded in the past, but has become configurable in 2019 via a knob called attributeValueDelimiter: http://git.shibboleth.net/view/?p=cpp-sp.git;a=blobdiff;f=shibsp/ServiceProvider.cpp;h=0b184eb1a58924419d65ddb5bf6c91593a3605e3;hp=aacbba1409a52f30c01d3658c89850d793ecf72f;hb=8044713e31bb3f7b1702d64b1f571caa19bb7510;hpb=3e9f23bc0fdd8514faff5b62e9c377c651c47b1c For other applications (via environment variables), they use : though: http://git.shibboleth.net/view/?p=cpp-sp.git;a=blobdiff;f=mod_shibrm/mod_shibrm.cpp;h=7328a45ff2675bea8a8fddedf34b7a287e0fd1a8;hp=fed688ba06c4b9285f27194e5c8d8928b0627fd6;hb=2cc73582d5046e05eb6bb1fab08c11f6186f69b0;hpb=685769e28f161b0cdea35c602b45f7bc84b8b56e

So I think supporting ; will work for all standard cases, but having this configurable would be the most generic solution. At least, it should be the same separator for all multi-valued fields exported by one shibd instance. It seems others support ; only: https://github.com/toyokazu/omniauth-shibboleth/issues/18

In any case I propose moving the discussion over there since it's not Indico-specific.

Indeed, that sounds good. Do you want to move the issue there, or should a new one be created?