Closed duongnt closed 4 years ago
So I think that what you're trying to do could also be accomplished by overriding the normalize_username
method that is part of the Authenticator
class provided by Jupyterhub. So instead of yanking the username out of the email with XPath (which, ya, that ain't easy), it might be simpler to use the Python APIs to manipulate the string inside of the normalization function.
You would do this by writing something like the following in your Jupyterhub_config.py file:
def new_normailze_username(self, name):
# The name is now only the bit before the "@"
name = name.split("@")[0]
return super().normalize_username(name)
c.SAMLAuthenticator.normalize_username = new_normalize_username
I think we should take a hard look at how names are passed around in the code.
Thanks for the great issues and PRs.
Thanks for the prompt response, Tom! I guess I can do what you suggested, but I think in principle if we expect a string for username, then xpath should return exactly that? Converting from nodeset to string in xpath is trivial, so there's not many reasons to do that in python in my opinion. It's also kinda confusing for users like me - when I got the only the first letter I thought there's something wrong with my xpath syntax.
What do you think?
I think this is (probably) a breaking change, and we should fix it in the next Minor Point update when we break other things (misspellings, and such). Hopefully I'll have the types integrated into the code at that point as well, and we'll be able to use those to prove the code more correct (they shouldn't impact either functionality or runtime speed - they're just for my/our edification).
You're right that this could be a breaking change. How about we check the type of xpath result - if it's a list we return the first element, and if it's a string we return the whole string?
That seems like a good stopgap measure - then we break on Minor Version upgrade. Seems like a good plan.
Right now the default option for
xpath_username_location
returns a list of string. https://github.com/bluedatainc/jupyterhub-samlauthenticator/blob/master/samlauthenticator/samlauthenticator.py#L543 will then return the first element of the list.This creates a problem with xpath queries such as
substring-before(//saml:Attribute[@Name="User.email"]/saml:AttributeValue/text(), "@")
. When I tried settingxpath_username_location
to that value to get the email id (e.g. duongnt from duongnt@gmail.com), all I got was the first letterd
.I already have a diff to change the default
xpath_username_location
fromstring(//saml:NameID/text())
tostring(//saml:NameID/text())
, and remove the[0]
from L543 above, just wanted to make sure that this is a reasonable approach.I tried using xpath queries such as
//saml:Attribute[@Name="User.email"]/saml:AttributeValue/text()/substring-before(., "@")
, but got "Invalid expression" error. I'm a beginner with xpath and I don't know how to apply functions such assubstring-before
to every items of a node-sets. Actually I tried the same query on the same xml on https://www.freeformatter.com/xpath-tester.html, and it works perfectly fine, so maybe this is a limitation of the lxml/etree python package we use (ref: https://stackoverflow.com/questions/8692/how-to-use-xpath-in-python)