indigo-iam / voms-importer

A VOMS import script for INDIGO IAM
Other
1 stars 2 forks source link

Certificate subject and issuer conversion #7

Closed vokac closed 2 years ago

vokac commented 2 years ago

Implement direct DN conversion from VOMS to IAM format. Former implementation was not able to convert distinguished names with Email, sn, gn. Although most simple conversion could be implemented using ldap_str2dn function

def convert_dn_rfc2253_new(dn):
    try:
        parsed_dn = ldap.dn.str2dn(dn.encode('utf-8'),flags=ldap.DN_FORMAT_DCE)
        return ldap.dn.dn2str(parsed_dn)
    except ldap.DECODING_ERROR:
        _log.error("unable to decode dn: %s", dn)
    return ''

this doesn't work for distinguished names that contain / character (e.g. /DC=org/DC=opensciencegrid/O=Open Science Grid/OU=Services/CN=pilot/uclhc-fe-1.t2.ucsd.edu).

Although current implementation use naive and incomplete implementation of DN parsing with regex this still covers all certificate subjects and issuers for ATLAS VO. New function provides same result as original DN converter except it covers few more cases mentioned above.

Output DN imported in IAM doesn't have escaped special characters (e.g. comma), but this behavior is compatible with original dn_converter.

bbockelm commented 2 years ago

@giacomini - what's the latest on this one? Seems the character escaping issues have caused some users issues in CMS as well (e.g., GGUS 156765 which I believe you are CC'd on).

giacomini commented 2 years ago

No news wrt what is in the GGUS ticket, where I apparently rediscovered what Petr already did. I'll work on the importer and on these PRs in the next days.

bbockelm commented 2 years ago

Thanks @giacomini!

giacomini commented 2 years ago

I have a general comment about the changes: they may work on valid subjects of personal and issuer certificates currently in ATLAS, but they don't even pass the basic sanity checks implemented in the testsuite for the existing program. And no escaping is done on commas and similar characters. Is there any particular reason not to use the existing C++ program, apart from the obvious disadvantage of having a slight complication in the image building, which is however a solved problem?

More specific comments in the code.

vokac commented 2 years ago

It is much easier to fix directly python code than recompiling C sources. Also calling subprocess while same functionality can be easily achieved in python is not very nice solution.

I did not try to implement perfect parsing of DN and that means not all original (artificial) check can be passed by new code. On the other hand we have certificate subjects with e.g. cn=test/something that are not supported by C code and their parsing is not checked. I made an attempt to slightly improve regular expression assuming that relative distinguished value can't contain = sign. Only /D=ch/CN=pippo validation gives use different result, but this is requested behavior, because original code was not flexible enough to parse e.g. relative distinguished name email=foo@bar.com.

giacomini commented 2 years ago

It is much easier to fix directly python code than recompiling C sources. Also calling subprocess while same functionality can be easily achieved in python is not very nice solution.

I understand, but this is going to run offline in a container started from an image retrieved from docker hub, generated from a CI job. It's not run interactively, if not for testing purposes.

I did not try to implement perfect parsing of DN and that means not all original (artificial) check can be passed by new code. On the other hand we have certificate subjects with e.g. cn=test/something that are not supported by C code and their parsing is not checked.

Apart from the case, isn't it covered by this test, which passes successfully?

CHECK(to_rfc2253("/DC=ch/DC=cern/OU=computers/CN=unified/voms") == "CN=unified/voms,OU=computers,DC=cern,DC=ch");

I made an attempt to slightly improve regular expression assuming that relative distinguished value can't contain = sign. Only /D=ch/CN=pippo validation gives use different result, but this is requested behavior, because original code was not flexible enough to parse

It was not flexible on purpose, that's why we relied on OBJ_txt2nid to check if the type was valid. If that's not needed we can certainly change.

e.g. relative distinguished name email=foo@bar.com.

I remember some DNs with an email token, but I can't really recollect and there is no test for that unfortunately. Do you have a specific example, please?

vokac commented 2 years ago

I did not try to implement perfect parsing of DN and that means not all original (artificial) check can be passed by new code. On the other hand we have certificate subjects with e.g. cn=test/something that are not supported by C code and their parsing is not checked.

Apart from the case, isn't it covered by this test, which passes successfully?

CHECK(to_rfc2253("/DC=ch/DC=cern/OU=computers/CN=unified/voms") == "CN=unified/voms,OU=computers,DC=cern,DC=ch");

You are right, this works fine in dn_converter

e.g. relative distinguished name email=foo@bar.com.

I remember some DNs with an email token, but I can't really recollect and there is no test for that unfortunately. Do you have a specific example, please?

These accounts comes with certificate subject that can't be parsed by dn_converter

https://voms2.cern.ch:8443/voms/atlas/user/load.action?userId=881 https://voms2.cern.ch:8443/voms/atlas/user/load.action?userId=2062 https://voms2.cern.ch:8443/voms/atlas/user/load.action?userId=4558 https://voms2.cern.ch:8443/voms/atlas/user/load.action?userId=5137

They contain Email=, sn= and gn= , but to be honest I'm not even 100% sure if they represent valid certificate subject.

We also have one account with comma in certificate subject and it is still not clear to me how it should appear in the IAM. It is the famous InCommon subject /DC=org/DC=incommon/C=US/ST=California/L=La Jolla/O=University of California, San Diego https://atlas-auth.web.cern.ch/dashboard#!/user/5d204d00-e392-4d79-9619-18fc4e8147c8

giacomini commented 2 years ago

These accounts comes with certificate subject that can't be parsed by dn_converter

https://voms2.cern.ch:8443/voms/atlas/user/load.action?userId=881 https://voms2.cern.ch:8443/voms/atlas/user/load.action?userId=2062 https://voms2.cern.ch:8443/voms/atlas/user/load.action?userId=4558 https://voms2.cern.ch:8443/voms/atlas/user/load.action?userId=5137

They contain Email=, sn= and gn= , but to be honest I'm not even 100% sure if they represent valid certificate subject.

Yes, I remember those from when I implemented dn_converter. They are not valid fields (sn and gn are valid if uppercase, Email was probably valid a long time ago) and we (Andrea and I) agreed not to accept them, even because they are very likely expired since a long time. Can you check the ones not issued by the INFN CA, please?

If they are expired I would simply ignore them, which is the status quo of the importer.

We also have one account with comma in certificate subject and it is still not clear to me how it should appear in the IAM. It is the famous InCommon subject /DC=org/DC=incommon/C=US/ST=California/L=La Jolla/O=University of California, San Diego https://atlas-auth.web.cern.ch/dashboard#!/user/5d204d00-e392-4d79-9619-18fc4e8147c8

That one and some members of UniHamburg. I have fixed the comma escape in my local branch.

I'll open a PR with my changes so that you can comment on the checks representing a sort of testsuite, so we can see if we cover all expected needs.

giacomini commented 2 years ago

For the moment I prefer to keep the C++ version, which I have fixed to escape some special chars, including ,.