SAML-Toolkits / ruby-saml

SAML SSO for Ruby
MIT License
921 stars 567 forks source link

Support for SAML attribute values nested in <name> tags #717

Closed waissbluth closed 3 months ago

waissbluth commented 3 months ago

I am having issues parsing Cloudflare Zero Trust SAML assertions with OneLogin::RubySaml::Response.new. They seem to be special in that they have an Attribute (e.g., "groups") with nested AttributeValues with a <name> tag surrounding the value. These end up empty after parsing.

Is this something that ruby-saml can/should support?

It looks like this:

<saml2:Attribute Name="groups"
    NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:unspecified"
>
    <saml2:AttributeValue>
        <name xmlns="http://www.w3.org/2001/XMLSchema"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:type="xs:string"
        >Administrators</name>
    </saml2:AttributeValue>
    <saml2:AttributeValue>
        <name xmlns="http://www.w3.org/2001/XMLSchema"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:type="xs:string"
        >Everyone</name>
    </saml2:AttributeValue>
</saml2:Attribute>

A more standard assertion (e.g. from Okta) would not have the <name> tags and gets parsed correctly as an array with ["Administrators", "Everyone"]. But this one ends up with empty values instead.

Here is a hack that preprocesses the XML in a way that will make it parse correctly, but it is very much a hack:

def preprocess_saml_assertion_groups(raw_response, attribute)
   doc = Nokogiri::XML(raw_response, attribute)
    namespaces = {
      "saml2" => "urn:oasis:names:tc:SAML:2.0:assertion",
      "xs" => "http://www.w3.org/2001/XMLSchema",
    }
    doc
      .at_xpath("//saml2:Attribute[@Name=\"#{attribute}\"]", namespaces)
      &.xpath(".//saml2:AttributeValue", namespaces)
      &.each do |attr_value|
        name_element = attr_value.at_xpath(".//xs:name", namespaces)
        if name_element
          attr_value.children.remove
          attr_value.content = name_element.text.strip
        end
      end

    doc.to_xml
  end
johnnyshields commented 3 months ago

My system (TableCheck.com) actually implements this exact thing built on top of the gem.

Here's the relevant code (abridged but illustrates the gist.) Note that the values of saml_response.attributes can be arrays.

    def get_user(saml_response)
      # main flow
      nameids_from_attributes = extract_nameids_from_attributes(saml_response)
      user = get_user_by_nameid_or_attributes(nameid, nameids_from_attributes, saml_response)
    end

    private

    def get_user_by_nameid_or_attributes(nameid, nameids_from_attributes, saml_response)
      ([nameid] | (nameids_from_attributes || [])).compact_blank!.each do |name|
        val = User.where(username: nameid).first
        return val if val
      end
    end

    def extract_nameids_from_attributes(saml_response)
      return [] unless app_settings.match_nameid_from_attributes

      saml_response.attributes.map do |name, values|
        next unless name && NAME_ATTRIBUTES.include?(name.gsub(/[-_\s]/, '').downcase)
        Array(values)
      end.flatten.compact_blank!.uniq
    end
waissbluth commented 3 months ago

My system (TableCheck.com) actually implements this exact thing built on top of the gem.

@johnnyshields interesting! Do you encounter this on other IdPs as well?

If this is common / valid SAML, it would be nice if the gem had built-in support (either by default or behind a parsing setting for backward compatibility?). I am not terribly familiar with the standard here but it sounds like a reasonable thing to support?

johnnyshields commented 3 months ago

Most IDPs (95%) have a pretty vanilla implementation where NameID by itself is fine. I do have a few IdPs doing strange configurations and attributes are needed as a workaround. For those, the Anna Karenina principle applies: "All happy families are alike, but each unhappy family is unhappy in its own way."

The more fundamental question here is whether RubySaml is a "library" or a "framework"--library meaning "toolkit of basic adapters where you can write your own auth layer", or framework where it does all the auth work for you. I would say clearly it's the former (library), and so I do think that this use-case is out-of-scope. That being said, feel free to riff off my sample code above.

So after considering that I'm going to close this issue--probably won't be added to RubySaml.