FusionAuth / fusionauth-issues

FusionAuth issue submission project
https://fusionauth.io

Fix SCIM filter to ensure we can always return an exact match when using `userName eq "user@example.com"` #2455

Closed robotdan closed 7 months ago

robotdan commented 1 year ago

Description

We currently convert the SCIM filter parameter to an Elasticsearch query string. Because the username field is indexed as text, its value is tokenized, so we cannot perform an exact match unless we also tokenize the input and match every token that Elasticsearch generated.

For example, if you have a user with an email of `erlich.bachman@piedpiper.com` and a separate user with a username of `erlich.bachman`, then the SCIM filter query `userName eq "erlich.bachman@piedpiper.com"` would return both users.

The reason for this is that this SCIM filter is translated to `email:"erlich.bachman@piedpiper.com" OR username:"erlich.bachman@piedpiper.com"`.

Because Elasticsearch has tokenized the username field, the second user's username is stored as two tokens, `erlich` and `bachman`. The input to this query is then also tokenized, and the query `username:"erlich.bachman@piedpiper.com"` ends up matching the second user with a username of `erlich.bachman`.
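
To make the overlap concrete, here is a minimal Python sketch. The `tokenize` function is a toy stand-in (lowercase, split on non-alphanumeric characters), not the analyzer Elasticsearch actually applies, and it does not reproduce Elasticsearch's matching rules. It only shows that the stored username and the filter value share tokens even though the raw strings differ.

```python
import re


def tokenize(value: str) -> list[str]:
    """Toy analyzer (assumption): lowercase, then split on non-alphanumerics."""
    return [token for token in re.split(r"[^a-z0-9]+", value.lower()) if token]


stored_username = "erlich.bachman"
filter_value = "erlich.bachman@piedpiper.com"

stored_tokens = tokenize(stored_username)
filter_tokens = tokenize(filter_value)

# Tokens of the stored username that also appear in the tokenized filter value.
shared_tokens = [token for token in stored_tokens if token in filter_tokens]

# A raw string comparison, which is what `eq` is meant to express.
exact_match = stored_username == filter_value

print(stored_tokens)
print(filter_tokens)
print(shared_tokens)
print(exact_match)
```

Every token of the stored username shows up in the tokenized filter value, while the untokenized comparison correctly reports that the two strings are different.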

One option is to add a keyword sub-field to the username field in the index so we can optionally use a keyword search.

This is a common pattern: with the sub-field in place, `username.exact` would be the field to use when you want an exact match instead of a general text search.

The current Elasticsearch schema for the username field is:

  "username": {
    "type": "text",
    "fielddata": true
  }

The modification would look like this:

  "username": {
    "type": "text",
    "fielddata": true,
    "fields": {
      "exact": {
        "type": "keyword"
      }
    }
  }
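
As a rough illustration of how such a sub-field could be queried, the Python sketch below builds an Elasticsearch `term` query against `username.exact`. It assumes the sub-field is named `exact` and typed as `keyword`, matching the `username.exact` name used above. The `exact_username_query` helper is hypothetical, and none of this is FusionAuth's actual implementation.

```python
import json

# Mapping with a hypothetical `exact` keyword sub-field (not the shipped schema).
username_mapping = {
    "username": {
        "type": "text",
        "fielddata": True,
        "fields": {"exact": {"type": "keyword"}},
    }
}


def exact_username_query(value: str) -> dict:
    """Build a term query that compares the whole value against username.exact."""
    return {"query": {"term": {"username.exact": {"value": value}}}}


query = exact_username_query("erlich.bachman@piedpiper.com")
print(json.dumps(query, indent=2))
```

A `term` query does not analyze its input, so the value is compared as a single unit against the keyword sub-field rather than being split into tokens.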

robotdan commented 11 months ago

Internal: