indigo-iam / iam

INDIGO Identity and Access Management Service
https://indigo-iam.github.io/
Other
102 stars 43 forks source link

Token column in database is a limited index and can lead to a "Duplicate entry" error #579

Closed enricovianello closed 1 year ago

enricovianello commented 1 year ago

This is an example of the error got during a token exchange:

2022-09-28 10:52:04.146  INFO 8 --- [8080-exec-11416] i.i.m.i.c.o.g.TokenExchangeTokenGranter  : Client 'xxx' requests token exchange from client 'yyy' to impersonate user 'alpha' on audience 'wwwww' with scopes '[email, offline_access, openid, profile]'
2022-09-28 10:52:04.401  WARN 8 --- [8080-exec-11416] o.m.o.c.s.impl.DefaultOIDCTokenService   : Unable to find authentication timestamp! There is likely something wrong with the configuration.
[EL Warning]: 2022-09-28 10:52:04.407--UnitOfWork(1887491767)--Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.5.1.v20130918-f2b9fc5): org.eclipse.persistence.exceptions.DatabaseException
Internal Exception: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry 'eyJraWQiOiJyc2ExIiwiYWxnIjoiUlMyNTYifQ.eyJzdWIiOiI3ZjQwOTI4NC0xY' for key 'token_value'
Error Code: 1062
Call: INSERT INTO access_token (expiration, token_value, token_type, approved_site_id, auth_holder_id, client_id, refresh_token_id) VALUES (?, ?, ?, ?, ?, ?, ?)
    bind => [7 parameters bound]
Query: InsertObjectQuery(org.mitre.oauth2.model.OAuth2AccessTokenEntity@4b54cda5)

This is due to the fact that:

We need to understand the reason of this limitation (inherited from mitreId) and if this limitation can be removed or not.

vokac commented 1 year ago

This limit comes from MySQL column prefix limits. I think it is really not optimal to create such a huge index that basically double the size of access_token table a we already know from Halloween incident that this can leads to very poor IAM performance.

Why do you even create index on token content, this seems to me like not very optimal design choice considering you should be able to quickly search any token in the database by jti. Instead of trying to cover whole token_value size by index it would be better to fix code not to rely on this index and get rid of at_tv_idx.

enricovianello commented 1 year ago

The index on token_value is due to the UNIQUE constraint. Using jti (ref) could be a solution. I was thinking about adding an hash type column to access_token table, generated directly from token_value:

ALTER TABLE access_token DROP INDEX token_value; # remove current index
ALTER TABLE access_token ADD COLUMN token_value_hash hash UNIQUE NOT NULL CONSTRAINT default_token_hash DEFAULT SHA1(token_value);

and apply the UNIQUE constraint to that. This value for sure is similar of evaluating a jti and store it separately, indexed. What do you think @vokac @giacomini ?

vokac commented 1 year ago

This seems to me like a most simple solution with minimal code updates that would allow us to drop huge token_value index (assuming there are no later changes in this column) and it is more universal than relying on optional jti (this claim is required only by WLCG profile).

We should still consider more ambitious project of dropping whole token_value column from access token table, because it seems to me we should not really need full access token stored in the database.

giacomini commented 1 year ago

The hash of the full token is ok. Note however that we already have a hash available, which is the signature component of the token.