apache / polaris

Apache Polaris, the interoperable, open source catalog for Apache Iceberg
https://polaris.apache.org/
Apache License 2.0
1.17k stars 130 forks source link

Do not persist plaintext secrets in the metastore #438

Closed eric-maynard closed 1 week ago

eric-maynard commented 2 weeks ago

Description

Currently, the entire PolarisPrincipalSecrets gets passed into the metastore for persistence. For EclipseLink, this means it gets translated into a ModelPrincipalSecrets which then gets written to the DB. Unfortunately, this means the plaintext client secrets are being persisted. These are then later used to check if provided secrets are valid.

This PR proposes that we persist a salted hash of these secrets, and that we no longer persist the plaintext secrets.

Important: Is it intended that after rotation secrets are still valid? Based on what I observed, it seems like we check secrets against both the main and secondary secret. If that's not intended it's easy enough to fix but this PR is a good time to do it since we are changing the semantics of the secret check.

Fixes #219

Type of change

Please delete options that are not relevant.

How Has This Been Tested?

Added a unit test in PolarisEclipseLinkMetaStoreManagerTest covering rotation for old secrets.

Manual Testing:

  1. I confirmed that new principals do not persist plaintext secrets:
Screenshot 2024-11-07 at 2 55 01 PM

I am able to create new principals and use them without persisted plaintext secrets.

  1. When I roll back to a previous version of Polaris, new principals are again created with a plaintext secret:
Screenshot 2024-11-07 at 2 57 34 PM

Coming back to the new (this PR) version, I am able to authenticate principals using both the new + old formats.

  1. Finally, when I rotate-credentials the principal created with an old version of Polaris, which still has plaintext credentials, the hash gets added:
Screenshot 2024-11-07 at 3 01 29 PM

Note that the persisted secret doesn't go away, but it shouldn't be used anymore.

Checklist:

Please delete options that are not relevant.

collado-mike commented 1 week ago

The reason for two secrets is to support period rotation. If only one secret is valid, then an engineer can't rotate the secrets and push the new secret out to production in time. Instead, they can rotate, get the new secret and push to production while the old secrets still functions.

snazy commented 1 week ago

Hm - @eric-maynard looks like this change broke PolarisEclipseLinkMetaStoreManagerTest (via ./gradlew check or ./gradlew :polaris-eclipselink:test). Can you take a look? Not sure why CI's happy though.

snazy commented 1 week ago

Example failure:


jakarta.persistence.PersistenceException: Exception [EclipseLink-4002] (Eclipse Persistence Services - 4.0.4.v202407190748-059428cdd2583c46f1f3e50d235854840a6fa9a7): org.eclipse.persistence.exceptions.DatabaseException
Internal Exception: org.h2.jdbc.JdbcSQLSyntaxErrorException: Column "MAINSECRETHASH" not found; SQL statement:
SELECT PRINCIPALCLIENTID, MAINSECRET, MAINSECRETHASH, PRINCIPALID, SECONDARYSECRET, SECONDARYSECRETHASH, SECRETSALT, VERSION FROM PRINCIPAL_SECRETS WHERE (PRINCIPALCLIENTID = ?) [42122-232]
Error Code: 42122
Call: SELECT PRINCIPALCLIENTID, MAINSECRET, MAINSECRETHASH, PRINCIPALID, SECONDARYSECRET, SECONDARYSECRETHASH, SECRETSALT, VERSION FROM PRINCIPAL_SECRETS WHERE (PRINCIPALCLIENTID = ?)
    bind => [a7eb510c3b83c8b2]
Query: ReadObjectQuery(referenceClass=ModelPrincipalSecrets sql="SELECT PRINCIPALCLIENTID, MAINSECRET, MAINSECRETHASH, PRINCIPALID, SECONDARYSECRET, SECONDARYSECRETHASH, SECRETSALT, VERSION FROM PRINCIPAL_SECRETS WHERE (PRINCIPALCLIENTID = ?)")
    at app//org.eclipse.persistence.internal.jpa.QueryImpl.getDetailedException(QueryImpl.java:392)
    at app//org.eclipse.persistence.internal.jpa.QueryImpl.executeReadQuery(QueryImpl.java:265)
    at app//org.eclipse.persistence.internal.jpa.QueryImpl.getResultList(QueryImpl.java:475)
    at app//jakarta.persistence.TypedQuery.getResultStream(TypedQuery.java:87)
    at app//org.apache.polaris.extension.persistence.impl.eclipselink.PolarisEclipseLinkStore.lookupPrincipalSecrets(PolarisEclipseLinkStore.java:422)
    at app//org.apache.polaris.extension.persistence.impl.eclipselink.PolarisEclipseLinkMetaStoreSessionImpl.generateNewPrincipalSecrets(PolarisEclipseLinkMetaStoreSessionImpl.java:658)
    at app//org.apache.polaris.core.persistence.PolarisMetaStoreManagerImpl.createPrincipal(PolarisMetaStoreManagerImpl.java:959)
    at app//org.apache.polaris.core.persistence.PolarisMetaStoreManagerImpl.bootstrapPolarisService(PolarisMetaStoreManagerImpl.java:668)
    at app//org.apache.polaris.core.persistence.PolarisMetaStoreManagerImpl.lambda$bootstrapPolarisService$2(PolarisMetaStoreManagerImpl.java:706)
    at app//org.apache.polaris.extension.persistence.impl.eclipselink.PolarisEclipseLinkMetaStoreSessionImpl.runActionInTransaction(PolarisEclipseLinkMetaStoreSessionImpl.java:288)
    at app//org.apache.polaris.core.persistence.PolarisMetaStoreManagerImpl.bootstrapPolarisService(PolarisMetaStoreManagerImpl.java:706)
    at app//org.apache.polaris.core.persistence.PolarisTestMetaStoreManager.<init>(PolarisTestMetaStoreManager.java:72)
    at app//org.apache.polaris.extension.persistence.impl.eclipselink.PolarisEclipseLinkMetaStoreManagerTest.createPolarisTestMetaStoreManager(PolarisEclipseLinkMetaStoreManagerTest.java:67)
    at app//org.apache.polaris.core.persistence.BasePolarisMetaStoreManagerTest.setupPolariMetaStoreManager(BasePolarisMetaStoreManagerTest.java:72)
    at java.base@22.0.2/java.lang.reflect.Method.invoke(Method.java:580)
    at java.base@22.0.2/java.util.ArrayList.forEach(ArrayList.java:1597)
    at java.base@22.0.2/java.util.ArrayList.forEach(ArrayList.java:1597)
Caused by: Exception [EclipseLink-4002] (Eclipse Persistence Services - 4.0.4.v202407190748-059428cdd2583c46f1f3e50d235854840a6fa9a7): org.eclipse.persistence.exceptions.DatabaseException
Internal Exception: org.h2.jdbc.JdbcSQLSyntaxErrorException: Column "MAINSECRETHASH" not found; SQL statement:
SELECT PRINCIPALCLIENTID, MAINSECRET, MAINSECRETHASH, PRINCIPALID, SECONDARYSECRET, SECONDARYSECRETHASH, SECRETSALT, VERSION FROM PRINCIPAL_SECRETS WHERE (PRINCIPALCLIENTID = ?) [42122-232]
Error Code: 42122
Call: SELECT PRINCIPALCLIENTID, MAINSECRET, MAINSECRETHASH, PRINCIPALID, SECONDARYSECRET, SECONDARYSECRETHASH, SECRETSALT, VERSION FROM PRINCIPAL_SECRETS WHERE (PRINCIPALCLIENTID = ?)
    bind => [a7eb510c3b83c8b2]
Query: ReadObjectQuery(referenceClass=ModelPrincipalSecrets sql="SELECT PRINCIPALCLIENTID, MAINSECRET, MAINSECRETHASH, PRINCIPALID, SECONDARYSECRET, SECONDARYSECRETHASH, SECRETSALT, VERSION FROM PRINCIPAL_SECRETS WHERE (PRINCIPALCLIENTID = ?)")
    at app//org.eclipse.persistence.exceptions.DatabaseException.sqlException(DatabaseException.java:343)
    at app//org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.basicExecuteCall(DatabaseAccessor.java:702)
    at app//org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.executeCall(DatabaseAccessor.java:569)
    at app//org.eclipse.persistence.internal.sessions.AbstractSession.basicExecuteCall(AbstractSession.java:2048)
    at app//org.eclipse.persistence.sessions.server.ClientSession.executeCall(ClientSession.java:311)
    at app//org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.executeCall(DatasourceCallQueryMechanism.java:280)
    at app//org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.executeCall(DatasourceCallQueryMechanism.java:266)
    at app//org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.selectOneRow(DatasourceCallQueryMechanism.java:813)
    at app//org.eclipse.persistence.internal.queries.ExpressionQueryMechanism.selectOneRowFromTable(ExpressionQueryMechanism.java:2912)
    at app//org.eclipse.persistence.internal.queries.ExpressionQueryMechanism.selectOneRow(ExpressionQueryMechanism.java:2865)
    at app//org.eclipse.persistence.queries.ReadObjectQuery.executeObjectLevelReadQuery(ReadObjectQuery.java:563)
    at app//org.eclipse.persistence.queries.ObjectLevelReadQuery.executeDatabaseQuery(ObjectLevelReadQuery.java:1236)
    at app//org.eclipse.persistence.queries.DatabaseQuery.execute(DatabaseQuery.java:913)
    at app//org.eclipse.persistence.queries.ObjectLevelReadQuery.execute(ObjectLevelReadQuery.java:1195)
    at app//org.eclipse.persistence.queries.ReadObjectQuery.execute(ReadObjectQuery.java:448)
    at app//org.eclipse.persistence.queries.ObjectLevelReadQuery.executeInUnitOfWork(ObjectLevelReadQuery.java:1283)
    at app//org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.internalExecuteQuery(UnitOfWorkImpl.java:3028)
    at app//org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1841)
    at app//org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1823)
    at app//org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1788)
    at app//org.eclipse.persistence.internal.jpa.QueryImpl.executeReadQuery(QueryImpl.java:263)
    ... 15 more
Caused by: org.h2.jdbc.JdbcSQLSyntaxErrorException: Column "MAINSECRETHASH" not found; SQL statement:
SELECT PRINCIPALCLIENTID, MAINSECRET, MAINSECRETHASH, PRINCIPALID, SECONDARYSECRET, SECONDARYSECRETHASH, SECRETSALT, VERSION FROM PRINCIPAL_SECRETS WHERE (PRINCIPALCLIENTID = ?) [42122-232]
    at app//org.h2.message.DbException.getJdbcSQLException(DbException.java:514)
    at app//org.h2.message.DbException.getJdbcSQLException(DbException.java:489)
    at app//org.h2.message.DbException.get(DbException.java:223)
    at app//org.h2.message.DbException.get(DbException.java:199)
    at app//org.h2.expression.ExpressionColumn.getColumnException(ExpressionColumn.java:244)
    at app//org.h2.expression.ExpressionColumn.optimizeOther(ExpressionColumn.java:226)
    at app//org.h2.expression.ExpressionColumn.optimize(ExpressionColumn.java:213)
    at app//org.h2.command.query.Select.prepareExpressions(Select.java:1228)
    at app//org.h2.command.query.Query.prepare(Query.java:232)
    at app//org.h2.command.Parser.prepareCommand(Parser.java:489)
    at app//org.h2.engine.SessionLocal.prepareLocal(SessionLocal.java:645)
    at app//org.h2.engine.SessionLocal.prepareCommand(SessionLocal.java:561)
    at app//org.h2.jdbc.JdbcConnection.prepareCommand(JdbcConnection.java:1164)
    at app//org.h2.jdbc.JdbcPreparedStatement.<init>(JdbcPreparedStatement.java:93)
    at app//org.h2.jdbc.JdbcConnection.prepareStatement(JdbcConnection.java:315)
    at app//org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.prepareStatement(DatabaseAccessor.java:1750)
    at app//org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.prepareStatement(DatabaseAccessor.java:1697)
    at app//org.eclipse.persistence.internal.databaseaccess.DatabaseCall.prepareStatement(DatabaseCall.java:770)
    at app//org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.basicExecuteCall(DatabaseAccessor.java:630)
    ... 34 more
eric-maynard commented 1 week ago

Hi @snazy, do you see consistent failures? I was not able to repro:

LD4RTJ0HY9:polaris emaynard$ git checkout upstream/main && ./gradlew :polaris-eclipselink:test
HEAD is now at ab095cfa Do not persist plaintext secrets in the metastore (#438)
Configuration on demand is an incubating feature.

BUILD SUCCESSFUL in 733ms
19 actionable tasks: 19 up-to-date
eric-maynard commented 1 week ago

I wonder if your test is the result of what happens when you have an existing deployment already? Since it looks like the new column is totally missing.

However I manually tested this scenario, and I saw the column get added.

Let me know if you find out any more details about the error(s) you're seeing. If you want to revert while we debug, let me know and we can do that.

snazy commented 1 week ago

It's definitely an issue. Not saying, the root-cause is this change, but there's something odd. I've opened #450 for this.

johnnysohn commented 4 days ago

@eric-maynard what's the plan for schema changes on any tables in Polaris? Right now I don't see any DDL scripts to handle this additional MAINSECRETHASH column nor future schema changes and bootstrap command wouldn't run for already bootstrapped SQL databases.

eric-maynard commented 4 days ago

Hi @johnnysohn, thanks for raising this. I think we probably should figure out how to do metastore versioning at some point, but in the immediate term I would like to unblock anyone impacted by this change.

I think the issue is ultimately metastore-dependent and maybe even database-dependent. Of course, you could manually alter the schema of your database. But I have done some reading and I believe this EclipseLink parameter could resolve the issue:

<property name="eclipselink.ddl-generation" value="create-or-extend-tables"/>

If you are affected and able to confirm this works for you, we can consider just adding it to the default persistence.xml.

shantanu-dahiya commented 1 day ago

@eric-maynard I was using the principal_secrets table to find out the client ID and secret of the root principal, to be able to use the OAuth2 API. Where can I find it now?

eric-maynard commented 1 day ago

Hi @shantanu-dahiya -- please see this comment or #450 for further discussion