apache / gravitino

World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
https://gravitino.apache.org
Apache License 2.0
915 stars 296 forks source link

[Improvement] Possible SQL injection in MysqlDatabaseOperations.java #4211

Open justinmclean opened 1 month ago

justinmclean commented 1 month ago

What would you like to be improved?

databaseName in generateDropDatabaseSql is not validated for any potential SQL issues.

How should we improve?

Validate databaseName

ria28 commented 1 month ago

I would like to work on this issue. To validate, do we just want to check that if databaseName is null or empty, then throw an exception?

zivali commented 1 month ago

@justinmclean Hi, I think before databaseName is passed into generateDropDatabaseSql, we already validate it with the capability framework (PR https://github.com/apache/gravitino/pull/2335). There are some tests that cover this validation in CatalogMysqlIT. Do we need to validate again in generateDropDatabaseSql?

justinmclean commented 1 month ago

We would need to more check more than if it null or empty. I'm not 100% sure that the current check in the capability formwork is enough as it tests for what might be a valid name, rather than malicious SQL. I can see delete calls dropDatabase which calls generateDropDatabaseSql and there doesn't seem to be any check on the name in that code path that I can see.

justinmclean commented 1 month ago

@zivali what are you thoughts on this?

zivali commented 1 month ago

Per my understanding, JdbcDatabaseOperations.delete currently will only be called by the dropSchema. During the dropSchema API call, the data will be routed to SchemaNormalizeDispatcher, in which we apply the capability formwork with CapabilityHelpers.applyCapabilities. This uses a whitelist approach to validate names. Tests cover some malicious SQL can be found in CatalogMysqlIT

I think if we assume JdbcDatabaseOperations.delete will be called by any other class in the future and won't be routed through SchemaNormalizeDispatcher, adding more validation here in generateDropDatabaseSql won't hurt.