Closed justinas-marozas closed 1 year ago
The problem
Updating an aspect in GMS translates into two operations in the database: an insert and an update.
It is important to execute both operations atomically and in isolation to avoid inconsistent state.
GMS was initially designed to work with a relational database and the solution here is easy as you can begin and commit transactions at will.
With a Cassandra backend, atomic/isolated execution of these two operations is possible by using Cassandra batches, but batches are not currently used, meaning that GMS with a Cassandra backend has no protection against inconsistent state. We would very much like to change that.
How things work now
The AspectDao interface exposes a method, runInTransactionWithRetry, that allows execution of arbitrary code wrapped in a transaction. EntityService makes use of this method to ensure these insert+update operations happen in a single transaction. This behavior can't be matched in CassandraAspectDao.
How we want things to work
The AspectDao interface should be changed so that its clients can't rely on transaction controls that may or may not be available depending on the backing data store. It's probably best to move this update+insert complexity into the AspectDao implementations to avoid leaking relational/Cassandra implementation details to EntityService.
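To make the proposal concrete, here is a minimal sketch of what such an interface could look like. It is not the actual DataHub code: the method name updateLatest, the simplified string-based signatures, and the in-memory implementation are all hypothetical, and the sketch assumes that the latest value of an aspect lives at version 0 while older values are archived under increasing version numbers. The point is only that the caller invokes one method and each implementation decides how to make the insert+update atomic (a transaction for a relational store, a batch for Cassandra, a lock here).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for the AspectDao interface.
// Clients see one operation and no transaction controls.
interface AspectDao {
    // Archives the current latest value as a new version (insert) and
    // overwrites the latest row (update) as a single atomic unit.
    // Returns the version the old value was archived under, or 0 if
    // there was no previous value.
    long updateLatest(String urn, String aspectName, String newValue);

    // Returns the stored value for a version, or null if absent.
    String get(String urn, String aspectName, long version);
}

// In-memory illustration only. A relational implementation would wrap
// both writes in a transaction; a Cassandra implementation would put
// both statements into one batch.
final class InMemoryAspectDao implements AspectDao {
    // Assumption: version 0 always holds the latest value.
    static final long LATEST_VERSION = 0L;

    private final Map<String, String> rows = new HashMap<>();
    private final Map<String, Long> nextVersion = new HashMap<>();

    private static String key(String urn, String aspectName, long version) {
        return urn + "|" + aspectName + "|" + version;
    }

    @Override
    public synchronized long updateLatest(String urn, String aspectName, String newValue) {
        String latestKey = key(urn, aspectName, LATEST_VERSION);
        String previous = rows.get(latestKey);
        long archivedAs = 0L;
        if (previous != null) {
            String counterKey = urn + "|" + aspectName;
            archivedAs = nextVersion.getOrDefault(counterKey, 1L);
            // Insert: keep the old value under its own version number.
            rows.put(key(urn, aspectName, archivedAs), previous);
            nextVersion.put(counterKey, archivedAs + 1);
        }
        // Update: the latest row now holds the new value.
        rows.put(latestKey, newValue);
        return archivedAs;
    }

    @Override
    public synchronized String get(String urn, String aspectName, long version) {
        return rows.get(key(urn, aspectName, version));
    }
}
```

With an interface of this shape, EntityService would call updateLatest and never need to know whether the store underneath supports transactions, batches, or something else.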