apache / gravitino

World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
https://gravitino.apache.org
Apache License 2.0
1.1k stars 345 forks source link

[Improvement] Fix bugs when adding entities with new name #203

Closed yuqi1129 closed 1 year ago

yuqi1129 commented 1 year ago

What would you like to be improved?

Bug description.

When we try to store an entity, we will use the following method to store entity value:

image

However, Id allocation in entityKeyEncoder requires this procedure should be in a transaction. then error occurs.

org.opentest4j.AssertionFailedError: getOrCreateId should be called in transaction
    at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:39)
    at org.junit.jupiter.api.Assertions.fail(Assertions.java:134)
    at com.datastrato.graviton.storage.kv.TestEntityStorageBackend.testCreateKvEntityStore(TestEntityStorageBackend.java:203)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:725)
    at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
    at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
    at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:149)
    at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:140)
    at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:84)
    at org.junit.jupiter.engine.execution.ExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(ExecutableInvoker.java:115)
    at org.junit.jupiter.engine.execution.ExecutableInvoker.lambda$invoke$0(ExecutableInvoker.java:105)
    at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
    at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
    at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
    at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:104)
    at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:98)
    at 

Why existing UT can't detect this problem

UT testCreateKvEntityStore use transaction to store entity, see below


store.executeInTransaction(
          () -> {
            store.put(metalake);
            store.put(catalog);
            store.put(metalakeCopy);
            store.put(catalogCopy);
            store.put(catalogCopyAgain);
            return null;
          });

And entities added later do not trigger new Id allocation,

How should we improve?

Temporarily, we can start a transaction in function put and allow nested transactions ( if we want to start a new transaction and found that current thread is already in a transaction, we would not start a new one, see below)

   @Override
  public <R, E extends Exception> R executeInTransaction(Executable<R, E> executable)
      throws E, IOException {
    Already in tx, directly execute it without creating a new tx
    if (TX_LOCAL.get() != null) {
      return executable.execute();
    }

    Transaction tx = db.beginTransaction(new WriteOptions());
    LOGGER.info("Starting transaction: {}", tx);
    TX_LOCAL.set(tx);
     ....
    // code logic
  }

For long time, we should refactor the key encoding logic and get rid of dependence on transactions

jerryshao commented 1 year ago

I suggest doing a whole refactoring work on this store part immediately. Currently the organization of code seems messy, which mixed several things together, like:

  1. entity id generating logics
  2. name and id mapping logics
  3. the logics of kv store, like get, put, update and so on.

The patched fix seems just like a workaround, not a thorough fix. For example:

  1. The management of transactions mixed with thread local variables has many limitations, and may potentially lead to issues if the variable is not cleaned.
  2. The id generation and name-id mapping logics should be put in a transaction, whereas we cannot sure if the code is trapped in the transaction or not (depends on the caller), which is super vulnerable.
  3. The entity store interface is not clearly decoupled with other modules, which makes changes placed everywhere if we want to add a new entity type.
  4. The rocksdb has many configurations like off-heap memory size, currently all we used is default one, we'd better expose some necessary configurations to the users.
jerryshao commented 1 year ago

@yuqi1129 think carefully about the problems I mentioned above, and provide a whole refactoring design when ready. What I want is thorough design doc before any code change.