planetarium / libplanet

Blockchain in C#/.NET for on-chain, decentralized gaming
https://docs.libplanet.io/
GNU Lesser General Public License v2.1
512 stars 144 forks source link

Fix a `ChainIdNotFoundException` when transaction was never run #2047

Closed Atralupus closed 2 years ago

Atralupus commented 2 years ago

https://github.com/planetarium/libplanet/blob/main/Libplanet/Blockchain/BlockChain.cs#L1192 If it ran PreloadAsync after creating a clean blockchain where no transactions have been created, ChainIdNotFoundException occurs. When no transactions are created, I think need to skip

greymistcube commented 2 years ago

There are mainly two places where ChainIdNotFoundException is thrown.

When BlockChain<T>.Fork() is called, it calls on both IStore.ForkBlockIndexes() and IStore.ForkTxNonces(), where in which ChainIdNotFoundExceptions gets thrown when a collection mapped to BlockChain<T>'s Id cannot be found.

It just so happens coincidentally, by design, as BlockChain<T> requires a genesis Block<T>, IStore.ForkBlockIndexes() will never fail as at least one Block<T>, namely the genesis, will get added, thus creating if necessary, to a collection on the fly. On the other hand, if an instantiated BlockChain<T> does not have any Transaction<T>, IStore.ForkTxNonces() will fail since such collection never gets created.

I would say a proper way to resolve this would be to create necessary empty collections in anticipation of data getting added later, rather than creating them on the fly.

P.S. I don't think this is a good first issue. 😶

Atralupus commented 2 years ago

There are mainly two places where ChainIdNotFoundException is thrown.

When BlockChain<T>.Fork() is called, it calls on both IStore.ForkBlockIndexes() and IStore.ForkTxNonces(), where in which ChainIdNotFoundExceptions gets thrown when a collection mapped to BlockChain<T>'s Id cannot be found.

It just so happens coincidentally, by design, as BlockChain<T> requires a genesis Block<T>, IStore.ForkBlockIndexes() will never fail as at least one Block<T>, namely the genesis, will get added, thus creating if necessary, to a collection on the fly. On the other hand, if an instantiated BlockChain<T> does not have any Transaction<T>, IStore.ForkTxNonces() will fail since such collection never gets created.

I would say a proper way to resolve this would be to create necessary empty collections in anticipation of data getting added later, rather than creating them on the fly.

P.S. I don't think this is a good first issue. 😶

Thank you for your explanation and I've removed the GFI tag 👋

greymistcube commented 2 years ago

See #2126 for more context.

greymistcube commented 2 years ago

https://github.com/planetarium/libplanet/blob/eb50da96b9d85b4d070dd5a0b3754ba1c39a0e22/Libplanet.Tests/Blockchain/BlockChainTest.cs#L62

Seems like this issue runs deeper and is more critical. Trying to use DefaultStoreFixture or RocksDBStoreFixture instead of MemoryStoreFixture results in quite a number of tests breaking. 😶

tkiapril commented 2 years ago

I had a look at the issue, running BlockChainTest with both DefaultStoreFixture and RocksDBFixture, and apparently the base cause of the issue is not internal to how libplanet handles IStore but related to each of the implementations.

Focusing on the ForkTxNonces part of the issue:

For DefaultStore, the cause of the issue is that LiteCollection<BsonDocument>.Exists, the mechanism behind determining if the chainid exists, is true only if the query returns anything at all. Due to this, if there exists no entry at all, it would always return false. Also, apparently LiteDatabase.GetCollection already creates a new empty collection according to the documentation, so creating a dummy entry and removing it wouldn't really help the situation.

Similarly for RocksDBStore, the implementation looks for a key starting with TxNonceKey(sourceChainId) prefix. If there exists no key with the given prefix, the implementation will consider the chain id nonexistent. Thus, with the current implementation, there must exist a Tx for the chain id to be considered existent.

In both cases, we need another mechanism to determine if the chain id exists. One thought is to iterate through the block hashes to see if anything exists. There must exist a genesis block for the chain to ever exist, so this might be a viable reasoning. The other is to have a collection or entries of existing chain ids, but this might be a burden to migrate since the IStore specification must be amended.

For ForkBlockIndexes, I think it wouldn't be a problem since it wouldn't make sense semantically to have a chain id (which means the chain exists) with no blocks associated with, as there must be a block for the chain to exist at all. For this reasoning, I don't think a fix is required.

Can someone who have had a touch on this issue give me an insight about how I should move along, both for the mechanism of checking chain existence, and whether if we should worry about ForkBlockIndexes? @Atralupus @greymistcube

greymistcube commented 2 years ago

Eh, I haven't really dug deeper into implementations. From what I gather, both implementations do not seem to distinguish between an empty collection and non-existent collection. That is, in a way, non-existent collection is treated as an empty collection.

In any case, both IStore implementations for ForkTxNonces() are faulty as they stand now. TxNonceCollection collection not existing (or empty, whatever you want to call it) is not a sufficient condition to raise an Exception, as clearly there are allowed database states where TxNonceCollection is allowed to be non-existent.

Unlike some other collections, TxNonceCollection is logically derived from other entries, so we only need to make sure that its state is consistent with Transaction<T>s in the database. As you have already mentioned, actually checking its consistency against the source BlockChain<T> data is computationally expensive. I'd say just get rid of the sanity check altogether, as the logic itself is invalid[^1]. I'd ask @dahlia for additional comments on this point. 🙂

As for ForBlockIndexes(), it could be personal preference thing, but I would say this is just another point of logical coupling. In my opinion, IStore shouldn't really care about what valid BlockChain<T> entities should look like unless it is necessary. It just so happens the current implementation does not allow handling of empty BlockChain<T>s. That is, I see this as a case of the implementation details of BlockChain<T> leaking into the implementations of IStore. I'd say a user should be able to fork an "empty" chainId if he/she wants to without any adverse effect, even though it would never get used. 😶

[^1]: We can safely do away with sanity checks if we are confident enough that IStore is always consistent, but I don't think its API is completely free of state-consistency-breaking methods. 🙄

tkiapril commented 2 years ago

So would you say that we can get away with removing ChainIdNotFoundException altogether (including in MemoryStore), as TxNonceCollection is a derivation of the BlockChain state, and a nonexistent chain (which might be analogous with a chain without any block) might also be "forked"? (should we even consider a nonexistent chain analogous with an empty chain?)

One logical concern I can think of is that ForkBlockIndexes() asks for a branchpoint, which cannot exist without a block. What should happen in such a case where you fork an empty chain? Would it be to throw an error if sourceChain is empty and branchpoint is not null, and do nothing if branchpoint is null?

dahlia commented 2 years ago

If we're willing to treat IStore as a mere low-level primitive IMHO neither ForkBlockIndexes() nor ForkTxNonces() have to check if a source chain exists. Instead, we could add a dedicated method like HasChainId(), and make caller responsible to check its existence before forking a chain.

greymistcube commented 2 years ago

One logical concern I can think of is that ForkBlockIndexes() asks for a branchpoint, which cannot exist without a block. What should happen in such a case where you fork an empty chain?

Well, we can already pass in a BlockHash as branchpoint that is not in the chain, can we not?