Implement Rosetta RPC

frol commented 4 years ago

This is a high-priority request from a partner.

Rosetta is a public API spec defined to be a common denominator for blockchain projects.

As per the discussion (find some pieces below), we are going to build Rosetta RPC alongside JSON RPC into nearcore under a feature flag.

Rosetta Docs

Node API Specification: https://github.com/coinbase/rosetta-specifications
Tooling: https://github.com/coinbase/rosetta-cli
Docs: https://djr6hkgq2tjcs.cloudfront.net/docs/Introduction.html
Celo Implementation: https://github.com/celo-org/rosetta
Oasis Implementation: https://github.com/oasislabs/oasis-core-rosetta-gateway

Verify the API compliance with: rosetta-cli check --server-url=<node>

@frol:

I have reviewed the API they want and it is going to be a bit hacky translating our terms into Rosetta (they use transaction hash to query the information about the transaction while we require transaction hash + account id due to sharding requirements). Illia suggested requiring the node to track all the shards to be able to serve Rosetta requests. With this assumption, we can implement querying transactions just by their hash, though that might create some confusion if we have JSON RPC and Rosetta RPC side-by-side.
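To illustrate the difference between the two lookup styles, here is a minimal sketch. Everything in it is hypothetical: the in-memory shard stores, the `account_to_shard` placeholder mapping, and the function names are illustrative only and do not reflect nearcore's actual storage or shard assignment.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for per-shard storage: tx hash -> transaction record.
ShardStore = Dict[str, dict]


def account_to_shard(account_id: str, num_shards: int) -> int:
    """Placeholder mapping (NOT NEAR's real shard assignment)."""
    return sum(account_id.encode("utf-8")) % num_shards


def find_tx_by_hash_and_account(
    shards: List[ShardStore], tx_hash: str, account_id: str
) -> Optional[dict]:
    """JSON RPC style: the account id tells us which single shard to query."""
    shard_id = account_to_shard(account_id, len(shards))
    return shards[shard_id].get(tx_hash)


def find_tx_by_hash_only(shards: List[ShardStore], tx_hash: str) -> Optional[dict]:
    """Rosetta style: only the hash is known, so every shard must be available."""
    for store in shards:
        if tx_hash in store:
            return store[tx_hash]
    return None
```

The second function only works if the node has data for every shard, which is why tracking all shards comes up as a requirement.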

Their auto-generated stub in Go looks fine for internal use (it seems that the input validation is missing), but I don't have experience with Go to take it over. Thus, I think that implementing Rosetta RPC right into the nearcore is the fastest way to go, and also the most maintainable. I estimate it to be a week-long effort for me.

@ilblackdragon:

Let's go ahead with the requirement of tracking all shards.

We can enable this API with a config flag for now and have it check that all shards are tracked as well, so it's less weird.

This is a priority: a large portion of our financial backers are waiting to use Custody, and this blocks our transition from Phase 1 to Phase 2.
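The config-flag check mentioned above could look roughly like the following sketch. The `NodeConfig` class and its field names (`num_shards`, `tracked_shards`, `rosetta_rpc_enabled`) are assumptions made for illustration, not nearcore's actual configuration schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NodeConfig:
    # Hypothetical config fields, for illustration only.
    num_shards: int
    tracked_shards: List[int] = field(default_factory=list)
    rosetta_rpc_enabled: bool = False


def validate_rosetta_config(config: NodeConfig) -> None:
    """Refuse to start Rosetta RPC unless the node tracks every shard."""
    if not config.rosetta_rpc_enabled:
        return  # Nothing to check when the flag is off.
    missing = sorted(set(range(config.num_shards)) - set(config.tracked_shards))
    if missing:
        raise ValueError(
            f"Rosetta RPC requires tracking all shards; missing: {missing}"
        )
```

Failing at startup rather than at query time keeps the behavior predictable: a node either serves Rosetta requests for every transaction hash or does not serve them at all.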

bowenwang1996 commented 4 years ago

@frol shouldn't it be P0?

frol commented 4 years ago

I left some room for the “on fire” P0 items.

eriktrautman commented 4 years ago

Checking in on this since we scoped it for < 1 week. Are we ready to hand back to Coinbase?

frol commented 4 years ago

Sorry, we are not. I am behind schedule due to the relocation back home from Argentina.

eriktrautman commented 4 years ago

How is this going? We're trying to get a timeline from them but this integration is actually still the blocker.

damons commented 4 years ago

Any update here?

frol commented 4 years ago

The implementation is there, but the Rosetta checker is not happy yet. I am currently resolving the mismatches between my understanding and Rosetta's expectations (working with Patrick from Coinbase on these).

damons commented 4 years ago

Moving to Phase 2. Will re-address when we get there.

frol commented 4 years ago

Q:

In NEAR, a transaction gets included in block X, but it is only applied by block X+1. So when a transaction that creates an account occurs, you cannot query the account at that exact block height yet; it is created after block X, so you have to query block X+1. Rosetta, however, wants to query the account immediately.

A:

Rosetta is very tightly coupled to the principle that “the block where a transaction shows up is where it is executed/applied”. I created an issue to improve our documentation here. We ran into a similar modeling concern with Filecoin (who also has this executed in X + 1 abstraction) and our guidance was to include transactions in the blocks where they were executed/applied, not in the block where it originally showed up (before it was executed).

@bowenwang1996 @nearmax @SkidanovAlex Implementation-wise, it is absolutely possible to attach the actions from the parent block instead of the current one, but I wonder how we communicate this mismatch with Explorer and the rest of the tooling... I can add any data into the metadata attached to the response, so I can have an extra field like included_in_block_height and executed_in_block_height. Does this sound reasonable to you?
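As a rough sketch of the metadata idea, a Rosetta transaction object carries a `transaction_identifier`, a list of `operations`, and a free-form `metadata` object. The helper name, the placeholder hash, and the block heights below are illustrative assumptions; only the two metadata field names come from the proposal above.

```python
import json
from typing import List


def build_rosetta_transaction(
    tx_hash: str,
    operations: List[dict],
    included_in_block_height: int,
    executed_in_block_height: int,
) -> dict:
    """Build a Rosetta-style transaction with the two proposed metadata fields."""
    return {
        "transaction_identifier": {"hash": tx_hash},
        "operations": operations,
        # Free-form metadata: Rosetta does not interpret these fields.
        "metadata": {
            "included_in_block_height": included_in_block_height,
            "executed_in_block_height": executed_in_block_height,
        },
    }


# Placeholder values for illustration only.
tx = build_rosetta_transaction("<tx-hash>", [], 100, 101)
print(json.dumps(tx, indent=2))
```

Because the metadata is free-form, tooling such as Explorer could read these fields without any change to the Rosetta-defined part of the response.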

MaksymZavershynskyi commented 4 years ago

The problem that we have is that a transaction is complete only when all of its receipts have completed, and they are all gradually executed over the span of several blocks and not just in the last block. So I suggest: included_in_block_height, completed_in_block_height. WDYT?

bowenwang1996 commented 4 years ago

@frol

In NEAR, the transaction gets included in the block X, but it is only applied by the block X+1

This is not correct. Transactions included in block X are executed when we apply block X. I see two potential confusions that might lead to your conclusion above:

The fact that transactions are not applied when the chunk is produced does not mean they are not applied when the block in which the transaction is included is applied.
Transactions generate receipts and those receipts are usually not applied in the same block. This is definitely true, but I think this should not stop us from saying that the transaction is executed in block X. I am not familiar with Rosetta. If a transaction is executed in block X, does it mean that all its execution (including receipts) has to finish in block X?

MaksymZavershynskyi commented 4 years ago

@bowenwang1996

This is not correct. Transactions included in block X are executed when we apply block X.

Could you define what is "application of block X", and when according to the protocol it happens?

This is definitely true, but I think this should not stop us from saying that the transaction is executed in block X. I am not familiar with rosetta.

I would use a light client to define when information is available. Information is available when it provably exists (no need for finality, since we can state that certain information provably exists but it was not globally agreed upon yet). Information that the transaction was included is available at block X, and information that this transaction has outcome Z is available at block X+1.

bowenwang1996 commented 4 years ago

Could you define what is "application of block X", and when according to the protocol it happens?

I think it is a bit hard to formally define it. Informally, for a node the application of the block consists mainly of the following two things:

update information of the chain and attest to the validity of the block if the node is a block producer.
perform state transition on the shard(s) that the node is assigned to, if the node is a chunk producer or cares about the shard(s).

From the protocol point of view, it happens when a node receives the block and has all the relevant information to apply it (chunk parts if the node is a block/chunk producer or simply cares about some shard). We can also separate the application of a block from the application of a chunk, but I tend to think that considering them together as one process makes more sense.

frol commented 4 years ago

Well, let's solve the specific problem with Rosetta. Rosetta watches for the transactions and receipts (all merged into a single entity "transaction" for Rosetta); once it observes a transaction, it wants to fetch information about the accounts involved in the transaction and fails to do that when we have a CreateAccount action, which only gets executed with the receipt. So far it seems that our transactions should not expose actions (maybe expose them in free-form metadata).

(BTW, Rosetta seems to want us to split a single TRANSFER action into two "operations": TRANSFER -10N from Alice & TRANSFER +10N to Bob, and also include the fees and rewards expressed in their terms of "operation", but we don't create any transaction for rewards...)
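The split of one transfer into two operations might be sketched as follows. The field layout follows the Rosetta operation object (`operation_identifier`, `type`, `status`, `account`, `amount`); the helper name, the `"SUCCESS"` status string, and the account names are assumptions for illustration. Amounts are expressed in yoctoNEAR (NEAR has 24 decimals) and, as Rosetta expects, encoded as strings.

```python
from typing import List, Optional

YOCTO_PER_NEAR = 10**24
NEAR_CURRENCY = {"symbol": "NEAR", "decimals": 24}


def transfer_operations(
    sender: str, receiver: str, amount_yocto: int, status: str = "SUCCESS"
) -> List[dict]:
    """Express one transfer as a debit operation and a related credit operation."""

    def op(index: int, address: str, value: int, related: Optional[int] = None) -> dict:
        operation = {
            "operation_identifier": {"index": index},
            "type": "TRANSFER",
            "status": status,  # Status names are network-defined; assumed here.
            "account": {"address": address},
            "amount": {"value": str(value), "currency": NEAR_CURRENCY},
        }
        if related is not None:
            # Link the credit back to the debit it belongs to.
            operation["related_operations"] = [{"index": related}]
        return operation

    return [
        op(0, sender, -amount_yocto),
        op(1, receiver, amount_yocto, related=0),
    ]


ops = transfer_operations("alice.near", "bob.near", 10 * YOCTO_PER_NEAR)
```

Fees and rewards would need further operations of their own; they are left out of this sketch.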

bowenwang1996 commented 4 years ago

So far it seems that our transactions should not expose actions (maybe expose them in free-form metadata).

Why does it matter? Is there also a concept of action in Rosetta?

frol commented 4 years ago

Small update to bring up some knowledge about Rosetta RPC core focus:

Rosetta RPC was mainly designed to expose "balance-changing" events on blockchains. Thus, @evgenykuzyakov provided us with a complete list of balance-changing events in NEAR protocol:

Genesis balance
Signed transaction (for signer_id)
- one charge for the total of gas, attached_gas (prepaid), and attached deposits
Receipt (for receiver_id)
- reward for contract execution (on function calls)
- receiving attached deposit
- stake action (amount -> locked_amount)
- function call makes transfer (decreases amount)
- an action changes storage (some amount for storage is locked/unlocked)
- deleted account (locked storage + unlocked balance)
Validator update
- kickout event or decreased stake: locked_amount -> amount
- reward: increases locked_amount

evgenykuzyakov commented 4 years ago

Links to Rosetta docs relevant to account balances:

I think we can distinguish 3 types of balances:

liquid - the balance on account (no sub-account).
locked - the amount locked for staking. (sub-account # 1)
liquid_for_storage - the amount of non-locked balance used to cover the remaining storage balance (sub-account # 2). Let’s say you need 30 NEAR for storage, and the locked balance is 20 NEAR. It means you ONLY need 10 NEAR to cover the remaining balance for storage. So liquid_for_storage in this case will be 10 NEAR.

The Rosetta model with balance checks and sub-accounts complicates our implementation because you need to compute storage_balance before you can compute liquid_for_storage. This is fine, but now every action that modifies any account value can trigger a sub-account operation change.
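The arithmetic in the example can be written down as a small sketch. The function name and the use of whole NEAR units are simplifications for illustration; the formula is inferred from the 30 NEAR / 20 NEAR example above and is an assumption about the intended definition rather than the actual implementation.

```python
def liquid_for_storage(storage_balance: int, locked: int) -> int:
    """Non-locked balance still needed to cover storage.

    The locked (staked) amount counts toward the storage requirement first;
    only the remainder, if any, must come from the non-locked balance.
    """
    return max(0, storage_balance - locked)


# The example from the comment: 30 NEAR needed for storage, 20 NEAR locked.
remaining = liquid_for_storage(storage_balance=30, locked=20)
print(remaining)  # 10
```

Since `storage_balance` is an input here, any action that changes storage usage or the locked amount changes this value too, which is the source of the extra sub-account operations.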

frol commented 4 years ago

Rosetta model with balance checks and sub-accounts complicates our implementation because you need to compute storage_balance before you can compute liquid_for_storage. This is fine, but now every action that modifies any account value can trigger sub-account operation change.

I don't see any workaround unless we give up and hide the storage_balance. Otherwise, it will overwhelm users with the number of transfers... :thinking:

UPD: Well, we still have transfers with every single transaction to pay for the gas usage, so it is not that much worse :man_shrugging:

MaksymZavershynskyi commented 4 years ago

Removing from Phase 2, since it is not related.

frol commented 4 years ago

Construction API is not implemented yet.

near / nearcore

Implement Rosetta RPC #2738