MinaFoundation / mina

Not the official repo. See https://github.com/MinaProtocol/mina
Apache License 2.0

Implement API to retrieve blockchain namespace #39

Open kantp opened 1 year ago

halsaphi commented 1 year ago

Extend the existing GraphQL API to retrieve a blockchain identifier.

The GraphQL API should provide the ability to request and return the network_identifier.network value used in src/app/rosetta/lib/construction.ml. The blockchain identifier is the value stored in network_identifier.network (e.g. mainnet, testnet, etc.).

halsaphi commented 1 year ago

Additional information:

The blockchain identifier can be used as part of a general blockchain identification scheme or CASA namespace [a concatenation of mina:]. This API supports the resolution method for verifying a CASA namespace.

CASA Namespace proposal - https://docs.google.com/document/d/1PJm1pt2pj0dA6JYGP2q6XwpcbxPfow0_x2Qjp__Bwgg/edit?usp=sharing

PR showing - network_identifier.network - https://github.com/MinaProtocol/mina/pull/13371 - basis for the blockchain identifier.

kantp commented 1 year ago

Let's start this with an investigation of how much work this will be, before starting the actual work, so we can properly prioritise.

Sventimir commented 1 year ago

At the moment the Rosetta server obtains the network_identifier.network field from an environment variable, so the daemon knows nothing about it. There's no way to pull that information from Rosetta into the daemon. What we need to do instead is move the information to the runtime config of the daemon. From there GraphQL will be able to return it, and thus Rosetta will be able to obtain it via GraphQL, discarding the env variable.

While I'm not sure if we can alter the contents of the config file for mainnet just like that, we certainly can add that information during the hard fork. We could also add a CLI argument as an alternative way to pass that identifier to the node. So we need to:

If we add this information to the post-hard-fork config only, though, we'll need a way to inject the name into the generated genesis ledger config. We can also default the value to mainnet or something similar, in order to avoid making the field required.
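To make the options above concrete, here is a minimal sketch of how the identifier could be resolved from the sources mentioned (CLI argument, runtime config, default). It is not daemon code: the `network_id` config key, the function name, and the precedence of the CLI argument over the config are all assumptions made for illustration.

```python
from typing import Mapping, Optional

# Assumed default, following the "default value being mainnet" idea above.
DEFAULT_NETWORK_ID = "mainnet"


def resolve_network_id(cli_arg: Optional[str],
                       runtime_config: Mapping[str, object]) -> str:
    """Pick the network identifier: CLI argument, then config, then default.

    The precedence order is an assumption; the thread only says the CLI
    argument would be an alternative way to pass the identifier.
    """
    if cli_arg:
        return cli_arg
    # "network_id" is a hypothetical field name in the runtime config.
    configured = runtime_config.get("network_id")
    if isinstance(configured, str) and configured:
        return configured
    return DEFAULT_NETWORK_ID
```

With a default in place, existing config files that lack the field would keep working, which is the point of not making the field required.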

So we need to answer the following questions:

In any case I think this should be a relatively easy thing to do (although one requiring changes in a lot of places).

Sventimir commented 1 year ago

Another question (might be silly): how bad would it be if different nodes in the same network returned different values in response to this query? Do we need to make sure at the protocol level that nodes returning a wrong value are kicked out of the network?

vfrsilva commented 1 year ago

Hi team, storing the environment in the config creates a risk of the value being missing or misconfigured.

What if we have a translation table from chainID to the human-readable string? The requirement is that the CASA namespace is available in a GQL query.
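A translation table of this kind could be as small as the sketch below. The chain IDs are placeholders, not real Mina chain IDs, and the function name is hypothetical; an unknown chain ID returns `None` rather than a guessed name.

```python
from typing import Optional

# Placeholder keys for illustration only; these are NOT real chain IDs.
CHAIN_ID_TO_NAME = {
    "chain-id-aaaa": "mainnet",
    "chain-id-bbbb": "testnet",
}


def network_name_for(chain_id: str) -> Optional[str]:
    """Return the human-readable name for a chain ID, or None if unknown."""
    return CHAIN_ID_TO_NAME.get(chain_id)
```

Every new network needs a new entry, so where this table lives (inside the node or outside it) determines whether adding a network means redeploying the daemon.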

Could you please estimate how long it will take to implement such a solution?

Thank you

halsaphi commented 1 year ago

How is network_identifier.network currently set?

This should not require new config; the GraphQL API should only return the network_identifier.network value from a running node.

The extension should be in the GraphQL API for the node (Rosetta may be able to use this, but it is not the focus of this PR).

Re the question "how bad it would be if different nodes in the same network returned different values in response to this query?" - how would this be possible? It seems to me that it shouldn't be possible - the node is part of a network and this value identifies the network it belongs to.

Sventimir commented 1 year ago

Currently there is no network identifier in the daemon whatsoever. It's purely a Rosetta thing, and it's defined for Rosetta through an environment variable. The daemon does not know about it and cannot access it at the moment. At the daemon level there's only the chain id, which identifies a network in a secure way, because it's tied to the protocol. Any other, human-readable identifier will be arbitrary, not derived from the protocol, and therefore susceptible to tampering.

A solution to this would be to create a mapping from chain ids to human-readable identifiers, but this shouldn't be a part of the node. One does not want to recompile and redeploy the daemon whenever a new network is started. This mapping should live outside the node, in some web service or similar.

Alternatively, we can put the network id in the daemon's configuration. Admittedly, I'm not entirely sure how the chain id is computed. I think it is a hash of the network configuration, or at least of some relevant constants. Maybe if we put the network identifier in the config, it'll automatically affect the chain id, in which case we can include it at the time of the hard fork (no sooner, no later). If it doesn't affect the chain id, though, any operator can put their own thing in there, which may or may not be acceptable, hence my question.
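To illustrate the "affects the chain id" case: if the identifier is one of the inputs to the hash, changing it changes the chain id, so a node with a tampered name would no longer be on the same chain id. The sketch below is a toy model only. It uses SHA-256 over canonical JSON, which is an assumption for illustration and not how the daemon computes its chain id; the config keys are placeholders.

```python
import hashlib
import json


def toy_chain_id(config: dict) -> str:
    """Hash a config deterministically (toy stand-in for a real chain id)."""
    # Sorted keys and fixed separators make the serialisation canonical,
    # so the same config always yields the same digest.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Placeholder constants; not real network parameters.
base = {"constant_a": 1, "constant_b": "placeholder"}

honest = toy_chain_id({**base, "network_id": "mainnet"})
tampered = toy_chain_id({**base, "network_id": "my-own-name"})

# The two digests differ because the identifier is part of the hashed input.
print(honest != tampered)
```

If the identifier were left out of the hashed input, both configs would produce the same digest and the name would be unverifiable, which is the case the question above is about.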

Sventimir commented 1 year ago

For the record: at the moment it's completely fine if two separate Rosetta instances return different network identifiers, even if they're connected to the same node and archive. The value is just taken from the environment variable without any verification. So each Rosetta operator can put their own thing in there and no one will notice.
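Noticing such a mismatch from the outside would take something like the sketch below: collect (chain id, network identifier) pairs from several instances and flag any chain id reported under more than one name. How the pairs are collected is left out, and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def conflicting_identifiers(
    reports: Iterable[Tuple[str, str]]
) -> Dict[str, Set[str]]:
    """Return chain IDs that were reported with more than one identifier.

    Each report is a (chain_id, network_identifier) pair from one instance.
    """
    names_by_chain: Dict[str, Set[str]] = defaultdict(set)
    for chain_id, name in reports:
        names_by_chain[chain_id].add(name)
    return {cid: names for cid, names in names_by_chain.items()
            if len(names) > 1}
```

An empty result means every instance on a given chain id agreed on the name; it says nothing about whether that name is the intended one.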

Sventimir commented 1 year ago

It turns out this issue is related to #52, where we also need a mechanism for synchronising the behaviour of Rosetta and the daemon. The same network identifier configuration field can be used for this. This strengthens my conviction that the network identifier should be defined by the configuration file for the network.

kantp commented 1 year ago

Testing this is currently blocked on #52