Getting AccountId/ContractId from an evm address is confusing after HIP-583

petreze commented 1 year ago

Problem

After the implementation of HIP-583 and the introduction to evmAddress there are several issues that confuse users and need to be addressed. After this HIP, we now have 2 formats of evm addresses - the so called account-num alias (or long-zero format address) and the evm address alias (references taken from the hip doc).

The issues are as follows:

Before the introduction of the new evm address alias, users were using the single method AccountId/ContractId.fromSolidityAddress(address: string) which basically parses the provided 20 bytes to shard - first 4 bytes, realm - next 8 bytes and num - last 8 bytes and generates a proper AccountId/ContractId. After the introduction of the HIP, some users are trying to pass the new evm address alias to the same function and since this field is generated as in Ethereum (rightmost 20 bytes of the 32 byte Keccak-256 hash of an ECDSA public key), the algorithm for parsing the account-num alias can yield enormous or even negative values for each of shard, realm and num fields. This is why users should use the new method fromEvmAddress(). An example - here the passed address is of evm address alias and respectively if you use the wrong function - fromSolidityAddress() you get wrong result
The second one, following the same example is that if you try getting the AccountId.fromEvmAddress(0, 0, "0xdfc61db3604e254611827ecf0d42e6f4b2e256ac") with the correct function, the result would be AccountId with zeros as shard, realm and number because we cannot extract the num from the passed evm address
```
AccountId {
shard: Long { low: 0, high: 0, unsigned: false },
realm: Long { low: 0, high: 0, unsigned: false },
num: Long { low: 0, high: 0, unsigned: false },
aliasKey: null,
evmAddress: EvmAddress {
_bytes: <Buffer df c6 1d b3 60 4e 25 46 11 82 7e cf 0d 42 e6 f4 b2 e2 56 ac>
},
_checksum: null
}
```
So using this method, we set 0 by default to the num field and pass the shard and realm provided from the method call. Which is confusing because if you try to query the mirror node you will get the actual AccoundId fields, in this case 0.0.1171 as @mgarbs expects

Solution

Proposals for the issues:

Here we thought of 2 approaches and we have to decide which one to implement
- Introduce a 3rd more generic method e.g. fromAddress(), add check whether or not it is a account-num alias and navigate the flow to the proper method based on the format of the passed address
- Or add those checks inside the present methods - fromEvmAddress() and fromSolidityAddress() and again based on the result, pass the flow to the proper method of these two
- 3rd option would be to do both - add the checks to each of those methods, add fromAddress() to handle both evm address formats and lastly to deprecate the present two methods in favour of fromAddress()
Here an approach would be to query somehow the mirror node and get the data we need but the tricky part is that in this part we do not have access to a client or something. This is just parsing functionality where we just try to generate an object from an input. We either need to specify the network to know which mirror node to call and get the info or add a client and to be honest, both approaches wont be a good user experience...

Alternatives

@SimiHunjan @gregscullard @ochikov @Nana-EC

### Tasks

petreze commented 1 year ago

Note: This PR does the second sub point to the first issue! It only adds the checks inside the present methods and navigates the flow based on the input.

Next steps should be to decide whether or not to create the generic method fromAddress() (and if we will deprecate the present ones), since the present methods have different inputs and how do we handle that?
Also the fact that fromEvmAddress() sets a default 0 value for the num field, despite the fact that if you query the mirror node for the same evm address, you will get the actual num value.

mgarbs commented 1 year ago

I am in favor of the one size fits all fromAddress that you talked about as the third option and then depreciation of the other methods. It simplifies everything and just works. Here here.

Nana-EC commented 1 year ago

My thoughts

Add a fromEvmAddress(address) and fromEvmAddress(shard, realm, address) - former assumes 0 shard and realm. Both cases set a null not a 0 num
Deprecate the other 2 (fromEvmAddress() and fromSolidityAddress()) after some time
Explore a future Mirror Node query using api/v1/accounts/{address}. To be clear I'm not saying as part of fromEvmAddress but rather as a MAPI query, similar to HAPI queries.

pathornteng commented 1 year ago

In my opinion,

fromAddress is a bit too generic and might confuse the user. I agree with @Nana-EC, it should be fromEvmAddress and we should deprecate fromSolidityAddress. fromEvmAddress should return 0.0.Number and then 0.0.evmAddress if the first one if not possible
I think we need to have a way for the user to convert EVM address to account/contract id through SDK. Maybe we can implement client.fromEvmAddress so you don't need to worry about specifying the network for now.

SDK talking to the mirror node is becoming unavoidable. The mirror node currently supports smart contract state queries now so I think that in the future the SDK should switch to query smart contract state from the mirror node instead of the consensus node so the users don't need to pay for the transaction fee.

david-bakin-sl commented 1 year ago

Agreeing with above opinions that fromSolidityAddress should be dropped in favor of fromEvmAddress overloads.
Also consider the following code. I assume developers are going to be doing this themselves a lot now? Why don't we do it for them with a single SDK call?

// identifying prefix for account-num alias (aka long-zero)
const LONG_ZERO_PREFIX = '0x000000000000000000000000'
let contractId;
// given a 20 byte string contractAddress of form 0x....
if (contractAddress.startsWith(constants.LONG_ZERO_PREFIX)) {
    // assumes 0 value shard and realm and converts adddress to num
    contractId = ContractId.fromSolidityAddress(contractAddress);
} else {
    // uses hex address instead of num in id
    contractId = ContractId.fromEvmAddress(0, 0, contractAddress);
}

petreze commented 1 year ago

One thing to consider in general is that in JS we do not have method overloading. So this is the reason why we ended up with so many different method signatures. The same thing should be considered here plus the fact that (I assume) we do not want to make any breaking changes, meaning that present methods fromEvmAddress() and fromSolidityAddress() should stay the same and only add handlers there if any and add new method for example fromAddress() and design it in a way that would handle all cases (because we are kind of running out of free method signatures already :smile:)

@david-bakin-sl On your second point -> this check is already added to both methods fromEvmAddress() and fromSolidityAddress() so that even if users pass the wrong format address, the flow will be passed to the correct function

On the topic about getting the actual account num from the mirror node - I would agree that we should enhance and make a query which calls the mirror node and gets the actual num of the AccountId @Nana-EC @pathornteng

petreze commented 1 year ago

if we have generalized method fromAddress(evmAddress, shard = 0, realm = 0) note: input parameters should be reordered if we want for shard and realm to default to 0 because if they are before evmAddress (in JS) they cannot be optional and users should always pass a value for them
- if evmAddress is a long-zero format → pass to fromSolidityAddress() method which parses the bytes to a proper shard, realm, num format for AccountId
- otherwise generate AccountId with shard and realm from the input (if not set - assign 0 as default value) and 0 for num field and evmAddress from the input
  - log a message or add to method description that a 0 will be set for the num field and the user should query the mirror node for the actual num field. We can even throw an exception if a user tries to use the generated AccountId without the proper num value i.e "Account num is null, please query the mirror node and get the actual value"?
design the Mirror node call
- can take evmAddress as a string or AccountId object and query the mirror node with execute(client) so that we can get the network from the client and query the proper mirror node and after getting the result → automatically set the num from the mirror node to the num field of the passed AccountId object. An open question here would be if we add this functionality to the AccountId/ContractId objects as a method or a separate call... IMO an object method would be better

@mgarbs @Nana-EC @pathornteng @david-bakin-sl

david-bakin-sl commented 1 year ago

(w.r.t. reordering the parameters so shard+realm can be defaulted to 0: I suggest they be reversed from the standard definition to be address then realm then shard. It's already the case that the three components are all integers, the order is easily confused, IMO reverse is better than rotate.)

(Although this JS consideration of no overloading yet default-to-zero is painful in this regard: this signature with 3 consecutive longs, where we have to reverse it from the standard. Typing doesn't help so you can't just say it works fine in TypeScript. High possibility for confusion in the future. IDE hints or not. Gosh. Maybe two separate names is better, just not the ones we've got. And we don't even have shards+realms yet. Sigh. Something like fromAddress (1 arg version) and fromFullAddress or fromFullyQualifiedAddress (3 arg version)? (Whatever the name we give officially to the "full" shard+realm+number if we need to distinguish it from the current 0+0+number.) I dunno. This might be something ... requires even more consideration of the user POV over time?)

(Yet another possibility: use shard+realm+number in that order and everyone always has to type ...(0,0,addr) and that's just the way it is. Why not? Everything else we document and put into production (e.g., hashscan) always uniformly puts the 0.0.n version out there, never elides the 0.0. prefix. How hard would it be to just type the 0,0, when you need it? It would sure eliminate this confusion.)

(Proves once again that little-endian is the proper choice for almost everything, networking precedent aside ...)

izik1 commented 1 year ago

Gentle nudge on the conversation: Not all languages can do overloads (Namely Rust in my case, Go too out of the SDKs we have) nor default parameter arguments (again Rust and likely Go), although https://github.com/hashgraph/hedera-sdk-reference/issues/128#issuecomment-1662247234 does seem to at least partially address this.

petreze commented 1 year ago

@pathornteng @Nana-EC @izik1 @david-bakin-sl @mgarbs Continuing the conversation from above, currently, we do have:

handlers in both methods so that the flow can be passed to the correct method
MAPI call that queries the mirror node so that a user can get his actual num field when AccountId/ContractId is generated from an actual evmAddress

Shall we add a generic method fromAddress()? It would be pretty hard to unify this method in languages throughout all SDKs, reasons:

No overloads -> JS, Rust, Go No default/optional parameters -> Rust, Go

In JS we can design it like this: fromAddress(address, shard = 0, realm = 0) which will handle both evm address formats in Hedera..

Or we can leave it like this and just deprecate fromSolidityAddress() since fromEvmAddress() can handle both evm formats now!

Nana-EC commented 1 year ago

From an API design standpoint if we've addressed the issue and it scales I think we're good as is and should just deprecate fromSolidityAddress() since as you said fromEvmAddress() can handle both evm formats now.

pathornteng commented 1 year ago

In my opinion, fromAddress is a bit too generic in term of naming. It would also cause a confusion.

hashgraph / hedera-sdk-reference