CAIP-19 for EIP155 AssetID should be more restrictive

ChainAgnostic / namespaces

Chain-Agnostic Namespaces host informative specs and profiles of CAIPs (Chain-Agnostic Improvement Proposals) per blockchain ecosystem

https://namespaces.chainAgnostic.org/

53 stars 64 forks source link

CAIP-19 for EIP155 AssetID should be more restrictive #76

Closed haardikk21 closed 1 year ago

haardikk21 commented 1 year ago

The generic CAIP-19 spec takes a very generous stance to what a tokenId should look like when talking about Asset ID's. Basically anything that is between 1-78 characters long flies (alphanumeric, hyphen, underscore, periods)

The EIP155 namespace spec however is only concerned with ERC-721 - which necessitates a uint256 type for tokenId. It should not allow alphabets, periods, hyphens, underscores, or periods - and simply be a numeric Regex string for 1-78 characters long.

Courtesy of ChatGPT, here's a Regex that seems reasonable:

^(?=.*\d)[\d]{1,78}$

^ asserts the start of the string.
(?=.*\d) is a positive lookahead assertion, ensuring that at least one digit is present in the string.
[\d]{1,78} matches any digit character (0-9) between 1 and 78 times.
$ asserts the end of the string.

If this looks good, happy to make a quick PR to fix.

bumblefudge commented 1 year ago

1.) the spec specifically mentions erc20 and should probably also mention erc1155 and any future EIPs that get to erc status defining a token... that portion needs to be validated and have a future-proof reference to future ERC asset standards

2.) should that be ^(?=.*\d)[\d]{78}$ ? is "0x0" a valid uint256?

bumblefudge commented 1 year ago

jump in here @sneh1999, you should probably get the PR/git-commit credit for identifying the issue!

haardikk21 commented 1 year ago

We can fix it to 78 characters if we expect left-padded token id's for everything.

IMO, If we want to future-proof against any other asset standards that may come up, I think 1-78 is fine and if someone needs a 78 character uint256 they can left-pad it themselves.

Most RPCs, APIs, Indexers etc don't pre-pad for you, we should leave that up to the client

haardikk21 commented 1 year ago

1.) the spec specifically mentions erc20 and should probably also mention erc1155 and any future EIPs that get to erc status defining a token... that portion needs to be validated and have a future-proof reference to future ERC asset standards

2.) should that be ^(?=.*\d)[\d]{78}$ ? is "0x0" a valid uint256?

Oh also re: the spec mentioning erc20 - im specifically talking about the 'AssetID' portion of the spec here. ERC-20 falls under 'Asset Type'

'Asset ID' = 'Asset Type' + Token ID

It is the Asset ID portion of the namespace spec which needs a more restrictive RegEx, the Asset Type is fine as it is

Sneh1999 commented 1 year ago

Thank you @bumblefudge .

@haardikk21 could we potentially simplify your regex to just [0-9]{1,78}?

I don't think we need (?=.*\d) as a positive lookahead assertion because {1,78} guarantees that there is at least one digit.

bumblefudge commented 1 year ago

I should probably have known that about the padding digits!

bumblefudge commented 1 year ago

OK you convinced me on the {1,78} versus {78} thing.

Let's just address the ERC-20 thing then. Note that there regex for TokenID should include the option /1234 at the end (additional info if that particular Asset Type supports it, like serial numbers for 721 or god knows what for erc1155 or future ercs)

haardikk21 commented 1 year ago

here's my proposal:

General CAIP-19 (from the spec):

asset_id: asset_type + "/" + token_id 
asset_type: chain_id + "/" + asset_namespace + ":" + asset_reference
chain_id: Namespace+Blockchain ID as per CAIP-2

asset_namespace: [-a-z0-9]{3,8}
asset_reference: [-.%a-zA-Z0-9]{1,128}
token_id: [-.%a-zA-Z0-9]{1,78}

Proposed EIP-155 CAIP-19:

asset_id: asset_type + "/" + token_id 
asset_type: chain_id + "/" + asset_namespace + ":" + asset_reference
chain_id: Namespace+Blockchain ID as per CAIP-2

+ // Enforce 'erc' at the beginning, followed by 2-5 more alphanumeric chars
asset_namespace: erc[a-z0-9]{2,5}
+ // EIP-155 addresses that start with 0x. The checksum cannot be enforced in RegEx.
asset_reference: 0x[a-fA-F0-9]{40}
+ // Token ID uint256 
token_id: [\d]{1,78}

bumblefudge commented 1 year ago

go team!

bumblefudge commented 1 year ago

PR that regex in and get the commit credit for it :D

Sneh1999 commented 1 year ago

@bumblefudge Feel free to review https://github.com/ChainAgnostic/namespaces/pull/79

bumblefudge commented 1 year ago

go team!