esaulpaugh / headlong

High-performance Contract ABI and RLP for Ethereum
Apache License 2.0
79 stars 20 forks source link

Keep internal type from ABI files #56

Closed xaviarias closed 1 year ago

xaviarias commented 1 year ago

I'm working in a code generation project and would like to generate the data structures for tuples instead of using Tuple.

Would it be possible to parse and keep the internalType JSON field in the ABI objects?

Thanks

esaulpaugh commented 1 year ago

Do you mean can headlong be updated to retain the internalType information after parsing the JSON?

I'm hesitant to implement anything that isn't specified in the ABI specification itself, but this might be a reasonable exception.

Are you just looking for a getInternalType method on ABIType? In v7.0.0 https://github.com/esaulpaugh/headlong/releases/tag/v7.0.0 I removed the name field from ABIType. This removal allows headlong to return the same immutable ABIType instance for all calls to TypeFactory.create (for most base types at least). Meaning, it eliminated the need to create a new object every time to accommodate different names. This had a positive impact on performance and somewhat simplified the code.

If I were to add an internalType field to ABIType, it would undo this desirable property.

What I did with name, though, was store the type names in TupleType (which can't be represented by a single instance anyway). I added a method TupleType#getElementName(int) which would accept the index of the element type and return that type's name. Would a similar method TupleType#getElementInternalType(int) work for your use case?

xaviarias commented 1 year ago

Thanks @esaulpaugh for your reply.

Do you mean can headlong be updated to retain the internalType information after parsing the JSON?

That was the idea, so it's possible to generate a data class for each struct.

This had a positive impact on performance and somewhat simplified the code.

This makes total sense, but for my use case performance is not much of a concern.

Would a similar method TupleType#getElementInternalType(int) work for your use case?

Yes that would perfectly work for my use case. I was thinking this method could return either a simple string, but better a constructed ABIType since the internal type can be the same as TupleType#get(int), but also a struct or a struct array, eg. struct MyContract.MyStruct[].

Roughly I was thinking a specific type for structs, something like StructType<Tuple> extending TupleType that also contained a name and namespace, but I'm not sure about how namespaces are managed by the Solidity compiler. So ideally internalType could be also parsed so name and namespace are already separated.

The case where the struct is inside an array would fit in your model for ArrayType<StructType<Tuple>>.

Would that make sense for you?

esaulpaugh commented 1 year ago

If the method were to return an ABIType object, for something like struct BunchaStructs.A[], which I think would be of type Array<StructType<Tuple>>, it's not totally obvious what I would do about the canonicalType field.

I suppose the canonicalType for the StructType would be struct BunchaStructs.A. In which case the ArrayType's canonicalType should probably be struct BunchaStructs.A[].

The sketchy thing about that is that canonicalType would then be able to contain arbitrary user-defined strings. Would it be limited in length? Would it be limited to ASCII? ABITypes are normally comparable by canonicalType but here two types could be otherwise identical and not be considered equal because of different names.

It may be more sane to define the canonicalType for the ArrayType as simply struct[] or more properly in this case struct(uint256)[] and keep the arbitrary text confined to a dedicated field in StructType.

Do you think this proposal could account for all cases? What do you think about it?

Although vyper and other languages might start using internalType differently. Without rules for internalType being defined in the ABI spec itself, there is no guarantee that an incompatibility won't crop up. And there would probably not be a way to reliably differentiate JSON produced by Solidity from that produced by vyper or some other language. It might be best to simply store the information as a String and leave parsing up to the user. At least until internalType becomes more well-defined and widely adopted.

For reference (taken from abi-to-sol example BunchaStructs):

{
    "anonymous": false,
    "inputs": [
      {
        "components": [
          {
            "internalType": "uint256",
            "name": "a",
            "type": "uint256"
          }
        ],
        "indexed": false,
        "internalType": "struct BunchaStructs.A[]",
        "name": "a",
        "type": "tuple[]"
      },
      {
        "components": [
          {
            "components": [
              {
                "components": [
                  {
                    "components": [
                      {
                        "internalType": "address",
                        "name": "e",
                        "type": "address"
                      }
                    ],
                    "internalType": "struct Other.E",
                    "name": "e",
                    "type": "tuple"
                  },
                  {
                    "internalType": "string",
                    "name": "d",
                    "type": "string"
                  }
                ],
                "internalType": "struct C[]",
                "name": "c",
                "type": "tuple[]"
              }
            ],
            "internalType": "struct B",
            "name": "b",
            "type": "tuple"
          },
          {
            "components": [
              {
                "internalType": "address",
                "name": "e",
                "type": "address"
              }
            ],
            "internalType": "struct Other.E",
            "name": "e",
            "type": "tuple"
          }
        ],
        "indexed": false,
        "internalType": "struct Other.D",
        "name": "d",
        "type": "tuple"
      }
    ],
    "name": "Event",
    "type": "event"
}
esaulpaugh commented 1 year ago

I have put code related to this on a new branch https://github.com/esaulpaugh/headlong/tree/internal-type

xaviarias commented 1 year ago

The sketchy thing about that is that canonicalType would then be able to contain arbitrary user-defined strings. Would it be limited in length? Would it be limited to ASCII? ABITypes are normally comparable by canonicalType but here two types could be otherwise identical and not be considered equal because of different names.

That makes totally sense as the struct's name is not relevant in this context. I think having it shadowed under eg. struct(uint256)[] in canonicalType doesn't pose any issue and seems the good way to go, as long as the name is kept somewhere.

Do you think this proposal could account for all cases? What do you think about it?

Yes it seems reasonable, but also having compatibility in mind would be just ok to keep internalType as a string. But IMO I don't think at this point incompatibilities will crop as ABI encoder v2 has been around for a while, so I'd go for a more typed solution but it's up to you :)

Thanks for the code changes, are you planning to merge and make a release of the branch soon?

esaulpaugh commented 1 year ago

I think in ten days or so I will be able to make another release.

In the typed solution, if the internalType is contract ENS, that would seem to imply the existence of a ContractType. Which would ordinarily be declared like public final class ContractType extends ABIType<Contract>. But there is no Contract object. And if there were, what would it mean? Would it instead be correct to declare public final class ContractType extends ABIType<Address>? Or rather public final class ContractType extends AddressType.

Although, instead of a new ContractType, AddressType could work if a field were added to it to hold the name of the contract/interface.

Are there any additional internal types I'm forgetting?

xaviarias commented 1 year ago

I think in ten days or so I will be able to make another release.

👍🏼 amazing thank you !

In the typed solution, if the internalType is contract ENS, that would seem to imply the existence of a ContractType. Which would ordinarily be declared like public final class ContractType extends ABIType<Contract>. But there is no Contract object. And if there were, what would it mean? Would it instead be correct to declare public final class ContractType extends ABIType<Address>? Or rather public final class ContractType extends AddressType.

As per the ABI spec, it seems that contract is not a valid ABI type, are you thinking if obtaining it from somewhere else?

In that case maybe public final class ContractType extends UnitType<Address> would make it consistent with the class hierarchy? It also would be nice to keep AddressType final as there wouldn't be much code in a contract subclass.

Although, instead of a new ContractType, AddressType could work if a field were added to it to hold the name of the contract/interface.

Also true, so up to you for the design decisions.

Are there any additional internal types I'm forgetting?

I believe we're all good with those !

esaulpaugh commented 1 year ago

I just noticed something with the default example ABI JSON in abi-to-sol. The internal type is struct Magic but the type is tuple[]. It can't possibly be an array and a struct/tuple at the same time. Do you know if this is a mistake or what is going on here? If it were a tuple[] I would expect something like struct Magic[].

{
    "type": "event",
    "name": "Magic",
    "inputs": [
      {
        "internalType": "struct Magic",
        "name": "magic",
        "type": "tuple[]",
        "indexed": false,
        "components": [
          {
            "name": "magic",
            "type": "string"
          }
        ]
      }
    ],
    "anonymous": false
}

Is there any official documentation for internalType?

EDIT: and then it seems like user-defined value types are another can of worms that needs addressing.

xaviarias commented 1 year ago

I just noticed something with the default example ABI JSON in abi-to-sol. The internal type is struct Magic but the type is tuple[]. It can't possibly be an array and a struct/tuple at the same time. Do you know if this is a mistake or what is going on here? If it were a tuple[] I would expect something like struct Magic[].

Is it something we should be worried about or could it be an bug in the Solidity generator? Doesn't make any sense to me, and while trying to compile the generated Solidity, doesn't produce any ABI file. Would't surprise me if there's not a direct 1-to-1 conversion from ABI to sol.

Is there any official documentation for internalType?

Not apart from the one I mentioned before.

EDIT: and then it seems like user-defined value types are another can of worms that needs addressing.

What do you mean by user defined types, structs?

esaulpaugh commented 1 year ago

I mean the type aliasing feature in Solidity. UDVTs example (also taken from abi-to-sol):

{
  "inputs": [
    {
      "internalType": "Int",
      "name": "i",
      "type": "int256"
    },
    {
      "internalType": "UDVTs.Uint",
      "name": "u",
      "type": "uint256"
    },
    {
      "internalType": "Other.Bool",
      "name": "b",
      "type": "bool"
    }
  ],
  "name": "fnoo",
  "outputs": [],
  "stateMutability": "pure",
  "type": "function"
}

(set Solidity version to ^0.8.20)

xaviarias commented 1 year ago

Indeed this should also be addressed, and seems to play in favor of keeping internalType as a simple string. WDYT?

esaulpaugh commented 1 year ago

A String makes the most sense to me right now. Anything beyond that will require more careful consideration which I will have more time for after this release.

esaulpaugh commented 1 year ago

New release:

SHA-256 (headlong-9.3.0.jar): e57756aa1c9a00d1805661f69599f62f29685a80fdcd62112ed78c41ee935f51

Thanks for the suggestion. Let me know if you have any more!