[move-stdlib] add struct decomposition via new `struct_tag` module

bytedeveloperr commented 1 year ago

Motivation

This adds native decomposition (introspection) of struct types to Move.

While working with Move, we've found the need to pull out pieces of a struct's type (such as the package-id / account-address where the declaring module is published). We've built such solutions using std::type_name, but this involves parsing ASCII strings, which is costly from a gas perspective.

To make this process easier, we're introducing a new native module called struct_tag. It defines a StructTag type that holds all of the components of a struct's type (package-id, module-name, struct-name, and generics), already broken down into easy-to-use pieces.

For primitive types, such as vector, attempting to get a StructTag will abort, because primitive types do not have the same structure as struct types. We thought this solution was better than creating a struct tag with most of its fields empty. We're open to revising this decision, however.

    struct StructTag has copy, store, drop {
        address_: address,
        module_name: String,
        struct_name: String,
        generics: vector<String>
    }

Also, for more information about this please see https://github.com/move-language/move/issues/966
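To make the decomposition described above concrete, here is a minimal sketch in Rust (not Move) of splitting a type-name string into the four StructTag components. It is an illustration only, not the native implementation in this PR. It assumes the string has the shape `address::module::Struct<T1, T2>`; the exact format produced by std::type_name may differ. The names `parse_struct_tag` and `split_top_level` are hypothetical, and returning `None` stands in for the abort on primitive types.

```rust
/// Rust model of the proposed StructTag (the Move field is named `address_`).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StructTag {
    pub address: String,
    pub module_name: String,
    pub struct_name: String,
    pub generics: Vec<String>,
}

/// Split `s` on commas that are not nested inside `<...>`.
fn split_top_level(s: &str) -> Vec<String> {
    let mut parts = Vec::new();
    let mut depth = 0usize;
    let mut start = 0usize;
    for (i, c) in s.char_indices() {
        match c {
            '<' => depth += 1,
            '>' => depth = depth.saturating_sub(1),
            ',' if depth == 0 => {
                parts.push(s[start..i].trim().to_string());
                start = i + 1;
            }
            _ => {}
        }
    }
    let last = s[start..].trim();
    if !last.is_empty() {
        parts.push(last.to_string());
    }
    parts
}

/// Hypothetical helper: decompose a type-name string.
/// Returns `None` for anything that is not `address::module::Struct[<...>]`,
/// which models the abort on primitive types such as `u64` or `vector<u8>`.
pub fn parse_struct_tag(type_name: &str) -> Option<StructTag> {
    let type_name = type_name.trim();
    // Separate the `address::module::Struct` head from the generic arguments.
    let (head, generics) = match type_name.find('<') {
        Some(open) => {
            let inner = type_name[open + 1..].strip_suffix('>')?;
            (&type_name[..open], split_top_level(inner))
        }
        None => (type_name, Vec::new()),
    };
    let mut pieces = head.split("::");
    let address = pieces.next()?;
    let module_name = pieces.next()?;
    let struct_name = pieces.next()?;
    if pieces.next().is_some()
        || address.is_empty()
        || module_name.is_empty()
        || struct_name.is_empty()
    {
        return None;
    }
    Some(StructTag {
        address: address.to_string(),
        module_name: module_name.to_string(),
        struct_name: struct_name.to_string(),
        generics,
    })
}
```

Under these assumptions, `parse_struct_tag("0x2::coin::Coin<0x2::sui::SUI>")` yields a tag with one generic, while `parse_struct_tag("vector<u8>")` yields `None`. Doing this string scan in Move itself is the gas cost a native StructTag would avoid.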

Have you read the Contributing Guidelines on pull requests?

Yes

Test Plan

After switching the current directory to move/language/move-stdlib, run:

cargo run -p df-cli -- test

PaulFidika commented 1 year ago

Note that for structtag.address, we could also go with other names, like 'addr', 'package', or 'account'. Aptos and Sui both have different naming conventions (Aptos calls them account-addresses whereas Sui calls them package-ids), and we wanted to be inclusive of both.

Additionally for generics, we went with vector<String>, but we could go with vector<TypeName> if people like that better?

Suficio commented 1 year ago

We've been using this pattern to authorize getting a mutable reference to an object from a type-erased container, which should only be allowed in the context of the module that defines said object. This PR would be very helpful for achieving that.

Here is a link to our documentation which should put it in context: https://origin-byte.github.io/nft.html#borrow_domain_mut
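As a rough illustration of the kind of check this pattern needs, the sketch below (in Rust, not Move) compares the declaring address and module of two type-name strings. It is not taken from the linked documentation. The function names and the `address::module::Struct` string shape are assumptions made for this example.

```rust
/// Hypothetical helper: extract `(address, module)` from a type-name string
/// of the assumed shape `address::module::Struct[<...>]`.
fn module_path(type_name: &str) -> Option<(&str, &str)> {
    // Ignore generic arguments; only the head identifies the declaring module.
    let head = type_name.split('<').next()?;
    let mut pieces = head.split("::");
    let address = pieces.next()?;
    let module = pieces.next()?;
    pieces.next()?; // a struct name must be present, otherwise it is a primitive
    Some((address, module))
}

/// True only when both types are structs declared in the same module
/// at the same address. Primitive types never match.
pub fn declared_in_same_module(witness: &str, object: &str) -> bool {
    match (module_path(witness), module_path(object)) {
        (Some(a), Some(b)) => a == b,
        _ => false,
    }
}
```

With a native StructTag, the same comparison would reduce to comparing the address and module-name fields directly, with no string scanning.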

sblackshear commented 1 year ago

Apologies for being slow in replying to this. I left some comments that I'll reproduce here for convenience.

Quick thoughts:

We did think about something similar to this when building type_name, but wanted to bias toward reflection APIs that encourage simple, easy-to-reason-about usage (e.g., comparison of type names). We also wanted a representation that would work for both structs and primitive types; we felt a bit uncomfortable with patterns like introducing "dummy struct tags" for primitive types.

Clearly, the existence of type_name::into_string makes it possible to go deeper down the reflection rabbit hole, even if it's (somewhat intentionally) inconvenient.

I'm not necessarily opposed to something like an in-Move representation of StructTag, but I'd like to understand what sort of operations you'd like to do on a StructTag. If there are only a few, I'd like to see whether we can implement those more directly via native functions on TypeName. If that's possible, I think it will make it much easier for programmers and static analysis tools to reason about our reflection APIs.

@PaulFidika replied to the third point with some specific use-cases that make a lot of sense to support, but I think we can do so with a reflection API that is slightly less powerful, or at least uses the big hammer of converting types into strings only when it's needed. Keeping the artifacts we reflect over as opaque as possible will make life easier for static reasoning tools like the Prover--when we go into string-land, things get a lot trickier for such tools.

The two use-cases Paul suggested (also in the thread above) are:

Getting the package name for a type
Deriving a unique, fixed-length identifier for a type (e.g., by hashing it and converting the result into an address)

What do you think about adding functions specifically for these generic/helpful use-cases (e.g., get_package<T>(): Option<MoveIdentifier> and get_unique_id<T>(): u256) rather than a broader API that lets folks hand-roll these? I'm definitely open to adding others if we are missing other critical use-cases, or to going with the approach in this PR if there are so many use-cases that the general library is the way to go. But reflection is very hard for static reasoning, and amenability to static reasoning is a key part of what makes it easy to build awesome tooling for Move and keep it safe, so I'm hesitant to go too far down that road if we can avoid it.
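For illustration, the two narrower functions could behave roughly like the Rust sketch below. This is a model only, not a proposed implementation. It takes a type-name string instead of a type parameter, and it assumes the `address::module::Struct` string shape. It returns a 64-bit FNV-1a hash as a non-cryptographic stand-in for the u256 identifier; a real version would presumably use a cryptographic hash.

```rust
/// Model of `get_package<T>()`: the address/package part of a struct type,
/// or `None` for primitive types (which have no declaring package).
pub fn get_package(type_name: &str) -> Option<&str> {
    // Only the part before any generic arguments identifies the package.
    let head = type_name.split('<').next()?;
    let (package, _rest) = head.split_once("::")?;
    if package.is_empty() {
        None
    } else {
        Some(package)
    }
}

/// Model of `get_unique_id<T>()`: a fixed-length identifier derived by
/// hashing the full type name. ASSUMPTION: FNV-1a (64-bit) is used here only
/// to keep the example dependency-free; it is not collision-resistant.
pub fn get_unique_id(type_name: &str) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf29ce484222325;
    const PRIME: u64 = 0x100000001b3;
    type_name
        .bytes()
        .fold(OFFSET_BASIS, |hash, byte| (hash ^ u64::from(byte)).wrapping_mul(PRIME))
}
```

The point of the design is that callers only ever see an opaque package identifier or an opaque number. They never see the string itself, which keeps the API easier to reason about statically.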
