Closed davidiw closed 1 year ago
The initial implementation is complete pending code review. The AIP has a new update: https://github.com/aptos-foundation/AIPs/pull/35
@davidiw since table entries entail similar global storage mechanics, is there a way to compress those into resource groups as well?
@alnoki you can already achieve somewhat similar results by wrapping table entries into a "Node" object, i.e.:
    // `store` is needed so a Node can be used as a table value
    struct Node<T: store> has store {
        data: vector<T>
    }
@alnoki Could you elaborate with examples?
@lightmark I think simplest would be for table entries that have a fixed size. For example, if all entries in a table are MyStructType, where none of the fields are vectors, then each entry has a fixed size in bytes. Hence there could be some kind of flag on the table where the optimal number of entries is collated into a single resource group. This would be most useful for an iterable table (essentially a doubly linked list of table entries), where a linear scan over the elements would only result in global storage lookups against a resource group rather than against each entry. If each entry is 100 bytes, for example, and a resource group holds 1k bytes, then you could fit ten table entries in one resource group and only pay one per-item storage cost for a linear scan over ten elements.
@alnoki tables and resource groups are quite different beasts. The latter are heterogeneous, so you can store values of arbitrary types in the same group (enabled in a type-safe way via Move's type-indexed memory). Tables are homogeneous collections. If you want to bucket multiple entries of the same T into one table slot, there are different solutions which do not require the power of resource groups.
@wrwg got it, thanks for the explanation. It looks like the bucket table and ongoing B tree will be best for this approach.
@alnoki, I think there's also an opportunity to explore collections of resource groups, each of which may be the same or different. But that's a topic for ... AIP-10 and beyond :)
AIP-9 - Resource Groups (Discussion)
aip: 9
title: Resource Groups
author: davidiw, wrwg, msmouse
discussions-to: https://github.com/aptos-foundation/AIPs/issues/26
Status: Draft
last-call-end-date:
type: Standard (Interface, Framework)
created: 2023/1/5
updated: N/A
requires: N/A
Summary
This AIP proposes resource groups to support storing multiple distinct Move resources together into a single storage slot.
Motivation
Over the course of development, it often becomes convenient to add new fields to a resource or support an optional, heterogeneous set of resources. However, resources and structs are immutable after being published to the blockchain; hence, the only pathway to add a new field is via a new resource.
Each distinct resource within Aptos requires a storage slot. Each storage slot is a unique entry within a Merkle tree or authenticated data structure. Each proof within the authenticated data structure occupies 32 * LogN bytes, where N is the total number of storage slots. At N = 1,000,000, this results in a 640 byte proof.
With 1,000,000 storage slots in use, adding even a new resource that contains only an event handle uses approximately 680 bytes, where the event handle requires only 40. The remaining 640 bytes come from the new authenticated data proofs, which can be orders of magnitude larger than the data being authenticated. Beyond the capacity demands, reads and writes incur additional costs associated with proof verification and generation, respectively.
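The arithmetic above can be reproduced with a small sketch. It assumes that LogN means the base-2 logarithm rounded up to a whole tree depth, which is consistent with the 640 byte figure quoted for 1,000,000 slots. The function names are hypothetical and are not part of aptos-core.

```rust
/// Size in bytes of one hash in the authenticated data structure.
const HASH_BYTES: u64 = 32;

/// Smallest depth d with 2^d >= slots, i.e. ceil(log2(slots)).
/// Assumption: the proof holds one hash per tree level.
fn tree_depth(slots: u64) -> u64 {
    if slots <= 1 {
        return 0;
    }
    // For slots > 1, the bit length of (slots - 1) equals ceil(log2(slots)).
    u64::from(64 - (slots - 1).leading_zeros())
}

/// Estimated proof size: 32 * ceil(log2(N)) bytes.
fn proof_size_bytes(slots: u64) -> u64 {
    HASH_BYTES * tree_depth(slots)
}

/// Estimated footprint of a payload that occupies its own storage slot.
fn own_slot_cost_bytes(payload_bytes: u64, slots: u64) -> u64 {
    payload_bytes + proof_size_bytes(slots)
}
```

Under these assumptions, proof_size_bytes(1_000_000) evaluates to 640 and own_slot_cost_bytes(40, 1_000_000) to 680, matching the event handle example.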
Resource groups allow for dynamic co-location of data such that adding a new event can be done even after creation of the resource group and with fixed storage and execution costs independent of the number of slots in storage. In turn, this provides a convenient way to evolve data types and co-locate data from different resources.
Rationale
A resource group co-locates data into a single storage slot by encoding within the Move source files attributes that specify which resources should be combined into a single storage slot. Resource groups have no semantic effect on Move, only on the organization of storage.
At the storage layer, the resource groups are stored as a BCS-encoded BTreeMap where the key is a BCS-encoded fully qualified struct name (address::module_name::struct_name, e.g., 0x1::account::Account) and the value is the BCS-encoded data associated with the resource.
The above diagram illustrates data stored at address 0xcafef00d. 0x1::account::Account is a resource stored at address 0xcafef00d. 0xaa::resource::Group contains a set of resources or a resource group stored at the same address. The resource group packs multiple resources into the group. Resources within a resource group require nested reading, wherein first the resource group must be read from storage followed by reading the specific resource from the resource group.
Alternative 1: Any within a SimpleMap
One alternative that was considered is storing data in a SimpleMap using the any module. While this is a model that could be shipped without any change to Aptos-core, it incurs some drawbacks around developer and application complexity both on and off-chain. There’s no implicit caching, and therefore any read or write would require a deserialization of the object and any write would require a serialization. This means a transaction with 3 writes would result in 3 deserializations and 3 serializations. In order to get around this, the framework would need substantial, non-negligible changes, though with the emergence of SmartMap there may be more viability here. Finally, due to the lack of a common pattern, indexers and APIs would not be able to easily access this data.
Alternative 2: Generics
Another alternative was using templates. The challenge with using templates is that data cannot be partially read without knowing what the template type is. For example, consider an object that might be a token. In resource groups, one could easily read the Object or the Token resource. In templates, one would need to read the Object<Token>. This could be worked around, but only with complex framework changes and with the risks of partially reading BCS-encoded data, an application that has yet to be considered. The same issues in Move would impact those using the REST API.
Generalizations of Issues
There are myriad combinations between the above two approaches. In general, the drawbacks are
struct with store. A struct with the key ability has stricter and more understandable properties than one with store. For example, the latter can lead to data being placed in arbitrary places, complicating global addressing and discoverability, which may be desirable for certain applications.
Specification
Within the Framework
A resource group consists of several distinct resources, that is, Move structs that have the key ability.
Each resource group is identified by a common Move struct. This struct has no fields and carries the attribute resource_group. The attribute resource_group has the parameter scope that limits the location of other entries within the resource group:
module: only resources defined within the same module may be stored within the same resource group.
address: only resources defined within the same address may be stored within the same resource group.
global: there are no limitations on where the resource is defined; any resource can be stored within the same resource group.
The motivation of using a struct is that
StructTags. Thus it limits the implementation impact to the VM and readers of storage; storage can remain agnostic to this change.
struct and fun can have attributes, which in turn lets us define additional parameters like scope.
Each entry in a resource group is identified by the resource_group_member attribute.
During compilation and publishing, these attributes are checked to ensure that:
A resource_group has no abilities and no fields.
The scope within the resource_group can only become more permissive, that is, it can either remain at the same level of accessibility or increase to the next.
resource_group_member attribute.
The group parameter is set to a struct that is labeled as a resource_group.
A struct cannot either add or remove a resource_group_member.
The motivation for each of these requirements is:
resource_group struct won't be used for other storage purposes. While there is no strict requirement that this be true, it is intended to mitigate confusion for developers.
resource_group_members.
resource_group_member, there is no way for Move to know that it is within a resource_group.
Within Storage
From a storage perspective, a resource group is stored as a BCS-encoded BTreeMap<StructTag, BCS encoded MoveValue>, where a StructTag is a known structure in Move of the form: { account: Address, module_name: String, struct_name: String }. By contrast, a typical resource is stored as a BCS encoded MoveValue.
Resource groups introduce a new storage access path, ResourceGroup, to distinguish them from existing access paths. This provides a cleaner interface and segregation of different types of storage. This becomes advantageous to indexers and other direct readers of storage that can now parse storage without inspecting module metadata. Using the example above, 0x1::account::Account is stored at AccessPath::Resource(0xcafef00d, 0x1::account::Account), whereas the resource group and its contents are stored at AccessPath::ResourceGroup(0xcafef00d, 0xaa::resource::Group).
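The layout described above can be modeled in a few lines. This is a minimal in-memory sketch, not the aptos-core implementation: raw byte vectors stand in for BCS-encoded Move values, the group is kept as a map rather than serialized, and the member names 0xaa::resource::A and 0xaa::resource::B are hypothetical.

```rust
use std::collections::BTreeMap;

/// Fully qualified struct name, mirroring { account, module_name, struct_name }.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct StructTag {
    account: String,
    module_name: String,
    struct_name: String,
}

impl StructTag {
    fn new(account: &str, module_name: &str, struct_name: &str) -> Self {
        StructTag {
            account: account.to_string(),
            module_name: module_name.to_string(),
            struct_name: struct_name.to_string(),
        }
    }
}

/// The two access paths from the text, each as (address, tag).
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum AccessPath {
    Resource(String, StructTag),
    ResourceGroup(String, StructTag),
}

/// Contents of one storage slot. Bytes stand in for a BCS-encoded MoveValue.
#[derive(Clone, Debug, PartialEq)]
enum Slot {
    Resource(Vec<u8>),
    ResourceGroup(BTreeMap<StructTag, Vec<u8>>),
}

/// Builds the example at 0xcafef00d: one plain resource plus one group.
fn example_storage() -> BTreeMap<AccessPath, Slot> {
    let addr = "0xcafef00d".to_string();
    let mut storage = BTreeMap::new();
    storage.insert(
        AccessPath::Resource(addr.clone(), StructTag::new("0x1", "account", "Account")),
        Slot::Resource(vec![0u8; 8]),
    );
    // Hypothetical members A and B share the group's single slot.
    let mut group = BTreeMap::new();
    group.insert(StructTag::new("0xaa", "resource", "A"), vec![1u8; 4]);
    group.insert(StructTag::new("0xaa", "resource", "B"), vec![2u8; 4]);
    storage.insert(
        AccessPath::ResourceGroup(addr, StructTag::new("0xaa", "resource", "Group")),
        Slot::ResourceGroup(group),
    );
    storage
}

/// Nested read: fetch the group slot first, then pick the member out of it.
fn read_group_member<'a>(
    storage: &'a BTreeMap<AccessPath, Slot>,
    address: &str,
    group: &StructTag,
    member: &StructTag,
) -> Option<&'a [u8]> {
    let path = AccessPath::ResourceGroup(address.to_string(), group.clone());
    match storage.get(&path)? {
        Slot::ResourceGroup(members) => members.get(member).map(|bytes| bytes.as_slice()),
        Slot::Resource(_) => None,
    }
}
```

In this model three resources occupy only two slots, which is the saving the proposal is after: the group's members share a single entry in the authenticated data structure.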
The only way to tell that a resource is within a resource group is by reading the module metadata associated with the resource. After reading module metadata, the storage client should either read directly from the AccessPath::Resource or first read the AccessPath::ResourceGroup, then deserialize the BTreeMap, and then extract the appropriate resource.
At write time, an element of a resource group must be appropriately updated into a resource group by determining the delta of the resource group as a result of the write operation. This results in the handful of possibilities:
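One way to picture the write-time delta is to classify the effect of a single member write on the whole group. The breakdown into creation, modification, and deletion below is an assumption made for illustration, and the type and function names are hypothetical. Strings stand in for StructTag keys and byte vectors for BCS-encoded values.

```rust
use std::collections::BTreeMap;

/// Member name -> bytes; stand-ins for StructTag keys and BCS-encoded values.
type Group = BTreeMap<String, Vec<u8>>;

/// Effect of a member write on the storage slot that holds the group.
#[derive(Debug, PartialEq)]
enum GroupWriteOp {
    /// No group existed at the access path; a new slot is written.
    Creation(Group),
    /// The group existed and still has members after the write.
    Modification(Group),
    /// The last member was removed, so the slot itself goes away.
    Deletion,
}

/// Applies one member write (Some = set, None = remove) and reports the
/// resulting group-level operation. Returns None when nothing needs writing.
fn group_write_op(
    existing: Option<Group>,
    member: &str,
    new_value: Option<Vec<u8>>,
) -> Option<GroupWriteOp> {
    let existed = existing.is_some();
    let mut group = existing.unwrap_or_default();
    match new_value {
        Some(bytes) => {
            group.insert(member.to_string(), bytes);
        }
        None => {
            group.remove(member);
        }
    }
    match (existed, group.is_empty()) {
        (false, true) => None,
        (false, false) => Some(GroupWriteOp::Creation(group)),
        (true, false) => Some(GroupWriteOp::Modification(group)),
        (true, true) => Some(GroupWriteOp::Deletion),
    }
}
```

In this sketch, adding a member to an existing group and removing one of several members both count as a modification of the same slot, so the number of storage slots changes only when a group is first created or fully emptied.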
Within the Gas Schedule and the VM
The implications for the gas schedule are:
Within the Interface
The above text in storage discusses the layout for resources and resource groups. User facing interfaces, such as a REST API, should not be exposed to resource groups. It is entirely a Move concept. A direct read on a resource group should be avoided. A resource group should be flattened and included within a set of resources when reading bulk resources at an address.
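The flattening step can be sketched as follows. This is an illustration under simplifying assumptions rather than the REST API's actual code: StoredItem and flatten_resources are hypothetical names, strings stand in for StructTag, and byte vectors stand in for BCS-encoded values.

```rust
use std::collections::BTreeMap;

/// One item read from storage at an address.
enum StoredItem {
    Resource { tag: String, bytes: Vec<u8> },
    ResourceGroup { tag: String, members: BTreeMap<String, Vec<u8>> },
}

/// Returns the resources a user-facing interface would list for an address.
/// Group members appear as ordinary resources and the group itself is hidden.
fn flatten_resources(items: Vec<StoredItem>) -> Vec<(String, Vec<u8>)> {
    let mut flat = Vec::new();
    for item in items {
        match item {
            StoredItem::Resource { tag, bytes } => flat.push((tag, bytes)),
            // The group's own tag is dropped; only its members are surfaced.
            StoredItem::ResourceGroup { members, .. } => flat.extend(members),
        }
    }
    flat
}
```

A caller listing resources at an address then sees the group members alongside standalone resources, with no indication of which slot each one came from.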
Reference Implementation
https://github.com/aptos-labs/aptos-core/pull/6040
Risks and Drawbacks
StructTag (likely much less than 100 bytes). Accesses to a resource group will incur an extra deserialization for reads and an extra deserialization and serialization for writes. This is cheaper than alternatives and still substantially cheaper than storage costs. Of course, developers are free to explore the delta in their own implementations as resource groups do not eliminate individual resources.
None of these are major roadblocks and will be addressed as part of the implementation of Resource Groups.
Future Potential
While resources cannot be seamlessly adopted into resource groups, it is likely that many of the commonly used resources will be migrated into new resources within resource groups to give more flexibility to upgradeability, because a resource group does not lock developers into a fixed resource layout. In fact, this returns Aptos back to supporting a more idiomatic Move, which co-locates resources stored at an address, free of the performance considerations that hindered developers before.
Another area worth investigating is whether or not a templated struct can be within a resource group depending on what the generic type is. Consider the current Aptos Account and the CoinStore<AptosCoin>. Storing them separately has a negative impact on performance and storage costs.
In the current VM implementation, resources are cached upon read. This can be improved with caching of the entire resource group at read time.
Suggested implementation timeline
References