Closed joshuahannan closed 2 years ago
I can solve this problem if you have the patience to walk me through some basic knowledge necessary to understand what all the moving parts are. Questions I have: 1) Why has the metadata associated with NFT's always been stored off-chain? 2) What are current blockers that create this issue? 3) In a perfect system where everything was possible, how would the metadata of an NFT be stored? 4) Is it possible for a User to store their own metadata for their owned NFTs? Why or why not? 5) What has worked so far in existing projects that has moved us closer to solving this issue and where did they leave off and why could they not completely accomplish this? If you answer these, my next response might also be a wave of questions since I am pretty new to understanding data flow/storage, but my problem solving skills in general are solid as long as I understand the entire picture/pieces. Cheers, WK
@MiyaSteven Sorry for the late response. I can answer a few of your questions, but I'll need to find some more help answering some other ones, as I am not super familiar with the state of metadata storage in other blockchains
Thanks for reaching out and feel free to ask more questions if you still need help!
I'm of the opinion that metadata should be flexible and scalable. You don't get either of those things with current on-chain Ethereum solutions.
Metadata should be flexible
As the creator of an NFT, I should be able to update the metadata as I see fit to support my project without having to rewrite my smart contract to change the shape of my data, or submit a transaction to update the data.
Metadata should be scalable
This ties into flexibility as well, but metadata needs to be scalable. As NFTs become more mainstream and projects become larger, not only does metadata need to be flexible and upgradeable, but this flexibility needs to scale. If my project has 10 million NFTs, it doesn't make financial sense to have to update 10 million NFTs on chain (my experience is on Ethereum).
My platform's solution for this is to use a traditional storage space and a server that dynamically retrieves metadata based on the contract address and token ID. This allows metadata to be updated at scale, which provides flexibility for enterprise projects.
If this storage space was decentralized and there could be a history of metadata updates I think that would suffice. Updating metadata needs to be cheap, fast and easy.
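A minimal sketch of the lookup described above, assuming an in-memory store keyed by contract address and token ID that also keeps every previous version. The class and method names (MetadataStore, update, get, history) are hypothetical and only stand in for whatever storage layer a platform actually uses.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

Key = tuple[str, int]  # (contract address, token ID)


@dataclass
class MetadataStore:
    """Hypothetical in-memory stand-in for an off-chain metadata store."""

    # Every key maps to all versions ever written, oldest first.
    _versions: dict[Key, list[dict[str, Any]]] = field(default_factory=dict)

    def update(self, contract: str, token_id: int, metadata: dict[str, Any]) -> int:
        """Append a new version and return its 1-based version number."""
        versions = self._versions.setdefault((contract, token_id), [])
        versions.append(dict(metadata))
        return len(versions)

    def get(self, contract: str, token_id: int) -> Optional[dict[str, Any]]:
        """Return the latest metadata, or None if the token is unknown."""
        versions = self._versions.get((contract, token_id))
        return dict(versions[-1]) if versions else None

    def history(self, contract: str, token_id: int) -> list[dict[str, Any]]:
        """Return every version written for the token, oldest first."""
        return [dict(v) for v in self._versions.get((contract, token_id), [])]


store = MetadataStore()
store.update("0x01", 42, {"name": "First draft"})
store.update("0x01", 42, {"name": "Final name"})
latest = store.get("0x01", 42)
```

Updating a token never touches a contract here, which is the cheap and fast part; the kept history is what a decentralized version of the store would need to make auditable.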
Could a start to this be to support labels that are a pure {String: String} dictionary, and expose that in the public interface and collection interface? Just doing that would make it a lot easier to experiment with.
Adding support for migrating the data in the schema to a new format would be a very nice feature here. A binary format like avro supports this. There are lots of examples on how people use it in kafka to allow sending messages that are backwards compatible with old formats.
What is the size of a Flow block gonna be? Will it be feasible to store a pretty large SVG on-chain?
We originally had a {String: String} dictionary in the NFT standard, but we felt that it wasn't necessary, since we wanted the standard to be pretty minimal and didn't want to force a relatively weak standard for metadata on the users of the standard. But you can experiment with that in your own contracts that implement the NFT standard. That is similar to what Top Shot does.
Yes, I totally agree that the schema should be able to be migrated. We are discussing a process for how contracts are stored and upgraded in accounts in https://github.com/onflow/cadence/issues/221#issuecomment-663851208, which includes a part about how data from contracts is migrated to the new version. We'd love some feedback or ideas if you have any.
We aren't totally clear on what the size of a Flow block will be, but I feel fairly confident that you'll be able to store larger files like that as long as you have the money to pay for storage.
Another thing me and @psiemens talked about yesterday was implicit support for something like IPVS.
When sending a transaction with a field of a specific type, that field is not sent on chain but sent into IPVS, and if the field is accessed it is pulled from there. Or something like that.
- Why has the metadata associated with NFT's always been stored off-chain?
Cost per GB of storage on Ethereum is measured in the millions of dollars. For comparison, price per GB of storage on cloud services like AWS is $0.05/month. Storage architecture must be radically different on Flow to make this idea even remotely cost-effective. Currently, it's not so hard to fire up an IPFS pinning service, and store only the hash on-chain (which is a URL in IPFS-land).
- What are current blockers that create this issue?
I don't know the Flow architecture well enough to answer this question.
- In a perfect system where everything was possible, how would the metadata of an NFT be stored?
Different NFTs can represent radically different information, needing radically different data structures. There is no one-size-fits-all solution for NFT data representation.
- Is it possible for a User to store their own metadata for their owned NFTs? Why or why not?
That would effectively shard Flow storage. How would you achieve consensus on data representation?
- What has worked so far in existing projects that has moved us closer to solving this issue and where did they leave off and why could they not completely accomplish this?
Store real data on IPFS. Encode only the hash on the blockchain. The primary blocker for this on Ethereum is the cost of storing data on Ethereum ($millions/GB).
An idea would be to use bson (http://bsonspec.org) and store metadata on chain.
Edit: Instead of imposing a predefined schema, we would let the developer decide the schema, but add an easy way to filter items (NFTs) in a collection based on the fields the metadata contains.
metadata field of type bson. Reference for querying bson data in Postgres: https://www.postgresql.org/docs/9.6/datatype-json.html
Example:
metadata = {
"firstName": "Kevin",
"lastName": "Durant",
"season": 2019,
"team": {
"name": "Nets",
"primary_position": "Small forward"
}
}
Query examples
Filter by presence of a key.
metadata ? "firstName"
Response: all items in the collection that have the key firstName set.
metadata ?| ["firstName", "season"]
Response: all items in the collection that have one of the keys firstName OR season set. Hence the |.
metadata ?& ["firstName", "season"]
Response: all items in the collection that have both keys firstName AND season set. Hence the &.
Filter by checking inclusion of one JSON into another one.
metadata -> team @> {"name": "Nets"}
Response: all items in the collection that have the team name set to Nets.
metadata -> team @> {"name": "Nets", "primary_position": "Small forward"}
Response: all items in the collection that have the team name set to Nets and the team position set to Small forward.
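To make the semantics of these operators concrete, here is a rough Python sketch that applies them to plain dictionaries. The function names are made up for illustration, and the containment check is a simplification of the Postgres @> operator: it handles nested objects and scalar values only, and compares arrays by plain equality.

```python
from typing import Any, Callable, Iterable


def has_key(metadata: dict[str, Any], key: str) -> bool:
    """Rough equivalent of: metadata ? key"""
    return key in metadata


def has_any_key(metadata: dict[str, Any], keys: Iterable[str]) -> bool:
    """Rough equivalent of: metadata ?| keys"""
    return any(key in metadata for key in keys)


def has_all_keys(metadata: dict[str, Any], keys: Iterable[str]) -> bool:
    """Rough equivalent of: metadata ?& keys"""
    return all(key in metadata for key in keys)


def contains(container: Any, subset: Any) -> bool:
    """Simplified equivalent of: container @> subset (objects and scalars only)."""
    if isinstance(subset, dict):
        return isinstance(container, dict) and all(
            key in container and contains(container[key], value)
            for key, value in subset.items()
        )
    return container == subset


def filter_items(
    items: Iterable[dict[str, Any]], predicate: Callable[[dict[str, Any]], bool]
) -> list[dict[str, Any]]:
    """Return the items of a collection whose metadata matches the predicate."""
    return [item for item in items if predicate(item)]


collection = [
    {
        "firstName": "Kevin",
        "lastName": "Durant",
        "season": 2019,
        "team": {"name": "Nets", "primary_position": "Small forward"},
    },
    {"lastName": "Unknown", "team": {"name": "Other"}},
]

# metadata -> team @> {"name": "Nets"}
nets_items = filter_items(
    collection, lambda m: contains(m.get("team"), {"name": "Nets"})
)
```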
Field names
If needed, the on-chain metadata could have short names for keys to save on storage. The developer could make available the desired mapping on his website. The mapping would be used by other dapps and wallets that want to interact with NFTs created by the developer and need to display that info to the end user.
<head>
<meta name="flow-0x01.NBATopShot" content="https://my-dapp.com/0x01.NBATopShot.json">
</head>
0x01.NBATopShot.json could look like this:
{
"en": {
"firstName": "Player First Name",
"lastName": "Player Last Name",
"season": "NBA Season",
"team": {
"name": "Team Name",
"primary_position": "Player primary position"
}
}
}
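As a sketch of how a wallet or dapp might use such a mapping, the function below swaps on-chain keys for display labels. It assumes the mapping for one language has already been downloaded and parsed; label_metadata is a hypothetical helper, not part of any Flow tooling. Keys with no label are kept as they are, and nested objects keep their key while their fields are labelled recursively.

```python
from typing import Any


def label_metadata(metadata: dict[str, Any], labels: dict[str, Any]) -> dict[str, Any]:
    """Replace metadata keys with display labels from a developer-provided mapping."""
    labelled: dict[str, Any] = {}
    for key, value in metadata.items():
        label = labels.get(key)
        if isinstance(value, dict):
            # Nested object: keep its key and label the fields inside it.
            nested_labels = label if isinstance(label, dict) else {}
            labelled[key] = label_metadata(value, nested_labels)
        else:
            # Fall back to the raw key when the mapping has no label for it.
            labelled[label if isinstance(label, str) else key] = value
    return labelled


english_labels = {
    "firstName": "Player First Name",
    "lastName": "Player Last Name",
    "season": "NBA Season",
    "team": {"name": "Team Name", "primary_position": "Player primary position"},
}

on_chain_metadata = {
    "firstName": "Kevin",
    "lastName": "Durant",
    "season": 2019,
    "team": {"name": "Nets", "primary_position": "Small forward"},
}

display = label_metadata(on_chain_metadata, english_labels)
```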
Not sure if there should be a metadata standard. Perhaps just a metadata field that is a custom Metadata resource per NFT. Then any client can traverse this resource to get relevant metadata. The contract can also define a metadata template field for recommendations on displaying the data using handlebars or similar; although, these metadata display templates are probably better suited as scripts rather than contract functions.
Tldr; seems metadata can be stored however the dev wants, but have a standard for the metadata script used to read and format the data.
And building standards for all complex data types will be beneficial. Although, you can jam almost anything into a string ;)
Many schemas that may be suitable for NFTs have already been defined and are available at schema.org including images, video, audio, and so on.
ERC-721 specifies a simple metadata schema that looks like this:
{
"title": "Asset Metadata",
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "Identifies the asset to which this NFT represents"
},
"description": {
"type": "string",
"description": "Describes the asset to which this NFT represents"
},
"image": {
"type": "string",
"description": "A URI pointing to a resource with mime type image/* representing the asset to which this NFT represents. Consider making any images at a width between 320 and 1080 pixels and aspect ratio between 1.91:1 and 4:5 inclusive."
}
}
}
ERC-721 tokens get associated with their metadata on-chain using a tokenURI field in the token contract. Further information such as description and image are parsed from the corresponding JSON record referenced by URI. (See above). That URI generally references a JSON document and assets on IPFS.
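A small sketch of what a client might do with the JSON record once it has been fetched from the tokenURI: check that the three properties named in the schema above are strings when present. The schema as quoted does not mark any property as required, so missing fields are accepted here. The function name and the sample values are illustrative only.

```python
import json

# Property names taken from the ERC-721 metadata schema quoted above.
ERC721_STRING_FIELDS = ("name", "description", "image")


def erc721_metadata_errors(document: object) -> list[str]:
    """Return a list of problems found in a parsed metadata document."""
    if not isinstance(document, dict):
        return ["metadata must be a JSON object"]
    errors = []
    for field_name in ERC721_STRING_FIELDS:
        if field_name in document and not isinstance(document[field_name], str):
            errors.append(f"{field_name!r} must be a string")
    return errors


# Hypothetical record, as it might be returned from a tokenURI lookup.
raw_record = '{"name": "Example asset", "description": "An example", "image": "ipfs://example"}'
problems = erc721_metadata_errors(json.loads(raw_record))
```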
Open questions:
Using standard schemas for different types of assets from schema.org and specifying these from the start is a great idea. Case in point: the ERC-721 spec on Ethereum has a limited metadata schema definition, and now different platforms have different metadata shapes for different media types, which ruins the off-the-shelf interoperability between these platforms.
@pizzarob Yep. I just added the ERC-721 metadata schema to my comment, above. I agree that mappability to ERC-721 is an important feature to strive for.
Just catching up on this issue. RE: Joshua's June 16th's comment
Questions:
I made a little something to see if Mixins could be used to solve this. https://github.com/bjartek/flow-nft-mixin
@bjartek I think that's an interesting idea to add additional features to the NFT, such as artist royalties. I think smart contract composability is a great feature.
Are smart contract mixins idiomatic on Flow? I'd love to read more about them with more descriptions of how they work and more examples of using them.
Mixin might not be the correct term to use here. But interface, contract, and capability are already used, so I just used mixin.
My reference for it is https://docs.scala-lang.org/tour/mixin-class-composition.html
Would Trait be better?
It might be better for Trait just to be an interface that requires some methods. Like
Aso. Then you do not need for it to own a resource since it can be the resource itself.
Here are a series of observations, opinions, and facts that will hopefully help this discussion:
Are smart contract mixins idiomatic on Flow? As I pointed out above, standard interfaces (like the Non-Fungible Token interface) define the minimum functionality for compliant implementations, and don't limit additional functionality. In particular, additional standards or optional extensions to existing standards can be published on chain, and any implementation can choose to conform to as many of them as it wants, provided they don't conflict.
To use your example for royalties, you could define an extension of the NFT interface that includes royalty tracking. So long as it doesn't break the requirements of the base NFT interface, an implementation would be seen as a generic NFT by use cases that don't have to worry about royalties, while adding in the royalty functionality where appropriate.
Are interfaces on Flow first-class and composable? For example, can we say a contract implements the NFT, Royalties, and timelock interface?
You bet! Provided those interfaces don't have any conflicts.
struct and resource interfaces are composable, but contract interfaces aren't, correct? @turbolent ?
If you mean "composable" as in, a concrete struct/resource/contract can implement multiple interfaces, then yes, all of them can do that (if it's not working right now it is a bug), e.g.
contract interface NFT { /* ... */ }
contract interface Royalties { /* ... */ }
contract interface Timelock { /* ... */ }
contract Cool: NFT, Royalties, Timelock { /* ... */ }
But it is not possible to import the code for fulfilling that interface contract from another account, right? You will have to duplicate that yourself?
That's right, @bjartek. We have some thoughts on how to implement code-reuse, but they are still only thoughts.
We definitely didn't want to use standard object inheritance, because it already has lots of weird edge cases in off-chain code, and we were pretty sure that it would be an absolute disaster in the context of smart contracts. Imagine if I could define a sub-class of CryptoKitty that overrode the breeding method! Or a subclass of Vault that "extended" the interface to include a "makeMeRich" method! Doesn't exactly fit the use case... 😀
@dete exactly. That was part of the reasoning behind my mixin experiment. Each trait would store all its methods and state in a separate «namespace» inside the NFT.
Will this feature be done before main net launches? Or if not, how should we structure our NFT to be compatible?
In my case I want to store Art that could be unique or editioned. Either as a single «trait» or as two.
Are there any existing examples of royalty tracking we can look at?
I'm part of a team working on an open marketplace on Flow and we would need to deal with metadata rather soon 😄 .
After reading your comments, and in order to keep it simple and flexible for the first NFTs created and aggregated, my proposition that is in fact a sum of your propositions is:
Use schema.org and JSON-LD to define the metadata structure and contents. Examples on how to format data using schema.org and JSON-LD are available here: https://developers.google.com/search/docs/data-types/book#structured-data-type-definitions List of all the available schemas: https://schema.org/docs/schemas.html
Both on-chain and off-chain storage methods are accepted.
For metadata that is stored off-chain:
The file name can be the SHA2_256 hash of the file contents.
🔢 {metadata: "https:\/\/s3.amazonaws.com\/your-bucket\/your-folder\/{file-hash}.json"}
💡 Metadata translations can be managed using the "workTranslation": attribute inside the metadata file.
For metadata that is stored on-chain
Deciding between on-chain and off-chain storage
A rough estimation of storage pricing is that 1KB of data would cost at least $0.01 to store forever.
Calculation method:
Flow account creation costs 0.1 FLOW and comes with 1KB of storage.
FLOW token price during the community sale (this is the current minimum): $0.1
This is what 460B of metadata would look like:
{"@context":"https://schema.org","@type":"SportsTeam","name":"Seattle Seahawks","sport":"American Football","memberOf":[{"@type":"SportsOrganization","name":"National Football League"},{"@type":"SportsOrganization","name":"National Football Conference"},{"@type":"SportsOrganization","name":"NFC West Division"}],"coach":{"@type":"Person","name":"Pete Carroll"},"athlete":[{"@type":"Person","name":"Russell Wilson"},{"@type":"Person","name":"Marshawn Lynch"}]}
On-chain
Storing this metadata on-chain would cost $0.0046.
metadata: {"@context":"https://....."}
Off-chain
Storing the same metadata off-chain, and only the reference on chain, would cost $0.00125, as the reference weighs 125B.
In this case, you need to keep in mind that you will have additional charges from your cloud storage provider.
metadata: "https:\/\/s3.amazonaws.com\/your-bucket\/your-folder\/94fb72c123f82db7b19a23dceb0bf60f7c1fdfa8726b53379c0dcfa63e3b8b3c.json"
Using a file hash is optional but encouraged.
94fb72c123f82db7b19a23dceb0bf60f7c1fdfa8726b53379c0dcfa63e3b8b3c is SHA2_256({"@context":"https://..."})
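The arithmetic and the hash-based file name above can be sketched in a few lines of Python. The constants come straight from the estimate in this comment (0.1 FLOW per account, 1KB of storage treated as 1000 bytes, $0.1 per FLOW), so the result is only as good as those assumptions. The function names and the base URL are hypothetical.

```python
import hashlib

ACCOUNT_COST_FLOW = 0.1       # cost of creating a Flow account, per the estimate above
ACCOUNT_STORAGE_BYTES = 1000  # storage that comes with it (1KB, taken as 1000 bytes)
FLOW_PRICE_USD = 0.1          # community sale price used as the minimum


def storage_cost_usd(size_bytes: int) -> float:
    """Estimate the cost in USD of storing size_bytes on-chain."""
    return size_bytes / ACCOUNT_STORAGE_BYTES * ACCOUNT_COST_FLOW * FLOW_PRICE_USD


def content_hash(content: bytes) -> str:
    """Return the SHA2_256 (SHA-256) hash of the file contents as hex."""
    return hashlib.sha256(content).hexdigest()


def off_chain_reference(base_url: str, content: bytes) -> str:
    """Build the URL stored on-chain, with the content hash as the file name."""
    return f"{base_url}/{content_hash(content)}.json"


on_chain_cost = storage_cost_usd(460)   # the full 460B metadata document
off_chain_cost = storage_cost_usd(125)  # only the 125B reference

metadata_file = b'{"@context":"https://schema.org","@type":"SportsTeam"}'
reference = off_chain_reference("https://example.com/your-folder", metadata_file)
```

Because the file name is derived from the contents, anyone who downloads the file can recompute the hash and compare it with the reference stored on-chain.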
I modified my mixin/trait example from above to model a Trait as an Interface. https://github.com/bjartek/flow-nft-mixin/tree/trait
I think using some established standard for structured data such as schema.org is a great idea!
It might be a good idea to implement the schemas as types to increase type-safety, instead of encoding the data in the JSON data model, i.e. as dictionaries, arrays, strings, etc.
I think that sounds like a really sound idea @turbolent, to use the type system for what it is worth.
@turbolent @bjartek What's the difference between using a schema like schema.org vs using types? Is a schema not essentially a description of a complex type?
@ericelliott the difference is where they are enforced. A json-ld string representing schema.org data cannot be validated easily onchain. A Cadence struct representing the same data can.
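As a loose analogy in Python rather than Cadence, the sketch below contrasts an untyped JSON-LD style dictionary with a typed representation of the same schema.org data. The class names mirror schema.org types, but the classes themselves and team_from_json_ld are hypothetical. Python only checks these types at runtime; the assumption is that a Cadence struct would get the equivalent guarantee from the language's type system, which is what makes on-chain validation practical.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Person:
    name: str

    def __post_init__(self) -> None:
        if not isinstance(self.name, str) or not self.name:
            raise TypeError("Person.name must be a non-empty string")


@dataclass(frozen=True)
class SportsTeam:
    name: str
    sport: str
    coach: Person

    def __post_init__(self) -> None:
        for field_name in ("name", "sport"):
            if not isinstance(getattr(self, field_name), str):
                raise TypeError(f"SportsTeam.{field_name} must be a string")
        if not isinstance(self.coach, Person):
            raise TypeError("SportsTeam.coach must be a Person")


def team_from_json_ld(document: dict[str, Any]) -> SportsTeam:
    """Convert an untyped JSON-LD style dictionary into the typed form."""
    if document.get("@type") != "SportsTeam":
        raise TypeError("expected a schema.org SportsTeam document")
    coach = document.get("coach")
    if not isinstance(coach, dict):
        raise TypeError("coach must be an object")
    return SportsTeam(
        name=document.get("name"),
        sport=document.get("sport"),
        coach=Person(name=coach.get("name")),
    )


untyped = {
    "@context": "https://schema.org",
    "@type": "SportsTeam",
    "name": "Seattle Seahawks",
    "sport": "American Football",
    "coach": {"@type": "Person", "name": "Pete Carroll"},
}
team = team_from_json_ld(untyped)
```

With the dictionary form, a malformed coach entry is only discovered when some consumer trips over it; with the typed form, it is rejected at the point where the value is created.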
I'm a fan of Dublin Core and its descendants for metadata -
https://en.wikipedia.org/wiki/Dublin_Core
For royalty tracking there are the various rights expression languages -
https://en.wikipedia.org/wiki/Rights_Expression_Language
I'm a fan of ccREL (having worked for CC... ;-) ) -
https://en.wikipedia.org/wiki/Creative_Commons_Rights_Expression_Language
These are all much older than schema.org, and schema.org is seeing usage outside of web sites now which was the main differentiator for these standards (they are used in asset management workflows, etc.).
But this is all much grander than a simple core of metadata attributes.
@robmyers I'm interested in developing a cross-chain standard for representing NFT metadata so we can easily port our NFTs between chains. IMO, anything that relies specifically on the Cadence language should be considered a compile/mapping target, not a candidate for an NFT metadata specification. I recognize that this discussion is about how we should represent NFT metadata specifically on Flow, but IMO, we should start from the point-of-view of cross-chain interoperability and composability and take advantage of Flow and Cadence-specific features only after we've decided on a cross-chain representation for metadata. E.g., how do we express these ideas in a document that can be read by many languages?
RE: RDF & ccREL - most of the RDF and Dublin Core documentation is focused on expressing metadata in HTML or XML. We'd need to translate vocabulary and examples to use a more modern and accessible format, such as JSON - but before we dive in deep and start on that work, there is a lot of overlap between ccREL and Schema.org. Are there specific differences you want to point out? AFAIK, ccREL is very simple, exposing just a handful of vocabulary words related to expressing rights, which could pretty easily be adopted into a new metadata standard for NFTs.
AFAIK, the vocabulary of ccREL is actually a bit too limited, as there's no easy way to express multiple collaborators or royalty splits with ccREL - you get one attribution slot and one attribution URL. I suppose you could put a collection of collaborators at the dereferenced URL, but how to do that or interpret that is not included in ccREL. Please let me know if I got that wrong. I have been using CC since its inception, but I'm not an expert on the metadata specs.
http://internft.org/ is working on something similar I believe but I haven't had a chance to look yet.
For on-chain data, Cadence data structures are the neutral format. RDF (and schema.org schema data can be encoded as RDF) has multiple representations. These older standards are more rigorous and them being more minimal may be better for considering the requirements of an initial metadata standard on-chain that nonetheless goes further than ERC721 and clearly indicates how it might be extended.
I'd also look at PROV and MPEG 21 REL.
To be clear, I'm not suggesting that any of these be adopted as drop-in standards. :-)
Having a Schema.org with built-in types would be great.
As @robmyers mentioned, avoiding human errors and doing easy validation is a nice advantage. A second advantage could be that as on Flow a resource can own other resources, having the metadata typed could make it easier for multiple Dapps to interoperate as relationships could be built between resources based on the metadata.
I took a quick look at http://internft.org , apart from the first pages there is not much to see, the project looks abandoned.
Dublin Core is nice and succinct, but that's the downside too. I don't think it has enough vocabulary to power the metadata for a virtual world like Decentraland, so it might not be adopted by serious Dapps, which would be a pity.
Other blockchains could adopt Schema.org for metadata interop with Flow. It's neutral ground and not specific to Flow or Cadence itself. They could build their own internal types to match the schema specs if needed.
More information on schema evolution https://www.inkandswitch.com/cambria.html
I'm learning about Salesforce Multi Tenant Architecture and it made me think of this. I'm assuming this is what has been mentioned before by building relationships to access metadata. The only difference is the fact that these are NFTs, but it's similar to this, I think: https://www.youtube.com/watch?v=Tuy_O37H3O8
Have you considered using DIDs? Part of the standard is they must point to a DID Document, which allows for all kinds of structured and auto-discoverable stuff, including ownership private keys, schema.org json schemas, content source links etc.
Flow has a Collector Role, Consensus Role, Execution Role, and Verification Role, and much of Flow's strength is derived from separating concerns among those roles. What if there was a metadata storage role that handled the concern of file storage? The concern of this type of node would be provisioning cheap disk space (and appropriate bandwidth).
The analogous solution I'm thinking of is CDNs. A file is uploaded and distributed across the CDN's network of data centers in different geographical regions. A single URL is assigned to the file that can be used to retrieve it. When file requests are made, the CDN routes requests to the "nearest" datacenter to the requestor and provisions the response from that location.
What if the problem isn't to find a way to embed metadata into Flow accounts directly, but rather to persist it to a CDN-like role node and have the accounts store references to metadata on those CDN nodes, not unlike how git LFS works?
The benefits of this approach would be the ability to incentivize metadata/storage role nodes economically (like the other roles) and make metadata storage a native element of the platform that could be exposed through a consistent interface native to Cadence.
IPVS
You mean IPFS?
A schema will just add bloat like SOAP. Let the developers of each project decide how to interpret their metadata. If I issue a contract then I know how to interpret the contract. A version number could be handy but that can be incorporated into the contract itself. JSON is perfect for metadata.
@victor-geere the whole point of this is to allow others to use the metadata you set: wallets, blockchain explorers, and other Dapps.
Have you had a look at the NFT schema editor on https://wax.atomichub.io/creator?
This is blocked content on my end. Can you paste a screenshot?
@DanMacDonald makes a great suggestion. If we could give smart contracts on Flow first-class access to that data without oracles, that would be a significant advantage over "just use IPFS", and no, IPFS is not an adequate solution to use in place of a CDN. IPFS needs CDNs on top to make it adequately performant.
See also: Filecoin, Storj/Tardigrade, Theta Network, SIA. If a blockchain node could read data from a storage layer and use it in smart contracts, that would be amazing. Hashes can be verified for security, but we'd need to be sure the hash functions are resistant to preimage attacks.
1. We must standardize the most basic metadata fields so that cross-chain support with other chains can be implemented in the future.
2. Now that there are storage chains, we should store data such as pictures on-chain, and we can decide which storage chains will be recognized, such as IPFS and AR (we can talk about that).
3. If the second recommendation is rejected, the metadata will be off-chain, so we should index the data and store the index on the Flow chain.
4. We should also have a specification for the name of the function that reads metadata.
https://ceramic.network/ They have a good practice about DIDs..
Issue To Be Solved
NFTs always have some sort of metadata associated with them. Historically, most of that metadata has been stored off-chain, but we would like to create a standard for metadata that allows all metadata to be stored on-chain so everything about the NFTs is truly decentralized.
This issue is meant for discussion about the possibilities of the solution. More documentation and examples will be added as we research and discuss more.
I am currently leading the charge on this, but I have a lot on my plate and don't know if I can give this issue the love it deserves, so if someone from the community wants to lead, I would love to speak with you!
Suggest A Solution
Context
The Avastars project is an interesting project on Ethereum that we could potentially take inspiration from for our metadata.