onflow / flow-nft

The non-fungible token standard on the Flow blockchain
https://onflow.org
The Unlicense

IMPROVEMENT: NFT Metadata #9

Closed joshuahannan closed 2 years ago

joshuahannan commented 4 years ago

Issue To Be Solved

NFTs always have some sort of metadata associated with them. Historically, most of that metadata has been stored off-chain, but we would like to create a standard for metadata that allows all metadata to be stored on-chain so everything about the NFTs is truly decentralized.

This issue is meant for discussion about the possibilities of the solution. More documentation and examples will be added as we research and discuss more.

I am currently leading the charge on this, but I have a lot on my plate and don't know if I can give this issue the love it deserves, so if someone from the community wants to lead, I would love to speak with you!

Suggest A Solution

Context

The Avastars project is an interesting project on Ethereum that we could potentially take inspiration from for our metadata.

MiyaSteven commented 4 years ago

I can solve this problem if you have the patience to walk me through some basic knowledge necessary to understand what all the moving parts are. Questions I have:

  1. Why has the metadata associated with NFTs always been stored off-chain?
  2. What are the current blockers that create this issue?
  3. In a perfect system where everything was possible, how would the metadata of an NFT be stored?
  4. Is it possible for a User to store their own metadata for their owned NFTs? Why or why not?
  5. What has worked so far in existing projects that has moved us closer to solving this issue and where did they leave off and why could they not completely accomplish this?

If you answer these, my next response might also be a wave of questions since I am pretty new to understanding data flow/storage, but my problem solving skills in general are solid as long as I understand the entire picture/pieces. Cheers, WK

joshuahannan commented 4 years ago

@MiyaSteven Sorry for the late response. I can answer a few of your questions, but I'll need to find some more help answering some other ones, as I am not super familiar with the state of metadata storage in other blockchains

  1. Metadata associated with NFTs has been stored off-chain because on-chain storage is usually expensive and hard to manage. Gas fees for storing on Ethereum were high, so the common recommendation was to only store the things that absolutely needed to be stored. Also, Solidity didn't provide built-in libraries to manage images, videos, JSON, and other complex file types, so apps decided to just keep most of this in their own off-chain database.
  2. Similar to my last answer, the main blockers are that Cadence does not provide an easy way to manage JSON and other complex files, so we need to either build it into the core of the language, or write functions in our smart contracts to manage these.
  3. This is a hard question to answer, but I think the consensus would be to have a customizable standard for storing self-describing JSON files in resources that can hold almost any type of metadata, such as numbers, text, images, videos. The contract would be able to parse some of these and return their values to the caller.
  4. It is definitely possible to store the metadata for an NFT in the NFT itself, because you could just add a field or fields to the NFT that holds the metadata in an encoded format.
  5. Other projects on Ethereum have been trying to solve this issue and are seeing decent success. Some examples I know of are Avastars and Chainfaces so I would recommend checking those out.

Thanks for reaching out and feel free to ask more questions if you still need help!

pizzarob commented 4 years ago

I'm of the opinion that metadata should be flexible and scalable. You don't get either of those things with current on-chain Ethereum solutions.

Metadata should be flexible

As a creator of an NFT I should be able to update the metadata as I see fit to support my project without having to rewrite my smart contract to change the shape of my data, or submit a transaction to update the data.

Metadata should be scalable

This ties into flexibility as well, but metadata needs to be scalable. As NFTs become more mainstream and projects become larger, not only does metadata need to be flexible and upgradeable, but this flexibility needs to scale. If my project has 10 million NFTs it doesn't make financial sense to have to update 10 million NFTs on-chain (my experience is on Ethereum).

My platform's solution for this is to use a traditional storage space and have a server that dynamically retrieves metadata based on the contract address and token ID. This allows for metadata to be updated at scale to provide flexibility for enterprise projects.

If this storage space was decentralized and there could be a history of metadata updates I think that would suffice. Updating metadata needs to be cheap, fast and easy.
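As a rough illustration of the lookup described above, here is a minimal in-memory sketch of a store keyed by contract address and token ID that also keeps a history of updates. The class and method names are hypothetical, and nothing here is decentralized; it only shows the shape of the data.

```python
class MetadataStore:
    """Hypothetical off-chain store keyed by (contract address, token ID)."""

    def __init__(self):
        # Maps (contract, token_id) to a list of revisions, oldest first.
        self._history = {}

    def update(self, contract, token_id, metadata):
        """Append a new revision and return its 0-based revision number."""
        revisions = self._history.setdefault((contract, token_id), [])
        # Copy so that later edits by the caller cannot rewrite history.
        revisions.append(dict(metadata))
        return len(revisions) - 1

    def latest(self, contract, token_id):
        """Return the most recent revision for the token."""
        return self._history[(contract, token_id)][-1]

    def history(self, contract, token_id):
        """Return every revision for the token, oldest first."""
        return list(self._history[(contract, token_id)])


store = MetadataStore()
store.update("0x01", 42, {"name": "Example"})
store.update("0x01", 42, {"name": "Example", "level": 2})
```

Updating a token never touches the chain in this model, which is the cheap-and-fast property asked for above; the trade-off is that readers have to trust whoever runs the store.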

bjartek commented 3 years ago

Could a start to this be to support labels that are a pure {String: String} dictionary, and expose that in the public interface and collection interface? Just doing that would make it a lot easier to experiment with.

Metadata should have a schema that could be migrated

Adding support for migrating the data in the schema to a new format would be a very nice feature here. A binary format like Avro supports this. There are lots of examples of how people use it in Kafka to allow sending messages that are backwards compatible with old formats.
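To make the idea of a migratable schema concrete, here is a minimal sketch that upgrades a record one version at a time. It is a hand-rolled chain of migration functions, not Avro's reader/writer schema resolution, and the `schemaVersion` field and the example fields are assumptions made for illustration.

```python
def v1_to_v2(record):
    """Hypothetical migration: v2 splits "name" into first and last name."""
    first, _, last = record["name"].partition(" ")
    return {"schemaVersion": 2, "firstName": first, "lastName": last}


# Maps a schema version to the function that upgrades it by one version.
MIGRATIONS = {1: v1_to_v2}


def migrate(record, target_version):
    """Apply migrations in order until the record reaches target_version."""
    record = dict(record)
    while record["schemaVersion"] < target_version:
        record = MIGRATIONS[record["schemaVersion"]](record)
    return record


old = {"schemaVersion": 1, "name": "Kevin Durant"}
new = migrate(old, 2)
```

Readers that only understand version 2 can call `migrate` on whatever they find, so old records stay usable without being rewritten in place.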

What is the size of a Flow block gonna be? Will it be feasible to store a pretty large SVG on-chain?

joshuahannan commented 3 years ago

We originally had a {String: String} dictionary in the NFT standard, but we felt that it wasn't necessary since we wanted the standard to be pretty minimal and didn't want to force a relatively weak standard for metadata on the users of the standard. But you can experiment with that in your own contracts that implement the NFT standard. That is similar to what Top Shot does.

Yes, I totally agree that the schema should be able to be migrated. We are discussing a process for how contracts are stored and upgraded in accounts in https://github.com/onflow/cadence/issues/221#issuecomment-663851208, which includes a part about how data from contracts is migrated to the new version. We'd love some feedback or ideas if you have any.

We aren't totally clear on what the size of a Flow block will be, but I feel fairly confident that you'll be able to store larger files like that as long as you have the money to pay for storage.

bjartek commented 3 years ago

Another thing me and @psiemens talked about yesterday was implicit support for something like IPVS.

When sending a transaction with a specific type, that field is not sent on-chain but sent into IPVS, and if the field is accessed it is pulled from there. Or something like that.

ericelliott commented 3 years ago

  1. Why has the metadata associated with NFTs always been stored off-chain?

Cost per GB of storage on Ethereum is measured in the millions of dollars. For comparison, price per GB of storage on cloud services like AWS is $0.05/month. Storage architecture must be radically different on Flow to make this idea even remotely cost-effective. Currently, it's not so hard to fire up an IPFS pinning service, and store only the hash on-chain (which is a URL in IPFS-land).

  2. What are current blockers that create this issue?

I don't know the Flow architecture well enough to answer this question.

  3. In a perfect system where everything was possible, how would the metadata of an NFT be stored?

Different NFTs can represent radically different information, needing radically different data structures. There is no one-size-fits-all solution for NFT data representation.

  4. Is it possible for a User to store their own metadata for their owned NFTs? Why or why not?

That would effectively shard Flow storage. How would you achieve consensus on data representation?

  5. What has worked so far in existing projects that has moved us closer to solving this issue and where did they leave off and why could they not completely accomplish this?

Store real data on IPFS. Encode only the hash on the blockchain. The primary blocker for this on Ethereum is the cost of storing data on Ethereum ($millions/GB).

cybercent commented 3 years ago

An idea would be to use BSON (http://bsonspec.org) and store metadata on-chain.

Edit: Instead of imposing a predefined schema, we would let the developer decide the schema, but add an easy way to filter items (NFTs) in a collection based on the fields the metadata would contain.

  1. The NFT standard would have a metadata field of type bson.
  2. Cadence would provide methods to filter items in a collection based on metadata.

Reference for querying JSON data in Postgres (the operators used in the query examples are its jsonb operators): https://www.postgresql.org/docs/9.6/datatype-json.html

Example:

metadata = {
  "firstName": "Kevin",
  "lastName": "Durant",
  "season": 2019,
  "team": {
    "name": "Nets",
    "primary_position": "Small forward"
  }
}

Query examples

Filter by presence of a key.

metadata ? "firstName"
Response: all items in the collection that have the key firstName set.

metadata ?| ["firstName", "season"]
Response: all items in the collection that have one of the keys firstName OR season set. Hence the |.

metadata ?& ["firstName", "season"]
Response: all items in the collection that have both keys firstName AND season set. Hence the &.

Filter by checking inclusion of one JSON document in another.

metadata -> team @> {"name": "Nets"}
Response: all items in the collection that have the team name set to Nets.

metadata -> team @> {"name": "Nets", "primary_position": "Small forward"}
Response: all items in the collection that have the team name set to Nets and the team position set to Small forward.
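The filter semantics above can be sketched off-chain over plain dictionaries. The four helpers below are hypothetical stand-ins for the `?`, `?|`, `?&` and `@>` operators; `contains` only handles nested objects and scalar values, which is narrower than what Postgres supports, and none of this is an existing Cadence feature.

```python
def has_key(metadata, key):
    """Counterpart of: metadata ? key"""
    return key in metadata


def has_any(metadata, keys):
    """Counterpart of: metadata ?| keys"""
    return any(k in metadata for k in keys)


def has_all(metadata, keys):
    """Counterpart of: metadata ?& keys"""
    return all(k in metadata for k in keys)


def contains(value, pattern):
    """Counterpart of: value @> pattern (nested objects and scalars only)."""
    if isinstance(pattern, dict):
        return isinstance(value, dict) and all(
            k in value and contains(value[k], v) for k, v in pattern.items()
        )
    return value == pattern


collection = [
    {"firstName": "Kevin", "lastName": "Durant", "season": 2019,
     "team": {"name": "Nets", "primary_position": "Small forward"}},
    {"season": 2019},
    {"title": "no player fields"},
]

# Items whose team name is Nets.
nets = [m for m in collection if contains(m.get("team", {}), {"name": "Nets"})]
```

A scan like this is linear in the size of the collection; doing the same filtering on-chain would presumably need language or index support, which is what the proposal asks for.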

Field names

If needed, the on-chain metadata could have short names for keys to save on storage. The developer could make available the desired mapping on his website. The mapping would be used by other dapps and wallets that want to interact with NFTs created by the developer and need to display that info to the end user.

<head>
<meta name="flow-0x01.NBATopShot" content="https://my-dapp.com/0x01.NBATopShot.json">
</head>

0x01.NBATopShot.json could look like this:

{
  "en": {
    "firstName": "Player First Name",
    "lastName": "Player Last Name",
    "season": "NBA Season",
    "team": {
      "name": "Team Name",
      "primary_position": "Player primary position"
    }
  }
}
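A wallet or dapp could apply such a mapping roughly as follows. This is a sketch under the assumption that the mapping mirrors the nesting of the metadata; `label_metadata` is a hypothetical helper, and fetching the mapping file from the URL in the meta tag is left out.

```python
def label_metadata(metadata, labels):
    """Replace each key with its display label, recursing into nested objects."""
    labelled = {}
    for key, value in metadata.items():
        label = labels.get(key, key)
        if isinstance(value, dict):
            # Nested objects keep their own key; only their fields are relabelled.
            nested_labels = label if isinstance(label, dict) else {}
            labelled[key] = label_metadata(value, nested_labels)
        else:
            labelled[label if isinstance(label, str) else key] = value
    return labelled


# The "en" section of the mapping file, already parsed.
labels_en = {
    "firstName": "Player First Name",
    "lastName": "Player Last Name",
    "season": "NBA Season",
    "team": {"name": "Team Name", "primary_position": "Player primary position"},
}

display = label_metadata(
    {"firstName": "Kevin", "season": 2019, "team": {"name": "Nets"}},
    labels_en,
)
```

Keys that have no entry in the mapping are passed through unchanged, so a partial mapping still produces something displayable.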

alxocity commented 3 years ago

Not sure if there should be a metadata standard. Perhaps just a metadata field that is a custom Metadata resource per NFT. Then any client can traverse this resource to get relevant metadata. The contract can also define a metadata template field for recommendations on displaying the data using Handlebars or similar, although these metadata display templates are probably better suited as scripts rather than contract functions.

TL;DR: seems metadata can be stored however the dev wants, but have a standard for the metadata script used to read and format the data.

alxocity commented 3 years ago

And building standards for all complex data types will be beneficial. Although, you can jam almost anything into a string ;)

ericelliott commented 3 years ago

Many schemas that may be suitable for NFTs have already been defined and are available at schema.org including images, video, audio, and so on.

ERC-721 specifies a simple metadata schema that looks like this:

{
    "title": "Asset Metadata",
    "type": "object",
    "properties": {
        "name": {
            "type": "string",
            "description": "Identifies the asset to which this NFT represents"
        },
        "description": {
            "type": "string",
            "description": "Describes the asset to which this NFT represents"
        },
        "image": {
            "type": "string",
            "description": "A URI pointing to a resource with mime type image/* representing the asset to which this NFT represents. Consider making any images at a width between 320 and 1080 pixels and aspect ratio between 1.91:1 and 4:5 inclusive."
        }
    }
}

ERC-721 tokens get associated with their metadata on-chain using a tokenURI field in the token contract. Further information such as description and image are parsed from the corresponding JSON record referenced by URI. (See above). That URI generally references a JSON document and assets on IPFS.
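For illustration, a client that has fetched the JSON record behind a tokenURI could check it against the three properties in the schema above. This is a minimal standard-library sketch, not a full JSON Schema validator; the schema lists no required properties, so absent fields are accepted, and `check_erc721_metadata` is a hypothetical name.

```python
import json

# Properties the ERC-721 metadata schema declares, all of type string.
STRING_PROPERTIES = ("name", "description", "image")


def check_erc721_metadata(document):
    """Return a list of problems; an empty list means the record fits."""
    try:
        record = json.loads(document)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg}"]
    if not isinstance(record, dict):
        return ["top-level value must be an object"]
    problems = []
    for prop in STRING_PROPERTIES:
        if prop in record and not isinstance(record[prop], str):
            problems.append(f"{prop!r} must be a string")
    return problems


# "ipfs://example" is a placeholder URI, not a real asset.
ok = check_erc721_metadata('{"name": "Asset", "image": "ipfs://example"}')
bad = check_erc721_metadata('{"name": 7}')
```

Because the check runs in the client, nothing stops a contract from pointing at a record that fails it, which is part of why the thread discusses stronger on-chain guarantees.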

Open questions:

pizzarob commented 3 years ago

Using standard schemas for different types of assets from schema.org and specifying these from the start is a great idea. Case in point: the ERC-721 spec on Ethereum has a limited metadata schema definition, and now different platforms have different metadata shapes for different media types, which ruins the off-the-shelf interoperability between these platforms.

ericelliott commented 3 years ago

@pizzarob Yep. I just added the ERC-721 metadata schema to my comment, above. I agree that mappability to ERC-721 is an important feature to strive for.

MiyaSteven commented 3 years ago

Just catching up on this issue. RE: Joshua's June 16th comment

Questions:

bjartek commented 3 years ago

I made a little something to see if Mixins could be used to solve this. https://github.com/bjartek/flow-nft-mixin

ericelliott commented 3 years ago

@bjartek I think that's an interesting idea to add additional features to the NFT, such as artist royalties. I think smart contract composability is a great feature.

Are smart contract mixins idiomatic on Flow? I'd love to read more about them with more descriptions of how they work and more examples of using them.

bjartek commented 3 years ago

Mixin might not be the correct term to use here. But interface, contract, and capability are already used, so I just used mixin.

My reference for it is https://docs.scala-lang.org/tour/mixin-class-composition.html

Would Trait be better?

bjartek commented 3 years ago

It might be better for Trait just to be an interface that requires some methods. Like

And so on. Then you do not need it to own a resource, since it can be the resource itself.

dete commented 3 years ago

Here are a series of observations, opinions, and facts that will hopefully help this discussion:

dete commented 3 years ago

Are smart contract mixins idiomatic on Flow?

As I pointed out above, standard interfaces (like the Non-Fungible Token interface) define the minimum functionality for compliant implementations, and don't limit additional functionality. In particular, additional standards or optional extensions to existing standards can be published on-chain, and any implementation can choose to conform to as many of them as it wants, provided they don't conflict.

To use your example of royalties, you could define an extension of the NFT interface that includes royalty tracking. So long as it doesn't break the requirements of the base NFT interface, it would be seen as a generic NFT by use cases that don't have to worry about royalties, while adding the royalty functionality where appropriate.

ericelliott commented 3 years ago

Are interfaces on Flow first-class and composable? For example, can we say a contract implements the NFT, Royalties, and timelock interface?

dete commented 3 years ago

Are interfaces on Flow first-class and composable? For example, can we say a contract implements the NFT, Royalties, and timelock interface?

You bet! Provided those interfaces don't have any conflicts.

joshuahannan commented 3 years ago

struct and resource interfaces are composable, but contract interfaces aren't, correct? @turbolent ?

turbolent commented 3 years ago

If you mean "composable" as in, a concrete struct/resource/contract can implement multiple interfaces, then yes, all of them can do that (if it's not working right now it is a bug), e.g.

contract interface NFT { /* ... */ }

contract interface Royalties { /* ... */  }

contract interface Timelock { /* ... */  }

contract Cool: NFT, Royalties, Timelock { /* ... */ }

bjartek commented 3 years ago

But it is not possible to import the code for fulfilling that interface contract from another account, right? You will have to duplicate that yourself?

dete commented 3 years ago

That's right, @bjartek. We have some thoughts on how to implement code-reuse, but they are still only thoughts.

We definitely didn't want to use standard object inheritance, because it already has lots of weird edge cases in off-chain code, and we were pretty sure that it would be an absolute disaster in the context of smart contracts. Imagine if I could define a sub-class of CryptoKitty that overrode the breeding method! Or a subclass of Vault that "extended" the interface to include a "makeMeRich" method! Doesn't exactly fit the use case... 😀

bjartek commented 3 years ago

@dete exactly. That was part of the reasoning behind my mixin experiment. Each trait would store all its methods and state in a separate «namespace» inside the NFT.

Will this feature be done before mainnet launches? Or if not, how should we structure our NFT to be compatible?

In my case I want to store Art that could be unique or editioned. Either as a single «trait» or as two.

ericelliott commented 3 years ago

Are there any existing examples of royalty tracking we can look at?

cybercent commented 3 years ago

I'm part of a team working on an open marketplace on Flow and we would need to deal with metadata rather soon 😄 .

After reading your comments, and in order to keep it simple and flexible for the first NFTs created and aggregated, my proposition that is in fact a sum of your propositions is:

Metadata presence

Metadata format

Examples on how to format data using schema.org and JSON-LD are available here: https://developers.google.com/search/docs/data-types/book#structured-data-type-definitions

List of all the available schemas: https://schema.org/docs/schemas.html

Metadata storage

For metadata that is stored off-chain:

  1. a reference to the metadata file MUST be stored on-chain
  2. the metadata file MAY be named using the SHA2_256 hash of the file contents.

🔢 {metadata: "https:\/\/s3.amazonaws.com\/your-bucket\/your-folder\/{file-hash}.json"}

💡 Metadata translations can be managed using the "workTranslation" attribute inside the metadata file.

For metadata that is stored on-chain

  1. no other requirements

Example:

Deciding between on-chain and off-chain storage

A rough estimate of storage pricing is that 1KB of data would cost at least $0.01 to store forever.

Calculation method:

Flow account creation costs 0.1 FLOW and comes with 1KB of storage.

FLOW token price during the community sale (this is the current minimum): $0.1
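The calculation method can be written out as a few lines of arithmetic. This sketch only uses the figures quoted in this comment (0.1 FLOW per account, 1KB of included storage, $0.1 per FLOW), treats 1KB as 1000 bytes, and uses `Decimal` to avoid floating-point noise; `storage_cost_usd` is a hypothetical helper, and real pricing may differ.

```python
from decimal import Decimal

ACCOUNT_CREATION_FLOW = Decimal("0.1")  # FLOW charged to create an account
INCLUDED_STORAGE_BYTES = 1000           # storage that comes with it (1KB)
FLOW_PRICE_USD = Decimal("0.1")         # community sale price per FLOW


def storage_cost_usd(size_bytes):
    """Estimate the one-off cost in USD of keeping size_bytes on-chain."""
    usd_per_byte = ACCOUNT_CREATION_FLOW * FLOW_PRICE_USD / INCLUDED_STORAGE_BYTES
    return usd_per_byte * size_bytes


cost_1kb = storage_cost_usd(1000)
```

With these inputs 1KB comes to $0.01, 460 bytes to $0.0046, and 125 bytes to $0.00125.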

This is what 460B of metadata would look like:

{"@context":"https://schema.org","@type":"SportsTeam","name":"Seattle Seahawks","sport":"American Football","memberOf":[{"@type":"SportsOrganization","name":"National Football League"},{"@type":"SportsOrganization","name":"National Football Conference"},{"@type":"SportsOrganization","name":"NFC West Division"}],"coach":{"@type":"Person","name":"Pete Carroll"},"athlete":[{"@type":"Person","name":"Russell Wilson"},{"@type":"Person","name":"Marshawn Lynch"}]}

On-chain

Storing this metadata on-chain would cost $0.0046.

metadata: {"@context":"https://....."}

Off-chain

Storing the same metadata off-chain, and only the reference on-chain, would cost $0.00125, as the reference weighs 125B. In this case, you need to keep in mind that you will have additional charges from your cloud storage provider.

metadata: "https:\/\/s3.amazonaws.com\/your-bucket\/your-folder\/94fb72c123f82db7b19a23dceb0bf60f7c1fdfa8726b53379c0dcfa63e3b8b3c.json"

Using a file hash is optional but encouraged. 94fb72c123f82db7b19a23dceb0bf60f7c1fdfa8726b53379c0dcfa63e3b8b3c is SHA2_256({"@context":"https://..."})
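Deriving the file name from the content hash could look like the sketch below. The helper names and the compact, key-sorted JSON serialization are assumptions; any serialization works as long as the hash is computed over the exact bytes that get uploaded, and this example does not claim to reproduce the hash quoted above.

```python
import hashlib
import json


def metadata_file_name(metadata):
    """Name the file after the SHA-256 hash of its serialized contents."""
    # Sorted keys and fixed separators make the serialization deterministic.
    encoded = json.dumps(metadata, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest() + ".json"


def metadata_reference(base_url, metadata):
    """Build the on-chain reference that points at the off-chain file."""
    return {"metadata": f"{base_url}/{metadata_file_name(metadata)}"}


team = {"@context": "https://schema.org", "@type": "SportsTeam", "name": "Seattle Seahawks"}
reference = metadata_reference("https://s3.amazonaws.com/your-bucket/your-folder", team)
```

Because the name depends only on the content, any change to the metadata produces a new file name, so an on-chain reference can never silently point at edited data.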

bjartek commented 3 years ago

I modified my mixin/trait example from above to model a Trait as an Interface. https://github.com/bjartek/flow-nft-mixin/tree/trait

turbolent commented 3 years ago

I think using some established standard for structured data such as schema.org is a great idea!

It might be a good idea to implement the schemas as types to increase type-safety, instead of encoding the data in the JSON data model, i.e. as dictionaries, arrays, strings, etc.

bjartek commented 3 years ago

I think that sounds like a really sound idea @turbolent, to use the type system for what it is worth.

ericelliott commented 3 years ago

@turbolent @bjartek What's the difference between using a schema like schema.org vs using types? Is a schema not essentially a description of a complex type?

rheaplex commented 3 years ago

@ericelliott the difference is where they are enforced. A JSON-LD string representing schema.org data cannot be validated easily on-chain. A Cadence struct representing the same data can.
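The contrast can be shown with a loose Python analogy. This is not Cadence: a Cadence struct would be checked by the type system, whereas the hypothetical `SportsTeam` class below has to check its fields at construction time. The point is only that a typed value can reject bad data where a raw string cannot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SportsTeam:
    """Typed counterpart of a schema.org SportsTeam with two string fields."""
    name: str
    sport: str

    def __post_init__(self):
        # Python does not enforce annotations, so check the fields explicitly.
        for field_name in ("name", "sport"):
            if not isinstance(getattr(self, field_name), str):
                raise TypeError(f"{field_name} must be a string")


# A raw string is accepted as-is even though "name" is not a string.
raw = '{"@type": "SportsTeam", "name": 42}'

team = SportsTeam(name="Seattle Seahawks", sport="American Football")
```

Constructing `SportsTeam(name=42, sport="American Football")` raises `TypeError`, while the malformed `raw` string would be stored without complaint unless something parses and inspects it.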

rheaplex commented 3 years ago

I'm a fan of Dublin Core and its descendants for metadata -

https://en.wikipedia.org/wiki/Dublin_Core

For royalty tracking there are the various rights expression languages -

https://en.wikipedia.org/wiki/Rights_Expression_Language

I'm a fan of ccREL (having worked for CC... ;-) ) -

https://en.wikipedia.org/wiki/Creative_Commons_Rights_Expression_Language

These are all much older than schema.org, and schema.org is seeing usage outside of web sites now which was the main differentiator for these standards (they are used in asset management workflows, etc.).

But this is all much grander than a simple core of metadata attributes.

ericelliott commented 3 years ago

@robmyers I'm interested in developing a cross-chain standard for representing NFT metadata so we can easily port our NFTs between chains. IMO, anything that relies specifically on the Cadence language should be considered a compile/mapping target, not a candidate for an NFT metadata specification. I recognize that this discussion is about how we should represent NFT metadata specifically on Flow, but IMO, we should start from the point-of-view of cross-chain interoperability and composability and take advantage of Flow and Cadence-specific features only after we've decided on a cross-chain representation for metadata. E.g., how do we express these ideas in a document that can be read by many languages?

RE: RDF & ccREL - most of the RDF and Dublin Core documentation is focused on expressing metadata in HTML or XML. We'd need to translate vocabulary and examples to use a more modern and accessible format, such as JSON - but before we dive in deep and start on that work, there is a lot of overlap between ccREL and Schema.org. Are there specific differences you want to point out? AFAIK, ccREL is very simple, exposing just a handful of vocabulary words related to expressing rights, which could pretty easily be adopted into a new metadata standard for NFTs.

AFAIK, the vocabulary of ccREL is actually a bit too limited, as there's no easy way to express multiple collaborators or royalty splits with ccREL - you get one attribution slot and one attribution URL. I suppose you could put a collection of collaborators at the dereferenced URL, but how to do that or interpret that is not included in ccREL. Please let me know if I got that wrong. I have been using CC since its inception, but I'm not an expert on the metadata specs.

rheaplex commented 3 years ago

http://internft.org/ is working on something similar I believe but I haven't had a chance to look yet.

For on-chain data, Cadence data structures are the neutral format. RDF (and schema.org schema data can be encoded as RDF) has multiple representations. These older standards are more rigorous, and their being more minimal may be better for considering the requirements of an initial on-chain metadata standard that nonetheless goes further than ERC721 and clearly indicates how it might be extended.

I'd also look at PROV and MPEG 21 REL.

To be clear, I'm not suggesting that any of these be adopted as drop-in standards. :-)

cybercent commented 3 years ago

Having Schema.org with built-in types would be great.

As @robmyers mentioned, avoiding human errors and doing easy validation is a nice advantage. A second advantage could be that as on Flow a resource can own other resources, having the metadata typed could make it easier for multiple Dapps to interoperate as relationships could be built between resources based on the metadata.

I took a quick look at http://internft.org. Apart from the first pages there is not much to see, and the project looks abandoned.

Dublin Core is nice and succinct, but that's the downside too. I don't think it has enough vocabulary to power the metadata for a virtual world like Decentraland, so it might not be adopted by serious Dapps, which would be a pity.

Other blockchains could adopt Schema.org for metadata interop with Flow. It's neutral ground and not specific to Flow or Cadence itself. They could build their own internal types to match the schema specs if needed.

bjartek commented 3 years ago

More information on schema evolution https://www.inkandswitch.com/cambria.html

MiyaSteven commented 3 years ago

I'm learning about Salesforce Multi Tenant Architecture and it made me think of this. I'm assuming this is what has been mentioned before about building relationships to access metadata. The only difference is the fact that these are NFTs, but it's similar to this, I think: https://www.youtube.com/watch?v=Tuy_O37H3O8

MagicIndustries commented 3 years ago

Have you considered using DIDs? Part of the standard is they must point to a DID Document, which allows for all kinds of structured and auto-discoverable stuff, including ownership private keys, schema.org json schemas, content source links etc.

DanMacDonald commented 3 years ago

Flow has a Collector Role, Consensus Role, Execution Role, and Verification Role, and many of Flow's strengths are derived from separating concerns among those roles. What if there was a metadata storage role that handled the concern of file storage? The concern of this type of node would be provisioning cheap disk space (and appropriate bandwidth).

The analogous solution I'm thinking of is CDNs. A file is uploaded and distributed across the CDN's network of data centers in different geographical regions. A single URL is assigned to the file that can be used to retrieve it. When file requests are made, the CDN routes requests to the "nearest" datacenter to the requestor and provisions the response from that location.

What if the problem isn't to find a way to embed metadata into Flow accounts directly, but rather to persist it to a CDN-like role node and have the accounts store references to metadata on those CDN nodes, not unlike how git LFS works?

The benefits of this approach would be the ability to incentivize metadata/storage role nodes economically (like the other roles) and make metadata storage a native element of the platform that could be exposed through a consistent interface native to Cadence.

victor-geere commented 3 years ago

IPVS

You mean IPFS?

victor-geere commented 3 years ago

A schema will just add bloat like SOAP. Let the developers of each project decide how to interpret their metadata. If I issue a contract then I know how to interpret the contract. A version number could be handy but that can be incorporated into the contract itself. JSON is perfect for metadata.

cybercent commented 3 years ago

@victor-geere the whole point of this is to allow others to use the metadata you set: wallets, blockchain explorers, and other Dapps.

victor-geere commented 3 years ago

Have you had a look at the NFT schema editor on https://wax.atomichub.io/creator?

GarrettJMU commented 3 years ago

NFT schema editor on

This is blocked content on my end. Can you paste a screenshot?

ericelliott commented 3 years ago

@DanMacDonald makes a great suggestion. If we could give smart contracts on Flow first-class access to that data without oracles, that would be a significant advantage over "just use IPFS", and no, IPFS is not an adequate solution to use in place of a CDN. IPFS needs CDNs on top to make it adequately performant.

See also: Filecoin, Storj/Tardigrade, Theta Network, SIA. If a blockchain node could read data from a storage layer and use it in smart contracts, that would be amazing. Hashes can be verified for security, but we'd need to be sure the hash functions are resistant to preimage attacks.
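Verifying a hash after reading from a storage layer could look like the following sketch. It assumes the chain records a hex-encoded SHA-256 digest of the content, which is one possible choice rather than anything Flow specifies, and `matches_on_chain_hash` is a hypothetical helper.

```python
import hashlib
import hmac


def matches_on_chain_hash(content, on_chain_hex):
    """Check bytes fetched from storage against the digest recorded on-chain."""
    actual = hashlib.sha256(content).hexdigest()
    # compare_digest runs in constant time for inputs of equal length.
    return hmac.compare_digest(actual, on_chain_hex.lower())


content = b'{"name":"Example"}'
recorded = hashlib.sha256(content).hexdigest()  # what the chain would store

untampered = matches_on_chain_hash(content, recorded)
tampered = matches_on_chain_hash(content + b" ", recorded)
```

A check like this detects content that was swapped after the hash was recorded; it does not make the storage layer itself any more available or performant.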

0xJayShen commented 3 years ago

  1. We must standardize the most basic metadata fields so that cross-chain support with other chains can be implemented in the future.
  2. Now that there are storage chains, we should store data such as pictures on the chain, and we can decide which chains will be recognized, such as IPFS and AR (we can talk about that).
  3. If the second recommendation is rejected, the metadata will be outside the chain, so we should index the data, and the index should be stored on the Flow chain.
  4. We should also have a specification for the name of the function that reads metadata.

aturX commented 3 years ago

Have you considered using DIDs? Part of the standard is they must point to a DID Document, which allows for all kinds of structured and auto-discoverable stuff, including ownership private keys, schema.org json schemas, content source links etc.

https://ceramic.network/ They have good practices around DIDs.