Closed by trentmc 2 years ago
These are the steps needed for a contribution in the "Unleash data" category:
From this point on, this is a pure-profit endeavour, as the participant will then make $OCEAN on each sale of the data asset.
A proposal to the category will then provide funding to pay for these steps. With Ocean Data Farming, these sales will then be incentivised a second time.
This will then add even more $OCEAN on top of all data sales.
The other categories, by contrast, do not lead directly to profit plus Data Farming rewards. So applying to this category and trying to hit datasets that sell is highly lucrative: it entails no risk for the proposal creator but potentially yields maximum returns.
The new rules for the category ask for proof of sales before investing further in a project that wants to do this. As the process of getting to a good amount of data for sale can be very expensive, a lot of investment might be needed for such a project.
I would suggest asking for some proof of usefulness, and requiring positive user feedback on the data before investing more in a dataset. Proposals to create a dataset should also estimate its total cost up front; going round by round is going to be difficult for voters to understand. Also, the benefit to the ecosystem of one more listed dataset is relatively small compared to other tech building more generalised solutions on the protocol.
A Discord poll has been launched for this issue: https://discord.com/channels/612953348487905282/776848812534398986/920221130701275156
Discord discussions relevant to this issue.
Tobias @tmanthey link:
What worried me today is that Ocean rather burns a Million USD than fund data projects.
Ocean is giving up its fundamental grant category. The only category that builds the "data economy" rather than tools for the data economy.
Imo double standards are applied when claiming that data projects did not yield data consume, while ignoring that most if not all other projects did not yield hard returns either. So the argument "77k did not yield data consume, so why should we fund it" could basically be applied to many projects that consumed even more.
If the fundamental concepts on how the data economy works are not explored, what tools are we building? I'd like to reference my proposal I made in the data-consume wg. [see below]. Refine "unleash data" category to build high quality promotional data assets for 1 Ocean that then get distributed by the teams incentivised by data farming rewards
Tobias @tmanthey link:
My proposal is to:
1.) Refine "Unleash data" category to "Promotional data". Data under this category must be published on Ocean Market for e.g. 1 Ocean.
=> This has a first positive effect on the DAO as it increases the utilization of the available funding, adds more projects to the DAO and leads to more competition.
2.) "Promotional data" funding synergizes with Data Farming. Creating data and publishing is not enough for DCV. What we need is an incentive to actively promote and support the published data. For this aspect data farming is the perfect incentive. To avoid double dipping I would deduct farming rewards from grant requests. But this creates an incentive to create farming rewards higher than the grant funding cap per project.
The following messages are replies to the previous message.
Tobias Manthey - 12/06/2021
This is also the difference from Scott's proposal. If we just buy the data we don't solve the problem of actively promoting and supporting this data. This is a requirement as the buy side is currently not present on Ocean Market. I strongly discourage the idea of putting arbitrary data on the market. To be valuable, data must serve a purpose. If you can't define the purpose of your data it's almost guaranteed garbage.
Data is the AI equivalent of "sourcecode". A valuable program cannot be created by grabbing arbitrary open source code. The value of source code comes with the problem it solves as well as the team that understands what it does and how to adapt it.
Idiom | Ocean - 12/06/2021
V4 is tackling issues upstream with token price, liquidity provision, and other details that keep the market from getting to DCV effectively.
I believe information has been shared on the designs of V4 for a while, but would recommend you look at different channels.
Scott M3 - 12/07/2021
Ideally we need multiple marketplaces serving many different markets built on Ocean protocol. Someone could make a very successful data marketplace which sold data sets for no more than $20. The datasets wouldn't even need to be that complex. All they need is data providers and lots of buyers. In order to create the next 'amazing AI data marketplace', one approach is to pay to create incredible data sets and then list them on the marketplace for sure. However, if you were to create a new real estate agency, you probably wouldn't build 10 mansions and then try and sell them.
Tobias Manthey - 12/07/2021
What's someone going to do with e.g. 20 annotated images for $20? It's basically impossible to create a larger dataset based on many small ones. The result would be so inconsistent that it is not of value to anyone.
Then: a 21-message sub-thread with Tobias and Scott here.
Tobias Manthey - 12/07/2021
To compare the AI & data market to established markets like housing or gaming is a mistake imo. E.g. if you want to buy a house, you know the standards and where to look for offers.
That's not the case for AI & Data
"Data" is not standardized whatsoever and traded basically exclusively OTC. The AI industry is immature and also completely lacking standards of any kind.
Data markets are not established as a source of AI data. So if you would buy 10 valuable datasets you'd still have the problem that they are unlikely to be found as nobody considers using data markets.
Also I'd compare data more to source code. While a house simply has its value, the value of source code also depends on the team that can maintain and adapt it.
That's why I am in favor of building data. We need not just data, we also need sales. Data is typically very industry specific. You have to have the contacts within the industry to sell it, and with data farming we give a good incentive to do so.
What worried me today is that Ocean rather burns a Million USD than fund data projects.
This is a valid concern.
There's intense discussion to change this to "recycle" instead, see:
From the sentiment in that issue / poll, the consensus seems to be to recycle not burn.
Discord poll for this topic can be found here.
Here's what I see.
"Unleash data" was never meant to be only to "get a grant to construct/improve your own handful of datasets, to sell". It had (and has) broader intent, such as: become a data broker to get data owners' data into Ocean Market, integrate Filecoin & publish on Ocean Market, etc. If the community was to go for this broader intent, then it's more viable to un-bound "unleash data" with the same parameters as other categories. Teams can still submit "construct/improve"; though if a lot of OCEAN is requested to create a handful of datasets that might be sold for $$, OceanDAO voters may think it's poor value-add, and they'll vote accordingly.
Details:
Datapoint: "unleash data" category was never intended as solely "get a grant to construct/improve your own handful of datasets, to sell". It was actually much more open-ended than that, you can see examples in the OceanDAO wiki Proposal Ideas under "Unleash Data" Category. This includes two big categories:
(Details in the bottom half of this comment)
Tobias' suggestion of "promotional data" fits nicely into this too. There can even be free data - eg unlock the 20K datasets of OpenML - given that there can be value-add (sell algorithms that run on top of the data; provenance of data + compute; etc).
"Unleash data" in such a way is clearly an important category to the community: a working group has even been created solely to focus on it! ("Data Consume Volume WG").
In practice, the only teams that had applied in this category were doing just one thing: "construct/improve datasets". It was this "in practice" fact that led to "unleash data" being constrained as a category.
What I see is: if the community was to go for the broader intent of "unleash data" such as becoming a data broker, integrating Filecoin, etc rather than just "construct/improve" datasets, then it's much easier to re-open "unleash data" to have the same parameters as other categories. It still includes "construct/improve", though if a lot of OCEAN is requested to create a handful of datasets that might be sold for $$, this is probably poor value-add and OceanDAO voters may not take kindly to it.
With these expectations, unleash data gets treated like the rest of the categories. No longer bounded.
Details 2: From OceanDAO wiki, Proposal Ideas:
Business / networking approaches
Integrations with data services
The idea is that you can unleash more data via integrations into other data services, such as Filecoin or Chainlink. The integration should make it easy to buy & sell access to a specific data asset as an Ocean datatoken. One good way to implement ideas below is to fork Ocean Provider code, make the change, and tune things for good UX.
The Parameters & Roadmap WG met yesterday to finalize this issue. Discord votes are taken as an input sentiment, not as the final say. The final say is in the WG.
Discussion:
Tobias: (repeated proposal on "promotional data"; and) Data farming won't be for a while yet
Trent: (read out summary of thoughts). I therefore voted for making it par
Robin: It's good to be able to have projects fund themselves. Double dipping is obvious, but it's also fine, probably not so heavy. When DF comes, there will be fewer of these. If I'd try to aggressively abuse this system, I'd get OPF to pay me to create the data.
Tobias: suggest add rule: if get $ from unleash data to create datasets, then if team comes back to the DAO for grant requests, then deduct DF rewards from grant request.
Trent: how to implement? Might be hard
Trent: if we did go back to par on this, good to communicate the expectations. "Unleash data" was never meant to be only to "get a grant to construct/improve your own handful of datasets, to sell". It had (and has) broader intent, such as: become a data broker to get data owners' data into Ocean Market, integrate Filecoin & publish on Ocean Market, etc
Tobias: datapoint: first commercial dataset on Acentrik was funded by the DAO. We left a good impression there. This is the first return of the dataset funding back to the ecosystem.
Robin: plus the C2D stuff on Tobias' datasets (Evotegra)
Roberto: I'm happy to open this up. Important to guide projects towards constructive outcomes. I fear that as we invest in datasets, we shoot some cannonballs rather than try different things. I'd hope that it funds other activities: rather than gathering / sharing / data, it's for things like data union models for people to join the dao. Or rather than paying for the data, we move into models perhaps working tech for people to gather the data. Move towards mining and providing right incentives for participants to share the data. "Fire bullets, not cannonballs"
Trent: I would personally love to see the 20,000 OpenML datasets and 1000 HuggingFace datasets and models
Robin: but these are open
Trent: yes. But USP when combine with C2D (priced compute)
Roberto: be more broad. Some models aren't ready to be consumed for data science. Would be great to have a bunch of data from various subgraphs of TheGraph into here. This would take a lot of effort to put together. Eg I saw a group that dumped all mirror from Arweave into a repository. I really want to promote us having more integrations, raw data or otherwise, to provide the supply where the rubber meets the road. Bullets not cannonballs.
Tobias: my approach is looking through glasses of AI company, from perspective of potential consumers. Useful to have a handful of promotional datasets.
Trent: flames are around speculation (datasets about datasets, datasets about NFT speculation), let's incentivize that more
Roberto: +1 to the above. Many of us are in many OceanDAO WGs. We're in a good position to provide good feedback, continuously shepherding initiatives.
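On Trent's "how to implement?" question about Tobias' suggested rule (deduct DF rewards from later grant requests): the arithmetic itself could be as simple as the sketch below. This is only an illustration, not an agreed OceanDAO mechanism. The function name, the use of `Decimal` amounts in OCEAN, and the floor at zero are all assumptions, and the harder part (tracking which DF rewards belong to which funded team) is not covered.

```python
from decimal import Decimal


def adjusted_grant_request(requested_ocean: Decimal, df_rewards_ocean: Decimal) -> Decimal:
    """Deduct Data Farming rewards already earned from a new grant request.

    Hypothetical helper: the name and the floor-at-zero behaviour are
    assumptions for illustration, not part of any existing OceanDAO rule.
    """
    if requested_ocean < 0 or df_rewards_ocean < 0:
        raise ValueError("amounts must be non-negative")
    # If DF rewards exceed the request, the adjusted grant is zero, never negative.
    return max(Decimal("0"), requested_ocean - df_rewards_ocean)
```

For example, a team requesting 10000 OCEAN that has already earned 2500 OCEAN in DF rewards would be eligible for 7500 OCEAN under this sketch; a team whose DF rewards exceed its request would be eligible for 0.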
Q: if make "unleash data" par again (same conditions as other categories), and communicate the expectations that it's more broad (see above), any objections?
Tobias: in time we have to adjust by DF rewards
Q: if make "unleash data" par again (same conditions as other categories), communicate the expectations that it's more broad (see above), and update before DF, any objections? (no objections raised)
Decision: make "unleash data" par again (same conditions as other categories), communicate the expectations that it's more broad (see below), and update before DF
Expectations to communicate:
Discord poll: https://discord.com/channels/612953348487905282/908016760190537798/920222881378607135
Background
"Unleash data" is a category side-by-side with "outreach" and other categories.
As of R11, "unleash data" funding got restricted compared to other categories. It was like that for R12 too.
Alternatives
Approach 1 - status quo - restricted
Approach 2 - Make "unleash data" par with the rest again.
Approach 3 - other? Suggestions?