Open godfreyhobbs opened 6 years ago
Nice!
Issue Status: 1. Open 2. Started 3. Submitted 4. Done
This issue now has a funding of 0.7 ETH (419.71 USD @ $599.58/ETH) attached to it.
Only a 2 week window for this — closed already?
16 days left
I can see that for x = IRISmax and y = IRISmax, f = IRISmax/3, which seems counterintuitive. It's also quite counterintuitive that the user's own attestation counts (that) negatively. The max for f is when y = IRISmax and x = 0, in which case f = IRISmax. Again, it's a little bit counterintuitive that in order to maximize my IRIS score, I should have IRIS(user_x_appreciation_of_data) = 0. To make f zero, you have to have x = IRISmax (I guess that's a malicious submission) and y = 0, i.e. the crowd believes it's useless. This makes sense.
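The three corner cases above can be checked with a few lines of Python. This is only a sketch: it assumes the formula has the form f(x, y) = (-2x + y + 2 * IRISmax) / 3, because that is the variant that reproduces the values quoted in this comment, and it assumes a hypothetical IRIS_MAX of 10.

```python
from fractions import Fraction

IRIS_MAX = 10  # assumed scale; the thread only names the symbol IRISmax


def f(x, y, iris_max=IRIS_MAX):
    """Score from the owner's self-rating x and the crowd's median rating y.

    Assumes the 2 * IRISmax offset, which maps the result into [0, IRISmax].
    """
    return Fraction(-2 * x + y + 2 * iris_max, 3)


# The three corner cases discussed above.
both_max = f(IRIS_MAX, IRIS_MAX)   # owner and crowd both rate at the maximum
best_case = f(0, IRIS_MAX)         # owner rates 0, crowd rates at the maximum
worst_case = f(IRIS_MAX, 0)        # owner rates at the maximum, crowd rates 0

print(both_max, best_case, worst_case)
```

Because the coefficient of x is negative, f decreases as the self-rating rises for any fixed crowd rating, which is the counterintuitive part being pointed out.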
So I made a couple of mistakes here. In the formulas above, instead of:
IRIS(attestation_for_data_of_user_y) = { IRIS(user_x_appreciation_for_data_of_user_y), if no IRIS(other_users_appreciation_of_data_of_user_y) ; f(median(IRIS(other_users_appreciation_for_data_of_user_y)), IRIS(user_x_appreciation_for_data_of_user_y)), otherwise}
should be:
IRIS(attestation_for_data_of_user_y) = { IRIS(user_y_appreciation_for_data_of_user_y), if no IRIS(other_users_appreciation_of_data_of_user_y) ; f(IRIS(user_y_appreciation_for_data_of_user_y), median(IRIS(other_users_appreciation_for_data_of_user_y))), otherwise}
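As a rough illustration, the corrected piecewise rule could be written like this in Python. The function and parameter names are made up for the sketch, IRIS_MAX = 10 is an assumption, and the constant term of f is left as a parameter because the thread mentions both an IRISmax and a 2 * IRISmax variant.

```python
from statistics import median

IRIS_MAX = 10  # assumed scale


def f(x, y, offset=2 * IRIS_MAX):
    """x: the owner's own rating, y: median of the other users' ratings."""
    return (-2 * x + y + offset) / 3


def attestation_score(self_rating, other_ratings, offset=2 * IRIS_MAX):
    """Piecewise rule: the self-rating alone until anyone else has rated."""
    if not other_ratings:
        # No other attestations yet: the score is just the owner's own rating.
        return self_rating
    # Otherwise combine the self-rating with the median of the crowd's ratings.
    return f(self_rating, median(other_ratings), offset)


print(attestation_score(7, []))
print(attestation_score(4, [6, 8, 10]))
```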
The idea behind f(x, y) = (-2x + y + IRISmax) /3 is as follows: the problem of the value of information (carat of diamonds, ...) is a dual issue, like a Cartesian plane, where the vertical axis, representing Imaginary numbers, is your imagination or your opinion of the value, and the horizontal axis, representing Real numbers, is the reality check or others' opinion of the value of the info.
Theoretically, the most valuable information, or person, or diamond, or whatever is that which considers themselves not of great importance, but others see great potential. The self-approximation is more important since the Real axis might not exist from the beginning.
You are by no means incentivized either to super-grade your data or to ultra-degrade it. But you are incentivized to be careful in making your decision, which may result in lower self-adjustment, yet higher other-adjustment. This is because your IRIS score is first given by your own measurement => you would like higher self-grade. You would also like higher self-grade since you will be graded faster by others who see a great IRIS next to your profile because they also get that same IRIS when there is no other attestation than theirs.
Theoretically, the most valuable information, or person, or diamond, or whatever is that which considers themselves not of great importance, but others see great potential.
"Theoretically" based on which theory? "the most valuable [...] is that which considers themselves not of great importance" - I don't quite see that. I could see why something of great importance might be obvious to the owner as such. "others see great potential" - or they might not... this is "wisdom of the crowd" VS "wisdom of experts". It's opinion - not something we can easily conclude or verify. Except if there's some obvious theory I'm not aware of.
This is because your IRIS score is first given by your own measurement => you would like higher self-grade. You would also like higher self-grade since you will be graded faster by others who see a great IRIS next to your profile because they also get that same IRIS when there is no other attestation than theirs.
I can see some game-theoretic argument here, which uses, though, mechanics that I don't right now see anywhere defined. More specifically:
I don't get those fine details of this method. When I put whatever I understand into code, this is what I see:
You can find the code here, and in the same directory there's the Jupyter notebook with the analysis. I used the original formula with the -2 * IRISmax because with just -IRISmax it doesn't normalize properly in the 0-10 range.
Overall any such approach seems to me to have an optimal vote for the owner of the data. E.g. it's always optimum for them to vote 7 or 8 (right now the optimal seems to be 0... but whatever). If they do so, then given arbitrary evaluation, they won't damage their reputation, but they won't damage their datapoint starting odds either. Both in the simulation and overall, we're overlooking the fact that nobody would buy a data-point with self-rating e.g. '2'... even if it's really trash and the median would reward it with actually... rating it as trash.
Assuming there's an optimum voting strategy for the owner (e.g. vote 8), the rest of the formula is just median. That's fine - but isn't immune to all real-life threats, e.g. lobbies that vote certain data-points and/or anti-vote other people or datapoints.
What Linnia seems to need is Google's page-rank. There are four problems though:
a) Google's page-rank is too expensive to implement on blockchain
b) Google's page-rank doesn't work in reality
c) Google's page-rank with all the Google hacks doesn't work in reality in a trustless environment
d) We don't even know if a single IRIS score can be defined
On a) - yes - page-rank should be a native Ethereum operation along with other ranking mechanisms. They will get there one day... but not soon.
On b) - link farms etc. make Google page-rank not work in practice. The way Google ranks is an ongoing process that keeps Google's results of acceptable quality. It has 1000's of features for each page, and quite a few of them are user-provided, e.g. average time before you return to Google, i.e. stuff that is quite a bit more than just page-rank.
On c) - Google in the stone age started with .edu and other domains having high seed rank before doing random surfing with teleportation, i.e. it wasn't trustless, but exactly the opposite. More recently, Google does tons of work testing and manually adjusting ranking parameters to ensure safe and relevant results. This is done by operators who are Google employees. It's not algorithmic. I believe that if there was an algorithmic solution to the problem, Google wouldn't employ people to adjust the algo.
On d) - Google now has a huge feature vector for each one of us and matches stuff according to how good they are for who we appear to be. There's no single "rank" for a page anymore. In Linia's context, I can see different types of researchers to have different data needs.
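For readers unfamiliar with the "seed rank" and "random surfing with teleportation" mentioned under c), here is a minimal textbook-style sketch in Python. It is not Google's production ranking. The function name, the damping value of 0.85 and the tiny example graph are all assumptions chosen for illustration.

```python
def pagerank(links, seed=None, damping=0.85, iterations=100):
    """Power iteration over a link graph.

    links: dict mapping each node to the list of nodes it links to
           (every target must also appear as a key).
    seed:  optional teleport distribution (node -> probability, summing to 1);
           defaults to uniform, i.e. no node is trusted more than another.
    """
    nodes = sorted(links)
    teleport = seed or {node: 1.0 / len(nodes) for node in nodes}
    rank = {node: teleport.get(node, 0.0) for node in nodes}
    for _ in range(iterations):
        # With probability (1 - damping) the surfer teleports to a seed node.
        new_rank = {n: (1 - damping) * teleport.get(n, 0.0) for n in nodes}
        for node in nodes:
            targets = links[node]
            if targets:
                # Otherwise the surfer follows one outgoing link at random.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: hand its rank back to the teleport set.
                for target in nodes:
                    new_rank[target] += damping * rank[node] * teleport.get(target, 0.0)
        rank = new_rank
    return rank


graph = {"a": ["c"], "b": ["c"], "c": ["a"]}
uniform = pagerank(graph)
seeded = pagerank(graph, seed={"b": 1.0})  # "b" plays the role of a trusted seed
print(uniform)
print(seeded)
```

Putting all the teleport mass on "b" raises its rank even though nothing links to it, which is the non-trustless element being described: whoever picks the seed set shapes the result.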
Good book on the subject:
https://www.amazon.com/Whos-1-Science-Rating-Ranking/dp/069116231X/
The cover looks silly but there's quite a bit of good mathematics in there, and it's very well written.
Ok, I wrote some very stupid comments and then I deleted them.
I think you may be right. I'll have to come up with something different. I'll think.
@uivlis @satoshi101 Thanks for the thoughtful discussion. It is really awesome.
Yes, it may be best to have a set of domain-specific IRIS scores.
It may be a useful exercise to pick a specific real-world use case and walk through how a domain-specific IRIS score would work.
@uivlis
Say I have a precious gem, the most precious in the world. Would I knock on everyone's door saying look upon my precious gem? Certainly not, for it would get stolen, even if they acknowledge that it is precious. However, if somebody saw the precious gem that I keep hidden, they would ask me to let them see it a bit more. And they would desire it, precisely because I keep it hidden.
Implicitly we might see traces of a "marketplace" here. The idea that IRIS is the "fair price", i.e. a value that, if someone pays it, lets them own that datapoint so that future cashflows go to them, sounds interesting. Effectively, every datapoint can be a non-fungible token with its own history of exchanges at different price-points. Far-fetched, but it could work. If one can't "steal" the gem and own future cashflows or future appreciation, then it doesn't matter if you show it or hide it. :) But if ownership of the data changes, then it's a whole different story.
@godfreyhobbs - I think that you mentioned in the past, the idea of "IRIS score" providers that have different levels of trust or credibility? - Which is not that decentralized of course, but it might be realistic. Something like "credit rating agencies".
Overall, this problem - for me - is way too complex for a bounty, let alone one with such broad and strict-looking requirements. I would be very surprised if a "final solution" was found here.
Just to mention that the vision described by @satoshi101, derived from my aphorisms, of Linnia being a marketplace of data implies that there is no IRIS beforehand, but rather that "the marketplace" is the IRIS and decides it.
implies that there is no IRIS beforehand, but rather that "the marketplace" is the IRIS and decides it
Correct. It's quite challenging to set up a functional marketplace though. To first order, all you seem to need is an ASK and potentially a BID IRIS in IRIStokens. Then you need to have a record of at least the last transaction (if any). Ownership of the datapoint should be transferred to the new owner whenever ask/bid prices cross.
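A minimal Python sketch of that first-order rule might look like the following. Everything here is hypothetical: the class and method names, integer prices in IRIStokens, the single best bid, and the choice to settle at the ask price are assumptions, not something defined in the thread or in the Linnia contracts.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Datapoint:
    owner: str
    ask: Optional[int] = None                   # price the owner would accept
    best_bid: Optional[Tuple[str, int]] = None  # (bidder, price offered)
    last_price: Optional[int] = None            # record of the last transaction
    history: List[Tuple[str, str, int]] = field(default_factory=list)

    def set_ask(self, caller: str, price: int) -> None:
        if caller != self.owner:
            raise PermissionError("only the owner can set the ask")
        self.ask = price
        self._match()

    def place_bid(self, bidder: str, price: int) -> None:
        # Keep only the highest bid seen so far.
        if self.best_bid is None or price > self.best_bid[1]:
            self.best_bid = (bidder, price)
        self._match()

    def _match(self) -> None:
        if self.ask is None or self.best_bid is None:
            return
        bidder, bid = self.best_bid
        if bid >= self.ask:
            # Prices cross: transfer ownership (settling at the ask, by assumption).
            self.history.append((self.owner, bidder, self.ask))
            self.last_price = self.ask
            self.owner = bidder
            self.ask = None
            self.best_bid = None


dp = Datapoint(owner="alice")
dp.set_ask("alice", 100)
dp.place_bid("bob", 80)     # below the ask: nothing happens
dp.place_bid("carol", 100)  # crosses the ask: carol becomes the owner
print(dp.owner, dp.last_price, dp.history)
```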
Specifically in the case of Linnia, a data point might be leased to a user for single use. This has a price set by the owner. The owner has an incentive to set a "fair" price compared to the alternatives so that people lease the datapoints. Otherwise competitors' datapoints will be leased more often. If someone plans to lease a datapoint many times, or if they believe that the datapoint is valuable and that the market will likely appreciate that, they will want to buy it instead of leasing it. For the first transaction, we expect someone to lease a datapoint at least once before buying it. Otherwise they would have to trust the original owner and the quality of the data they provide. Someone might decide to buy the data without seeing them, but we expect most people to want to first see/use the data before they decide to own them. Once someone owns a datapoint, they might decide to amend the current lease price. The owner is the receiver of IRIStokens from data leases.
Note that the original issuer of the data loses control of those data, permanently, as soon as the first transfer of ownership happens, i.e. they can't reclaim the data or take them offline. There might be a provision to be able to take the data offline if you buy them back (potentially at some other IRIStoken price point) and you're the original owner. Generally we don't expect data and their history to go away.
This is the basic fabric of a marketplace based on datapoints as non-fungible tokens (à la CryptoKitties) + a liquid crypto-token, IRIStoken. It seems to me like those rules could be coded in a smart contract. I'm not sure how efficient/scalable an implementation could be. The non-fungible token standard ERC-721 could be a starting point in terms of interface.

One thing that is a bit worrying is that many of those "tokens" might be trash or spam tokens. Someone or a farm might create millions of fake datapoints to collect the first lease. Another reason people might create or even trade fake datapoints (in order to increase their apparent value) would be to influence research by injecting tons of fake data that represent, for example, themselves or some distribution of choice. For example, people might try to associate white caucasians with high blood pressure by forging data. This means that we still have the original problem, which is where normal attestations from doctors or issuers come from (along with the associated centralization). Meaning - that for example a blood sample datapoint must be entered into the system or attested by an accredited healthcare provider. Could a healthcare provider collude with real or manufactured patients to manipulate data? Highly likely, but it might be too much trouble. On the other hand, incidents like these ["Fake Clinics Outnumber Abortion Providers 10 to 1 in Texas"] mean that it's not impossible that medical legal entities collude to push an agenda.

Another issue is that if the ratio of market participants to datapoints is, for example, 1:1,000,000, it will be really hard for this marketplace to retain efficient market assumptions. There might be points in time during the development of the system where those conditions hold true. At that point trash and spam datapoints might easily overwhelm market participants, who won't have the time or money to evaluate all those datapoints. All those issues seem to make such a system lean towards measures for evaluating market participants too, e.g. banning bad actors. This would be damaging to decentralization, and could substantially increase the complexity of implementation, but this might be inevitable.
It's quite challenging to set up a functional marketplace.
Currently, with this and other bounties, we are trying not to start with a given solution but allow the Linnia Community the freedom to come up with new solutions that we have not already considered.
@satoshi101 Thanks for your insights. Linnia will primarily be about empowering individuals, so data will be leased, not purchased outright. In this way, the individual will always be able to revoke access. The revoke action may happen manually or be triggered by a policy (#35).
That said, consortiums may form and act as proxies for a large number of users. Again, the concept of policy-based sharing permissions becomes critical (bounty Issue #35). Consortiums could be either of the following:
trusted
centralized actors
I don't understand exactly what the above means.
I can understand an element of "don't sell" and that there should be the right to revoke. This makes it even more difficult to create an efficient marketplace with the IRIS score as "value unit". A marketplace might be a way out of the 'carat' corner and all those "specific model for each domain" requirements, which don't take any advantage of blockchain and are probably just a matter of 5-10 years of consortium work with paid experts from each domain you are targeting.
Actually, in the context of the whitepaper, I don't even see what the requirement of this bounty is.
There are also some contradictions in the whitepaper:
i.e.
In the code, records have IRIS and users have provenance.
@satoshi101
consortium as part of a data marketplace
When I mentioned consortium I was thinking about something like the data-labour union mentioned in this article. The consortium may help make the Linnia marketplace fair and efficient.
IRIS score.
It is likely that the consortium or data-labour union would not play a role in the IRIS score algorithm.
@uivlis Hello from Gitcoin Core - are you still working on this issue? Please submit a WIP PR or comment back within the next 3 days or you will be removed from this ticket and it will be returned to an ‘Open’ status. Please let us know if you have questions!
@godfreyhobbs is this one good to pay out to anyone or should I cancel the bounty?
@uivlis @satoshi101 @tcsiwula we have created an interface to clarify how IRIS fits into the Linnia protocol.
Here is the update with some tests: https://github.com/ConsenSys/Linnia-Smart-Contracts/pull/101/files
@satoshi101 Your comment was very helpful:
There's no single "rank" for a page anymore. In Linia's context, I can see different types of researchers to have different data needs.
I have introduced the following mapping to the linniaRecords.sol.
mapping (address => uint256) irisProvidersReports;
The irisProvidersReports mapping will allow many different providers to each use their own algorithms. This means that there is no single "rank". Instead, each researcher can choose a different set of weights for the entries in irisProvidersReports.
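To make the "different set of weights" idea concrete, here is a small off-chain sketch in Python. It is not part of the Linnia contracts: the function name, the use of a weighted average, and the example addresses and scores are all assumptions. The only thing taken from the thread is that each provider address maps to one reported score.

```python
def combined_iris(reports, weights):
    """Weighted average of provider reports for one record.

    reports: provider address -> score reported by that provider
             (mirrors the shape of the irisProvidersReports mapping).
    weights: provider address -> weight chosen by the researcher.
    """
    # Only providers the researcher trusts (weight > 0) and that reported count.
    used = {p: w for p, w in weights.items() if p in reports and w > 0}
    total = sum(used.values())
    if total == 0:
        raise ValueError("no weighted provider has reported on this record")
    return sum(reports[p] * w for p, w in used.items()) / total


# Hypothetical reports for one record from two providers.
reports = {"0xA": 8, "0xB": 4}

# Two researchers weighting the same reports differently.
print(combined_iris(reports, {"0xA": 3, "0xB": 1}))
print(combined_iris(reports, {"0xB": 1}))
```

The same record ends up with a different score for each researcher, so no single global rank is ever computed.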
Context
Linnia is a core component of the future of the web; Web 3.0. Linnia is a new Ethereum Blockchain protocol that brings the power of decentralization to your lifetime data. The Linnia protocol provides the foundation for secure decentralized applications in multiple spheres including the sphere of electronic healthcare records.
What
We would like to incentivize you, as a member of the Gitcoin/Bounties family, to innovate and help create the Linnia IRIS score algorithm. IRIS stands for InfoRmation Integrity Score. The Linnia IRIS score is a critical part of the Linnia protocol. We will be awarding the submission for this task a 0.7 ETH bounty, assuming the requirements below are met. The proposed Linnia IRIS score algorithm must be in the spirit of the following two Linnia papers:
In particular, the following section describes the IRIS score:
Note: Linnia is a WORK IN PROGRESS. The Linnia smart contracts are only a small subset of what is described in these papers.
Our Ideas
Requirements
game
the IRIS score algorithm