filecoin-project / notary-governance


Modification: Combatting Fraud and Maintaining Integrity: Proposals for Ensuring a Valuable Filecoin Network #813

Closed herrehesse closed 3 months ago

herrehesse commented 1 year ago

Issue Description

Dear Filecoin Community,

In the current Filecoin ecosystem, datacap is a highly sought-after commodity. As a result, it is not at all surprising that entities try to amass this commodity without adhering to the program rules. This is particularly true for entities that are in a financial bind or operate large cloud computing operations with mainly CC sectors. These entities will often go to great lengths to convert their sectors to ones that contain datacap deals. Although this behavior is not surprising, it is noteworthy that, after extensive research on the LDNs, only a small percentage of large miners actually engage in fair practice. Most of them have chosen a more fraudulent path to growth, largely due to the lack of adequate supervision.

The responsibilities and expectations of a storage provider (SP) who operates with good intentions include:

It is a fair and equitable exchange for storage providers who meet the established requirements and actively fulfill their responsibilities to receive support and increased revenue from the community in the form of a multiplier.

However, the majority of storage providers (SPs) do not adhere to the responsibilities and requirements mentioned above, particularly when it comes to maintaining high standards of data integrity and availability. This trend needs to be addressed and rectified. It is important to keep in mind that the scope of data storage and network power being discussed is not limited to mere terabytes or petabytes, but rather extends to an exabyte and beyond. It is a more significant issue than one may initially realize.

It has been made alarmingly clear that certain large storage providers have been colluding to make hundreds of datacap requests for their own benefit, effectively converting their entire mining operations to verified sectors with the multiplier. This is not only a flagrant violation of the rules, but a brazen act of fraud and theft, stealing revenue from all the entities in Filecoin who are working hard to play by the rules. This behavior is unacceptable and must be addressed immediately.

Impact

It is disheartening to see that while many genuine companies and businesses are working tirelessly to build and contribute to the Filecoin ecosystem, there are bad actors who are exploiting the system for their own gain. Take PiKNiK for example, a company that has invested millions to recruit new storage providers and make the network more valuable. Or SEAL Storage, a company that has always been committed to storing humanity's most important information at storage providers around the world, with a focus on high-quality research data. And let's not forget the countless software companies that are investing in the protocol to add value and make the ecosystem more accessible for businesses and individuals.

These examples illustrate the hard work and dedication of many players in the Filecoin ecosystem. However, it is a harsh reality that not everyone shares these values. We must stop assuming that everyone has good intentions and start looking at the facts. The fact is that many datacap requests are not being made with the best intentions, and it is the responsibility of the applicants, data preparers and storage providers to prove otherwise.

Datacap is a valuable and scarce resource that should be used to benefit the entire Filecoin network. Its abuse not only undermines the integrity of the blockchain, but also undermines the hard work of those who are trying to build a better, more valuable ecosystem. It is our duty to hold those who abuse the system accountable, and ensure that the value of datacap is protected for the benefit of all.

Proposed Solution(s)

It is our recommendation that moving forward, datacap should only be allocated to applicants that have been proven to operate with good intentions and that strict adherence to the following rules should be upheld:

In our view, it is essential that there is a clear and defined set of expectations and requirements for entities seeking datacap. Currently, many of these expectations are left ambiguous in an effort to be inclusive; however, it has become evident that this approach has led to widespread abuse and is detrimental to the growth and success of the Filecoin network. The time has come to establish clear and specific guidelines to ensure that only well-intentioned entities are able to acquire datacap and contribute to the network's progress.

Proposed guidelines for the Filecoin+ program:

A separate but critical aspect to consider is the definition of "valuable" data. The FIL+ program is intended to support the storage of humanity's most important information, as stated in its description. However, the determination of what constitutes "valuable information" is subjective and can vary among individuals and organisations. Therefore, it is imperative that clear and specific rules, rather than guidelines, are established to define what data is and is not acceptable for the program. This will ensure that the program's purpose is upheld and that all entities understand the expectations and requirements for participating in the FIL+ program.

We believe that the FIL+ program should have strict requirements for applications, as the datacap provided in return is extremely valuable. It is only fair that in exchange for this valuable resource, the Filecoin community should expect a significant contribution or added value to the network from the applicant, data preparer, and storage provider. If the applicant is not able to meet these expectations, the community is more than willing to assist with regular paid deals. Allowing non-valuable data to be stored through the FIL+ program would dilute the program's purpose and undermine the efforts of storage providers who are truly committed to this mission.

Moreover, by requiring other data to move through regular paid deals, it ensures that the scarce and valuable datacap is being utilized efficiently and effectively. This helps to ensure that the program is being used for its intended purpose and that the resources such as datacap are being directed towards preserving and storing important information. In addition, it also ensures that the network is not overcrowded with non-valuable data, which can bring down the overall quality of the network.

Examples of data that can be considered valuable and contribute to the Filecoin network include:

It is with great disappointment that we have discovered, after thorough research, that a majority of applications submitted are not genuinely focused on storing valuable data as defined above, but are instead primarily focused on gaining access to the multiplier benefits provided by the FIL+ program.

We have found and should not allow instances of the following:

These fraudulent activities are not only a violation of the integrity of the network but also have no intention of contributing to the Filecoin network's goal of storing humanity's most important information. It is imperative that these types of applications are not supported and strict measures are taken to prevent them from accessing datacap and diminishing the value of the network.

This situation is unacceptable and cannot be allowed to continue. The purpose of datacap is not to enable fraud, but rather to reward storage providers who are committed to storing humanity's most important information and contribute to the overall integrity and quality of the network. Fraud should be kept to a minimum, and all entities should be held accountable for their actions.

In light of this, we will also propose a new FIP (Filecoin Improvement Proposal) that would allow for the removal of datacap from entities who are proven to have obtained it fraudulently. This would prevent them from receiving the long term rewards associated with their actions and prevent them from continuing to harm the integrity of the network and the broader Filecoin community. It is essential that this type of behavior is not tolerated and that the network is protected from those who seek to exploit it for their own gain.

In conclusion, it is important to remember that we are all here for the same goal: to make decentralized storage a reality and create a more valuable Filecoin network. Any actions that undermine this potential, whether it is through fraud or other unethical behavior, must be held accountable. We must work together to ensure that the network is protected and that the scarce resources of datacap are used efficiently and effectively.

Let's remain civil in our discussions and be transparent in our intentions and actions. In this way, we can work towards achieving the shared goal of creating a more valuable and sustainable Filecoin network for everyone. We welcome and value all feedback and opinions regarding the proposed rules and guidelines outlined above.

NSC-FIL commented 1 year ago

This proposal is well defined and touches on many issues discussed in the early days of the Filecoin+ program's development, in particular: 1) How to incentivize the storage and availability of "humanity's most important information". 2) How to audit and punish violators of the mission and defined rules regarding Filecoin+ (i.e. a tribunal of notaries / ecosystem stakeholders that determines whether infractions have occurred and how to penalize the violators). 3) A "credit system" for clients, SPs and notaries participating in the Filecoin+ program to determine "good actors" and "bad actors". We have several for overall SP reliability, but not specifically for all Filecoin+ participants.

Many of the points made in this proposal are valid and worth moving to specific FIPs, phasing in each step while prominently announcing each to the wider community (preferably translated into native languages). This would help alleviate any shocks to members of our ecosystem. Preparing to move Filecoin+ to L2 after the FVM is launched could be a later phase IMO, but immediate attention to guiding notaries, clients and SPs, particularly developing an audit/penalty mechanism, is needed ASAP.

One suggestion that I would modify is the requirement to have boost for retrievals. Any boost-like method that allows retrievals at reasonable levels (which would need to be clearly defined like in MB/sec ranges) should be sufficient IMO.

hyunmoon commented 1 year ago

The use of Virtual Private Networks (VPNs) is strictly forbidden, as it makes it impossible to track and verify data distribution.

As long as one's VPN server is placed within their country, I think it should be considered OK so they can defend themselves against DDoS attacks.

Other than that, I really like the specific examples provided.

flyworker commented 1 year ago

Cancel the FIL+, user need to pay, problem resolved

hyunmoon commented 1 year ago

Cancel the FIL+, user need to pay, problem resolved

That has been my point of view all along but seeing an improvement attempt like this makes me hopeful about the program for the first time.

xinaxu commented 1 year ago

Requiring storage providers to provide their company name and proof of location to prevent the use of virtual private networks (VPNs)

A VPN is okay if the storage provider discloses their actual locations and those locations comply with the distribution policy as defined in this thread. The goal is not to forbid VPNs, but to stop storage providers from using a VPN to disguise their actual location and violate the rules. There can be valid use cases for a VPN, such as encrypting traffic.

herrehesse commented 1 year ago

@xinaxu agreed. Will edit.

xinaxu commented 1 year ago

Is this proposal already approved within the T&T working group, or are you proposing this to be reviewed by the T&T working group? How do you plan to drive this to some degree of consensus across the community and notaries? The proposal contains lots of bullet points and the detail of each one is up for debate (e.g. why 6 replicas and not 4).

Reiers commented 1 year ago

Hi @herrehesse! I think this would get more 👀 and feedback if it were opened in https://github.com/filecoin-project/FIPs/discussions. Having it under issues, in between all the requests, is not optimal.

Please post there, whenever you are ready - and I will join in then 👍

herrehesse commented 1 year ago

Hello @Reiers friend! A formal FIP draft will be finished by Friday and I will definitely post it in the /discussions channel!

Thank you for the reminder.

herrehesse commented 1 year ago

@xinaxu

Is this proposal already approved within the T&T working group or are you proposing this to be reviewed by the T&T working group?

How do you plan to drive this to some degree of consensus across the community and notaries?

The proposal contains lots of bullet points and the detail of each one is up for debate. (i.e. why 6 replicas not 4)

panges2 commented 1 year ago

@herrehesse

In light of this, we will also propose a new FIP (Filecoin Improvement Proposal) that would allow for the removal of datacap from entities who are proven to have obtained it fraudulently.

From a tooling standpoint, the removal of datacap from clients is already being developed right now. We're putting greater priority on this, seeing now its importance.

@Reiers, I agree this should be in discussions. Also if you post there, there won't be spam from the LDN bot 😂

@NSC-FIL I'm trying to build exactly those things. Expect more discussion posts about this soon. Building incentive and credit systems for notaries has been top of mind for me, and I'm open to discussing the ideas you have so far. If you want to reach out on FF Slack: @Philippe Pangestu

alchemypunk commented 1 year ago

https://github.com/filecoin-project/notary-governance/issues/813#issuecomment-1387273258

Cancel the FIL+, user need to pay, problem resolved

See through the essence at a glance. But the community will not let you do this, they need short-term benefits to survive.

kernelogic commented 1 year ago

I agree with most of the points, and I also agree with the comments on VPNs: they're not all evil. They can be used:

  1. To have a firewall
  2. To have better retrieval speed
  3. To escape government sanctions

And since we are talking about numbers here, I'd like to see an allowance of at most 3 copies per city. The limit of 1 copy per city is too restrictive.

We also need to consider the number of SPs in each continent. Continents with a higher number of SPs should be allowed more copies. It is not practical to allocate an equal number of copies to every continent.

SBudo commented 1 year ago

Agree on the proposal with exception to:

xinaxu commented 1 year ago

Proposed guidelines for the Filecoin+ program

I suppose this guideline is a recommendation, not mandatory. Many of the points below can easily have exceptions.

A minimum of six data replicas must be maintained to ensure data redundancy and availability.

Originally, this was 4 replicas. Is there a reason to increase it to 6? Are we talking about 6 different minerIds, organizations or locations? We also need to define what 6 replicas means, e.g.

  1. Each unique piece of data (pieceCid) must be sealed by 6 different entities (minerId/organization/location)
  2. Each entity (minerId/organization/location) cannot use more than 20% of the total datacap
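
The two readings above can be expressed as a quick check. The following is a minimal Python sketch, not part of any Fil+ tooling: it assumes each deal is a hypothetical `(piece_cid, entity, size_bytes)` tuple and that "20% of total datacap" means an entity's share of all bytes in the supplied deals. The function name and defaults are illustrative only.

```python
from collections import defaultdict


def check_distribution(deals, min_replicas=6, max_share=0.20):
    """Check a list of (piece_cid, entity, size_bytes) deals against two rules.

    Returns (under_replicated_pieces, over_share_entities), both sorted.
    "entity" is whatever unit the community settles on (minerId,
    organization or location); this sketch treats it as an opaque string.
    """
    entities_per_piece = defaultdict(set)
    bytes_per_entity = defaultdict(int)
    total_bytes = 0

    for piece_cid, entity, size_bytes in deals:
        entities_per_piece[piece_cid].add(entity)
        bytes_per_entity[entity] += size_bytes
        total_bytes += size_bytes

    # Rule 1: every piece must be sealed by at least min_replicas distinct entities.
    under_replicated = sorted(
        piece for piece, entities in entities_per_piece.items()
        if len(entities) < min_replicas
    )
    # Rule 2: no entity may hold more than max_share of all sealed bytes.
    over_share = sorted(
        entity for entity, used in bytes_per_entity.items()
        if total_bytes and used / total_bytes > max_share
    )
    return under_replicated, over_share
```

Under these assumptions, a client whose single piece is spread evenly over six entities passes both rules, since each entity holds about 16.7% of the bytes. Repeating a piece on one entity fails both.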

Data must be distributed across a minimum of three continents to ensure geographical redundancy and accessibility.

I think that's too strict. Some clients require data to be stored inside their country. IMO, a single continent is fine as long as it is stored in different locations.

Miners must be reachable at all times, with a minimum uptime of 98%, a maximum of 6 days of annual downtime

IMO that is really up to the client to decide, but we can provide a minimum bar. Since you already mentioned 6 replicas, 90% uptime for a single miner will give you 99.9999% combined uptime, so I'd either remove this requirement or relax it.
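
The 99.9999% figure follows if miner outages are assumed to be independent: the data is unreachable only when all replicas are down at once. The sketch below works through that arithmetic in Python. The function names are illustrative, and the independence assumption is a simplification that real-world correlated outages would violate.

```python
def combined_availability(single_uptime, replicas):
    """Probability that at least one of `replicas` independent copies is up."""
    return 1.0 - (1.0 - single_uptime) ** replicas


def annual_downtime_days(uptime, days_per_year=365):
    """Days per year a single miner may be down at a given uptime fraction."""
    return (1.0 - uptime) * days_per_year


def uptime_for_downtime(days_down, days_per_year=365):
    """Uptime fraction implied by a yearly downtime budget in days."""
    return 1.0 - days_down / days_per_year


if __name__ == "__main__":
    print(f"6 replicas at 90% each: {combined_availability(0.90, 6):.4%}")
    print(f"98% uptime allows {annual_downtime_days(0.98):.1f} days down per year")
    print(f"6 days down per year implies {uptime_for_downtime(6):.2%} uptime")
```

Under the same arithmetic, the two numbers in the quoted requirement do not quite match: 98% uptime corresponds to about 7.3 days of downtime per year, while a 6-day budget corresponds to roughly 98.4% uptime.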

We have found and should not allow instances of the following

Lots of great examples. Right now we are relying on notaries to do due diligence, and each notary is doing it differently. Should we instead agree on a datacap application verification workflow so that all of the below can be avoided (like E-FIL+)?

In addition to your proposal, I think a notary voting system would be really useful for reaching agreement on questions like whether to revoke a notary or client, or whether 1PiB of open garage video is considered valuable.

DLTX-Github commented 1 year ago

I have to say that the amount of effort and time you have put into the investigations and into formulating all communications, including this proposal, is nothing short of legendary. Kudos for that, and thank you for pushing so hard to help make the Filecoin network better.

herrehesse commented 1 year ago

@SBudo @xinaxu - Edited the VPN usage part.

herrehesse commented 1 year ago

@Kevin-FF-USA @raghavrmadya Can someone disable the bot here?

herrehesse commented 1 year ago

@xinaxu

I suppose this guideline is a recommendation, not mandatory. Many of the points below can easily have exceptions.

Originally, this was 4 replicas. Is there a reason to increase it to 6? Are we talking about 6 different minerIds, organizations or locations? We also need to define what 6 replicas means, e.g.

I think that's too strict. Some clients require data to be stored inside their country. IMO, a single continent is fine as long as it is stored in different locations.

IMO that is really up to the client to decide, but we can provide a minimum bar. Since you already mentioned 6 replicas, 90% uptime for a single miner will give you 99.9999% combined uptime, so I'd either remove this requirement or relax it.

Let's not forget, we can set high standards for this program as the incentive we give away as a community is extremely valuable and profitable. We CAN ask for things that are not easily done or difficult to execute. If entities do not want to adhere, then opt for regular paid deals? If you want free data storage and a revenue stream, then follow the high standards and rules set for the program.

Valuable, retrievable, usable and distributed data storage in exchange for datacap. That is the tradeoff and nothing less.

cbtan21 commented 1 year ago

DaYouGroup commented 1 year ago

As a libertarian, I don't think more demands are beneficial. This would also restrict anyone who wants to join, which is probably bad for filecoin. I hope that filecoin can be like the early Bitcoin and ETH, and anyone who is interested can participate and contribute with a very low threshold. I think by making filecoin retrievable, these problems will naturally disappear.

flyworker commented 1 year ago

  • We should take regular deals out of this conversation. The data shows that less than 1% of the deals are regular deals, suggesting no one is doing that.

The reason regular deals are less than 1% is Fil+. People will pay if there is no Fil+; in 2021, 98% were regular deals. To me the current economic model is unhealthy: SPs pay for storing data.

cbtan21 commented 1 year ago

That is precisely my point in taking it out of the conversation: now we have to debate regular deals vs Fil+ instead of the governance of Fil+.

The reason regular deals are less than 1% is Fil+. People will pay if there is no Fil+; in 2021, 98% were regular deals. To me the current economic model is unhealthy: SPs pay for storing data.

I am not in a good position to debate regular deals vs Fil+, as I have too many unanswered questions:

  1. 98% of how much data? Is that a large enough sample size? My understanding is that deals grew around 20x to 500 PiB in the past year. Are we talking about 98% of 25 PiB vs 99% of 500 PiB?
  2. Of the 98%, is it all valuable / actual data? Was there checking involved? (This discussion thread centers on the checking aspect.)
  3. Why was Fil+ rolled out if regular deals were working?
  4. Are regular deals currently less than 1% because Fil+ works as intended, or for other reasons?
  5. On "people will pay if there is no Fil+": they can still pay even with Fil+ (nothing is stopping them now). If they were paying and currently are not, does that suggest the driver behind their actions is primarily economic, rather than other reasons like security or stability? If SPs were attracting these target users with lower prices than centralized solutions in the first place, then isn't it even more attractive to target the same users with negative pricing, which is now doable because of cryptoeconomic incentives? Also, if decisions are made based on economics, then the market decides the pricing, which is a result of supply and demand. Hypothetically, if Fil+ is at 10E now, I suspect the market may price it very differently.

dkkapur commented 1 year ago

@herrehesse thanks for putting this together, great to see something concrete that we can build off of as a community.

As you know, I personally have tried to set a higher standard for data onboarding through policy modification and adjustment on the Slingshot side, but have stayed away from pushing the same constructs on all verified deals. The main reason for this, as you alluded to, is that it is very hard to generalize and apply a standard set of rules across every case deemed to be useful across the world. The point of having regionally distributed notaries in the Fil+ world is so that decisions can be nuanced in whatever way makes sense, and there is tolerance in the system for edge cases. This specific piece is what I disagree with:

[...] however, it has become evident that this approach has led to widespread abuse and is detrimental to the growth and success of the Filecoin network.

It has led to widespread abuse, but I think the net impact is closer to neutral-to-positive at the moment rather than outright detrimental. Every system with competition where there is room for abuse will result in rational actors looking to abuse it. Every economy in the world faces this issue. We will continue to face this issue. We need to curb the abuse to a tolerable amount, i.e., < 5%. However, given the current scale of the program, the network, the implied slices and size of the pie, we have so much more to grow as well, so we need to reduce abuse while still scaling up to ensure Filecoin actually delivers value to the whole world (the whole of it, not just the slices that make sense to you or to me).

A great example is the CID-checker-tool that was released in Dec. A lot of that type of analysis was done by me and others in the community individually throughout 2022 and used to identify cases of potential abuse of the system for notaries applying in the Q2 election cycle. However, not setting policies in stone enabled us to identify edge cases and follow up with them, which then led to (1) behavior correction from clients that put them down a much safer path with replica distribution and (2) notaries that were trigger happy becoming significantly less so. This continues to be the case at a much larger scale now, thanks to the automation.

The core takeaway I'd like to push is: instead of setting hard rules and policies, we should be setting examples of what good looks like, and then measure each application against it. Anything that deviates substantially is worth digging deeper into. As the de facto entry point for onboarding data into Filecoin, this community gets everything from a first-time client who has no idea what deals even are all the way to the most sophisticated SP operation impersonating arbitrary companies around the world to onboard useless bits. We need to handle it all with grace. Not just for this program's sake, but for the network itself, all its stakeholders, and the rest of the web3/crypto/blockchain community.

However, we can and should get a lot more sophisticated about what we measure against. Your list is a great starting point. With that in mind, I would like to recommend that we take the list of defined expectations you have established, work with the notary and broader Fil+ community to finalize them, and then build tools (as the T&T WG has been doing for a narrower scope) that help keep DataCap applicants and their progress in check. I do have thoughts on the list you have proposed as well, but wanted to first align philosophically. Nothing I've said is "program policy" or set in stone, these are my opinions and I will stay open minded to the best of my ability.

Separately, the implication in the current status quo and what I stated above is that the network will continue to place a lot of trust with Notaries. The list of things you flagged as types/examples of abuse were not surprising to me. Lots of applications come in, but not everyone gets DataCap either, and not everyone should. I'd love to hear from notaries on things we can collect from clients or on behalf of clients that reduce the friction to making more accurate determinations on the trustworthiness of clients faster.

In light of this, we will also propose a new FIP (Filecoin Improvement Proposal) that would allow for the removal of datacap from entities who are proven to have obtained it fraudulently. This would prevent them from receiving the long term rewards associated with their actions and prevent them from continuing to harm the integrity of the network and the broader Filecoin community. It is essential that this type of behavior is not tolerated and that the network is protected from those who seek to exploit it for their own gain.

Definitely in favor of exploring this conversation. The economics seem complex, but let's work through it 🦾 and see what is reasonable.

herrehesse commented 1 year ago

@dkkapur Thank you for your detailed response to our proposal. Although we don't agree on every point, we share common goals and have a clear understanding of the issues at hand. We would appreciate it if you could give your opinion on our most recent update, which can be found on our Slack channel.

https://filecoinproject.slack.com/archives/C01DLAPKDGX/p1674206285725749

Effective immediately, while investigations into potential non-compliance by storage providers and notaries are ongoing, I am requesting the direct implementation of certain requirements for participation in the Filecoin+ program.

While we may hope for individuals to exhibit positive behaviour, there is no guarantee that this will prevent misuse. Instead, let's establish clear guidelines for the program and focus on verifiable facts rather than relying on trust.

Let's minimise trust as much as possible and rely on verifiable facts and proof.

dkkapur commented 1 year ago

ACK - posted a response in Slack. @herrehesse can we pick one location to continue the conversation? It seems to me that this is still at a Discussion phase, so I think it makes sense to continue either in Slack or in a Discussion topic, and return to this Issue with a summary before the next governance call or major action.