filecoin-project / notary-governance

Modification: AC Bot - Thresholds to Enforce Compliance #986

Closed. ghost closed this issue 3 months ago

ghost commented 10 months ago

This is a continuation of the earlier proposal in #976 concerning the implementation of the Aggregation and Compliance (AC) bot. The bot is designed to compile data from different data sources and enforce Fil+ guidelines. Based on the tests we have conducted, this proposal outlines more refined thresholds the AC bot will use to make determinations on DataCap removal.

Test process

To simulate the AC bot's effect on the Fil+ process, we assessed how each client performed against each metric highlighted in the previous proposal. We then visualized the data using histograms to display the distribution of client scores across each metric. Based on insights from these histograms, we held an internal Fil+ governance team discussion to determine the appropriate thresholds. The subsequent sections outline the finalized thresholds and the resulting data, in that order.
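To make the visualization step concrete, here is a minimal sketch of the kind of per-metric histogram described above. The score values and the use of matplotlib are placeholders and assumptions for illustration, not the governance team's actual dataset or tooling.

```python
# Minimal sketch of the histogram-based visualization described above.
# The score values below are placeholders, not the actual test data.
import matplotlib.pyplot as plt

# Hypothetical CID-checker scores (percent) for a set of client addresses.
cid_checker_scores = [12.0, 34.5, 42.3, 67.8, 71.9, 88.1, 95.0]

plt.hist(cid_checker_scores, bins=20, range=(0, 100), edgecolor="black")
plt.xlabel("CID-checker score (%)")
plt.ylabel("Number of client addresses")
plt.title("Distribution of client scores for one compliance metric")
plt.show()
```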

Thresholds

Below is the proposed threshold schedule for DataCap removal. Enforcement will begin once the bot is deployed and will gradually be tightened over an 8-week period; this incremental approach will allow clients to adapt to the changing compliance standards. Failing to meet these thresholds will result in the automatic creation of a DataCap removal proposal.

| Metric | Week 1-2 | Week 3-4 | Week 5-6 | Week 7-8 |
| --- | --- | --- | --- | --- |
| CID-checker score | >25% | >50% | >75% | >95% |
| Retrieval Bot score | 0% | >10% | >25% | >75% |
| Claimed SP count | >0 | >0 | >0 | >5 |
| Actual SP count after fourth allocation | >5 | >5 | >5 | >5 |
| Percent of actual SPs in claimed list | >0% | >0% | >0% | 100% |
| KYC check | 0 | 0 | 0 | 1 |
| SP unique locations | 1 | 1 | 3 | 4 |
| Percent of properly replicated data | >0% | >0% | >25% | >75% |
| Max percent data stored by top provider | <75% | <50% | <40% | <35% |
| Shared data percent | <20% | <5% | 0% | 0% |

For the metrics "Claimed SP Count" and "Actual SP Count," we currently lack sufficient data to enforce immediate thresholds. As a result, we suggest a 6-week data collection phase to gather the necessary insights. More about this is described in the "Test Results and Rationale" section.
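As an illustration of how these determinations could be made, the following sketch evaluates a client against the week-dependent schedule above. The metric keys, the phase calculation, the inclusive comparison handling, and the metrics shown are assumptions for the sketch, not the AC bot's actual implementation.

```python
# Illustrative sketch only: evaluating a client against the threshold schedule.
# Metric keys and comparison handling are assumptions, not the AC bot's actual code.

# Each entry: (metric key, direction, thresholds for weeks 1-2, 3-4, 5-6, 7-8).
SCHEDULE = [
    ("cid_checker_score",    "min", [25, 50, 75, 95]),  # percent; client must exceed
    ("retrieval_bot_score",  "min", [0, 10, 25, 75]),   # percent
    ("max_pct_top_provider", "max", [75, 50, 40, 35]),  # percent; client must stay below
    ("shared_data_pct",      "max", [20, 5, 0, 0]),     # percent
    # ... the remaining metrics follow the table above
]

def failed_metrics(client_metrics: dict, current_week: int) -> list:
    """Return the metrics a client fails in the current phase of the schedule.

    Comparisons are simplified to inclusive bounds; the table above uses
    strict > / < notation.
    """
    phase = min((current_week - 1) // 2, 3)  # weeks 1-2 -> 0, ..., weeks 7-8 -> 3
    failures = []
    for key, direction, thresholds in SCHEDULE:
        value = client_metrics.get(key)
        if value is None:
            continue  # no data collected yet; see the data-collection note above
        threshold = thresholds[phase]
        if direction == "min" and value < threshold:
            failures.append(key)
        elif direction == "max" and value > threshold:
            failures.append(key)
    return failures

# Any failure would trigger the automatic creation of a DataCap removal proposal.
```

For example, a client with a CID-checker score of 40% would pass in weeks 1-2 but fail from week 3 onward, once the threshold tightens to 50%.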

Next steps and call to arms

If there are no objections and we obtain the required support for implementing the AC bot with these thresholds, we will proceed with GitHub integration and deployment. Please note that the capability for DataCap removal is contingent upon that functionality being developed in line with this proposal.

If you support this implementation, please indicate so in the comment section below. If you believe the proposal should be amended, we welcome your input in the comments as well. Additionally, you're welcome to reach out to any member of the Fil+ team or directly to @philippe pangestu on Slack for further discussion.

Test Results and Rationale

The testing phase involved data analysis of 229 client addresses with active GitHub applications, as well as the subset of 116 clients who have initiated their applications within the past three months. The following graphs describe how client addresses have scored against each of the metrics:

CID-checker score

[Screenshot: histogram of client CID-checker scores]

Retrievability bot score

[Screenshot: histogram of client Retrieval Bot scores]

Claimed SP count

[Screenshot: histogram of claimed SP counts]

Actual SP count after fourth allocation

[Screenshot: histogram of actual SP counts after the fourth allocation]

Percent of actual SPs in claimed list

[Screenshot: histogram of the percent of actual SPs in the claimed list]

KYC check

[Screenshot: histogram of KYC check results]

SP unique locations

[Screenshot: histogram of SP unique location counts]

Percent of properly replicated data

[Screenshot: histogram of the percent of properly replicated data]

Max percent data stored by top provider

[Screenshot: histogram of the max percent of data stored by the top provider]

Shared data percent

[Screenshot: histogram of shared data percent]

kernelogic commented 10 months ago

It looks like a good approach that considers actual scenarios. However, I just have one question: how should this be understood?

Shared data percent <20%

Shared with what? Other LDNs from other people, or the same dataset across a series of LDNs? And how would it be reduced to 0% once it's there? By terminating sectors?

As we know, the same dataset prepared by different DPs using the same software, e.g. Singularity, can possibly produce the same CAR files.

spaceT9 commented 10 months ago

Great proposal, more permissionless!

nicelove666 commented 10 months ago

Agree

nicelove666 commented 10 months ago

Currently the bot is deploying a new version and has stopped working; it seems it has been 2 weeks. Looking forward to the bot resuming operation.

cryptoAmandaL commented 10 months ago

> Currently the bot is deploying a new version and has stopped working; it seems it has been 2 weeks. Looking forward to the bot resuming operation.

So, has the issue with the bot been resolved?

nicelove666 commented 10 months ago

> Currently the bot is deploying a new version and has stopped working; it seems it has been two weeks. Looking forward to the bot resuming operation.
>
> So, has the issue with the bot been resolved?

No, the bot is still not working. It seems that the AC bot needs to be online to resume operations.