Open martplo opened 1 week ago
@martplo Allocator Application Compliance Report
4.95 PiB granted to clients:

| Client Name | DC |
|---|---|
| webeye.io | 50 TiB |
| zinc15 | 50 TiB |
| VshareCloud | 50 TiB |
| K12 International Education Platform | 100 TiB |
| w3s and the users of our tools | 2 PiB |
| CommonCrawl | 250 TiB |
| OpenDataLab | 2.5 PiB |
Example 1 - webeye.io KYB and KYC were performed. This looks like one of the older applications, as it can be traced back to the Fil+ large datasets repo: https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/2328 I assume this is the same application. If this was a known client, though, why didn't the allocator follow its own rules and grant an initial 5% instead of starting with 50 TiB? The application was closed due to no response from the client.
Example 2 - zinc15 This dataset seems to have been stored on Filecoin several times: https://github.com/search?q=repo%3Afilecoin-project%2Ffilecoin-plus-large-datasets+zinc3d&type=issues Did the allocator clarify this with the client before starting to cooperate with them? KYC was performed, and additional questions were asked.
Example 3 - VshareCloud Many additional questions were asked before the first allocation was granted. KYC was performed. The application was closed after the initial 50 TiB because it was determined that the client did not fit this allocator. There is no CID report to analyse.
Example 4 - K12 International Education Platform The client requested 5 PiB and declared 4 data replicas of 512 TiB each. With this dataset size, 2 PiB (4 × 512 TiB) should be enough, but the allocator didn't raise it. The first allocation of 50 TiB showed that the client was not compliant with the rules. Why was another 50 TiB granted? What would that prove? Also, it looks like this dataset was stored on Filecoin before by the same client: https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/2283
Example 5 - w3s and the users of our tools The client requested 2 PiB and declared 10 data replicas of 600 TiB each. With this dataset size, a minimum of about 6 PiB (10 × 600 TiB) should have been requested, but the allocator didn't raise it.
Also, searching for the data sample link in the large datasets repo returns 5 records with very similar applications: https://github.com/search?q=repo%3Afilecoin-project%2Ffilecoin-plus-large-datasets+bafybeid5jpdqzlb4tqsd6peoa7qstoxat3ovsg62wutyp4gnzqbqsggfsq&type=issues
KYC was performed, yet no additional questions were asked, not even about data preparation or the SP list.
Most SPs have good retrieval rates, yet 2 out of 8 are below 1%. Also, CID sharing occurred, and the allocator didn't bring that up.
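The sizing checks raised in Examples 4 and 5 come down to multiplying the declared replica size by the replica count and comparing the result with the requested DataCap. A minimal sketch of that arithmetic follows; the function names are hypothetical, and it assumes binary units (1 PiB = 1024 TiB) and that every replica is a full copy of the dataset.

```python
TIB_PER_PIB = 1024  # assumption: binary units, 1 PiB = 1024 TiB


def required_datacap_tib(replica_size_tib: float, replicas: int) -> float:
    """DataCap needed to store `replicas` full copies of the dataset."""
    return replica_size_tib * replicas


def sizing_gap_tib(requested_tib: float, replica_size_tib: float, replicas: int) -> float:
    """Requested minus required DataCap.

    Positive means the client asked for more than the declared replicas need;
    negative means the request cannot cover the declared replicas.
    """
    return requested_tib - required_datacap_tib(replica_size_tib, replicas)


# Example 4 (K12): 5 PiB requested, 4 replicas of 512 TiB each.
k12_needed = required_datacap_tib(512, 4)
k12_gap = sizing_gap_tib(5 * TIB_PER_PIB, 512, 4)

# Example 5 (w3s): 2 PiB requested, 10 replicas of 600 TiB each.
w3s_needed = required_datacap_tib(600, 10)
w3s_gap = sizing_gap_tib(2 * TIB_PER_PIB, 600, 10)

print(f"K12 needs {k12_needed} TiB, gap {k12_gap:+} TiB")
print(f"w3s needs {w3s_needed} TiB, gap {w3s_gap:+} TiB")
```

Under these assumptions K12 needs 2048 TiB (exactly 2 PiB) against a 5 PiB request, while w3s needs 6000 TiB (roughly 5.86 PiB, which the review rounds to 6 PiB) against a 2 PiB request.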
Example 6 - CommonCrawl The allocator asked many additional questions. The process of resolving the application details seems very thorough.
In their application, the allocator stated that DC would be allocated in 5/15/30/50% tranches. However, in this client's case, the first three allocations were 50 TiB, 50 TiB, and 150 TiB, which does not fit the declared rules. Where did these changes come from?
2 out of 9 SPs have a retrieval rate of 0%, 1 has less than 75%, and the rest look good.
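One way to see that 50 TiB, 50 TiB, and 150 TiB cannot come from a 5/15/30/50% schedule, without knowing the total requested, is to compute the total each tranche would imply and check whether they agree. The sketch below is illustrative only: the helper names are hypothetical, and it assumes tranches are exact percentages of a single total with no rounding.

```python
from fractions import Fraction

# Declared schedule from the allocator application: 5%, 15%, 30%, 50%.
SCHEDULE = [Fraction(5, 100), Fraction(15, 100), Fraction(30, 100), Fraction(50, 100)]


def implied_totals(allocations_tib, schedule=SCHEDULE):
    """Total DataCap each tranche would imply if it followed the schedule."""
    return [Fraction(a) / share for a, share in zip(allocations_tib, schedule)]


def fits_schedule(allocations_tib, schedule=SCHEDULE):
    """True if all tranches imply the same total (assumes no rounding)."""
    return len(set(implied_totals(allocations_tib, schedule))) <= 1


observed = [50, 50, 150]  # first three allocations from the review, in TiB
totals = implied_totals(observed)

for tranche, total in zip(observed, totals):
    print(f"{tranche} TiB implies a total of {float(total):.1f} TiB")
print("fits declared schedule:", fits_schedule(observed))
```

The three tranches imply totals of 1000 TiB, about 333.3 TiB, and 500 TiB, so no single total is consistent with all of them. A schedule-conforming sequence such as 50, 150, 300, 500 TiB would imply 1000 TiB at every step.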
Example 7 - OpenDataLab The allocator asked additional questions, clarified inconsistencies on an ongoing basis, and conducted frequent reports.
3 out of 8 SPs have a retrieval rate below 75%.
Overall, this allocator asks many questions, conducts thorough customer analysis, and runs regular CID reports.
Thank you for the review.
Application: v5 Notary Allocator Application: Open Public Dataset Pathway
Latest compliance report: Compliance Report - 2024-10-14 01:22:58
List of clients (chronologically):
web3eye.io After the initial 50TiB was granted, we closed the application due to the client's inactivity. Total DC granted: 50TiB
web3.storage The client asked for 2PiB, which was granted. Total DC granted: 2PiB
K12 International Education Platform After the initial 50TiB, issues were pointed out to the client, who agreed to improve. After a second test allocation of 50TiB, the client proved not to be diligent, and the application was closed. Total DC granted: 100TiB
zinc15 After the initial allocation of 50TiB, we asked the customer additional questions about the dataset due to the introduction of enhanced data verification policies. The customer never responded to the questions and was uncooperative, so the application was not renewed. Total DC granted: 50TiB
VshareCloud After the initial 50TiB, we opened a discussion with the client and established that this client didn't meet the criteria for the open pathway. After further clarification, we decided not to extend the cooperation, with the client's full understanding. Total DC granted: 50TiB
Common Crawl This is an active client we are working with. Total DC granted: 250TiB
OpenDataLab This is an active client we are working with. Total DC granted: 2.5PiB
For each client, we carry out a thorough analysis and ask additional questions to clarify inconsistencies in the application and to detail the data preparation process, which also shows how well oriented the client is. Each client goes through the KYC process using kyc.allocator.tech. Non-compliant applications are rejected and closed (a few examples: 1, 2, 3). After positive verification, clients receive allocations according to the allocator application (5%, 15%, 30%, 50% of DC). During cooperation with clients, we regularly check the retrieval rate through CID reports, analyze the locations of SPs on an ongoing basis, compare the list of SPs the client uses with the list provided in the form, and clarify any discrepancies with the client.