filecoin-project / Allocator-Governance


Community Diligence Review of IPFSTT Allocator #9

Closed · filecoin-watchdog closed this issue 3 months ago

filecoin-watchdog commented 4 months ago

Review of Top Value Allocations from @nicelove666 Allocator Application: https://github.com/filecoin-project/notary-governance/issues/1006

First example: DataCap was given to: https://github.com/nicelove666/Allocator-Pathway-IPFSTT/issues/3

This client was given 1PiB (100% of requested weekly amount) of DataCap, instead of 50% of their request as stated in their allocator application. The Allocator later claimed it was a mistake and asked for DataCap to be removed. Asking for more details from @galen-mcandrew on removal

filecoin-watchdog commented 4 months ago

Second example: https://github.com/nicelove666/Allocator-Pathway-IPFSTT/issues/10 Public Open Dataset - key compliance requirement: Retrievability

1st point) This time the allocator followed their allocation schedule better, giving 10% of the amount requested (100 TiB). However, their application said they would give 50%, so they actually gave significantly less.

2nd point) No sign of KYC or KYB of the client or the dataset, as described in the allocator application.

3rd point) Client said these were the SPs:

- f02951064, ZheJiang-JinHua
- f02984282, HeNan-ZhengZhou
- f0122215, ShanDong-QingDao
- f0427989, ShanDong-QingDao
- f02942808, GuangDong-ShenZhen
- f02894875, JiangXi-FuZhou

Actual data storage report: https://check.allocator.tech/report/Origin-Storage-IO/future-storage/issues/5/1715259157530.md

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
|---|---|---|---|---|---|
| f02984282 | Zhengzhou, Henan, CN (China Mobile Communications Group Co., Ltd.) | 37.50 TiB | 1.97% | 37.50 TiB | 0.00% |
| f02894875 | Nanchang, Jiangxi, CN (CHINA UNICOM China169 Backbone) | 647.97 TiB | 34.12% | 647.06 TiB | 0.14% |
| f0427989 | Qingdao, Shandong, CN (CHINA UNICOM China169 Backbone) | 329.13 TiB | 17.33% | 329.13 TiB | 0.00% |
| f02200472 | Chengdu, Sichuan, CN (CHINANET SiChuan Telecom Internet Data Center) | 35.22 TiB | 1.85% | 35.22 TiB | 0.00% |
| f02942808 | Jiangmen, Guangdong, CN (CHINANET-BACKBONE) | 303.41 TiB | 15.98% | 303.41 TiB | 0.00% |
| f02951064 | Hangzhou, Zhejiang, CN (JINHUA, ZHEJIANG Province, P.R.China) | 141.81 TiB | 7.47% | 141.81 TiB | 0.00% |
| f01025366 | Qingdao, Shandong, CN (Qingdao, Shandong Province, P.R.China) | 404.00 TiB | 21.27% | 404.00 TiB | 0.00% |

Most SP IDs taking deals matched the report, but f01025366, holding 21% of the stored data, was not on the original list.

Additional diligence is needed to confirm the entities and actual storage locations. The allocator appears to have done an initial review of the SPs and asked about retrievals, but never followed up.
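The cross-check described above can be automated. The sketch below, using the figures from the report quoted earlier, flags providers that were not on the client's disclosed list and providers whose share exceeds the 25% cap the allocator cites later in this thread (variable names are illustrative, not part of any official tooling):

```python
# SPs the client disclosed in the application issue
disclosed = {"f02951064", "f02984282", "f0122215",
             "f0427989", "f02942808", "f02894875"}

# (provider, percentage of total deals sealed) from the
# check.allocator.tech report quoted above
report = {
    "f02984282": 1.97, "f02894875": 34.12, "f0427989": 17.33,
    "f02200472": 1.85, "f02942808": 15.98, "f02951064": 7.47,
    "f01025366": 21.27,
}

undisclosed = sorted(sp for sp in report if sp not in disclosed)
over_cap = sorted(sp for sp, pct in report.items() if pct > 25.0)

print("undisclosed SPs:", undisclosed)  # ['f01025366', 'f02200472']
print("SPs over 25% share:", over_cap)  # ['f02894875']
```

Note that this check also surfaces f02200472, a second undisclosed SP that the review above does not call out, presumably because its share is under 2%.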

4th point) The second, third, and fourth allocations were awarded to this client. However, per the Spark dashboard, all SPs are either unavailable or show 0% retrievability.

The Allocator showed no sign of diligence after the 1st allocation, yet gave the client the 2nd, 3rd, and 4th allocations, totaling 3.6 PiB.

nicelove666 commented 4 months ago

First, regarding the 1 PiB issue, I have already provided an explanation. In v3.1, if a customer requested 1 PiB as their weekly allocation, they would only receive 512 TiB. I believe that even in v5, if a client requests 1 PiB, they would still only receive 512 TiB. Furthermore, at that time, the signature website could not modify the data volume. To address this, we requested closure of the 1 PiB allocation, and in the subsequent three allocations approved only 100 TiB, 50 TiB, and 1 TiB.
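The first-tranche behavior both sides describe (50% of the requested weekly amount, so a 1 PiB request yields 512 TiB) can be sketched in a few lines. The constants and function name here are illustrative, not part of any allocation tooling:

```python
# Binary units: 1 TiB = 2**40 bytes, 1 PiB = 1024 TiB
TIB = 2**40
PIB = 1024 * TIB

def first_tranche(weekly_request_bytes: int) -> int:
    """First allocation is 50% of the requested weekly amount,
    per the allocator application cited in this review."""
    return weekly_request_bytes // 2

print(first_tranche(1 * PIB) // TIB, "TiB")  # 512 TiB
```

Under this rule, granting the full 1 PiB in the first example above was twice what the application permitted.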

*(screenshot: WX20240512-103127@2x)*

Second, regarding the geographical location issue: about 60% of the address locations are completely accurate, while the remaining 40% contain some errors. However, the country and province of each service provider's location are correct. The data centers may have been slightly relocated for policy or cost reasons, which I believe is understandable.

Third, regarding the addition of a service provider (SP): the client had already provided prior notification on GitHub.

*(screenshot: WX20240512-110600@2x)*

Fourth, regarding due diligence: we conducted multiple rounds of due diligence, especially after the introduction of the checkbot. We only proceeded to the next step after confirming that the customer's SPs and retrievals were in order.

*(screenshots: WX20240512-110625@2x, WX20240512-110639@2x, WX20240512-110659@2x, WX20240512-110713@2x)*

nicelove666 commented 4 months ago

Fifth, regarding retrieval (MD5 checksums of the retrieved files shown as comments):

```shell
boost retrieve -p=f02894875 --o=f02894875_fetch.car bafykbzaced34st7aerg454mq4hcggxmkkbsltow3awrssuqjhl2egz3mgri5u

boost retrieve -p=f03035656 --o=f03035656_fetch.car bafykbzaceb6fuqbaagd3l26inhlguqeubcd5yx2nmupqyq22kptutaus5aglg
# b1cd860b5ccf222b7962eb65913abcae

boost retrieve -p=f0427989 --o=f0427989_fetch.car bafykbzaceb45tg4xp7rof4ys4xczkfuqrrkcxeguw3lxssxs3qv6vdp5avphe
# e9ef874d6eaa128ecf55609835f22982

boost retrieve -p=f01025366 --o=f01025366_fetch.car bafykbzacebxnhqluviml2tgj5kj3uo3nr3zyxqwavaqslqo3ki4o6cdasrcjk
# 546d41ee8529bf806635810149e6a537

boost retrieve -p=f02951213 --o=f02951213_fetch.car bafykbzacec5f5a5rdt35wm6derns2r5xj22wxttqzqt54ci7igw33rvo6r6ec
# bf9d0aadb6d4420c63a16b02d83a3657

boost retrieve -p=f03074163 --o=f03074163_fetch.car bafykbzacedmsx26kcfuminth5rlgvhxkyooabzvxtuh5xykrfefiusvmnkgfs
# 264c6a83b0dcbaba85138f6deae28270
```
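Manual spot checks like the ones above can be batched. The sketch below wraps the same `boost retrieve` invocation in a loop over (provider, payload CID) pairs and records which providers served the data; it assumes the `boost` CLI is installed, and the injectable `run` callable is a testing convenience, not part of boost:

```python
import subprocess
from typing import Callable, Dict, List, Tuple

def check_retrievals(
    deals: List[Tuple[str, str]],
    run: Callable[[List[str]], int] = lambda cmd: subprocess.run(cmd).returncode,
) -> Dict[str, bool]:
    """Attempt a boost retrieval for each (provider, payload CID)
    pair; a provider passes if the command exits with code 0."""
    results: Dict[str, bool] = {}
    for provider, cid in deals:
        cmd = ["boost", "retrieve",
               f"-p={provider}", f"--o={provider}_fetch.car", cid]
        results[provider] = (run(cmd) == 0)
    return results
```

A checker like this only proves Boost retrievability on demand; it does not reproduce Spark's sampled, ongoing measurements, which is the gap the watchdog raises below.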

Finally, we allocated the 5 PiB of DC to 4 clients: 3 companies and 1 individual. This is sufficiently diverse and complies with our allocation rules.

nicelove666 commented 3 months ago

We attended three notary meetings in a row and spoke at two of them. Some SPs can already be retrieved successfully via Spark. We hope to contact the Spark team for further solutions.

nicelove666 commented 3 months ago

https://github.com/filecoin-station/spark/issues/74

galen-mcandrew commented 3 months ago

Based on an additional compliance review, it appears this allocator is attempting to work with public open dataset clients.

However, the data associated with this pathway cannot currently be retrieved at scale, and retrieval testing is currently noncompliant.

As a reminder, the allocator team is responsible for verifying, supporting, and intervening with their clients. If a client is NOT providing accurate deal-making info (such as incomplete or inaccurate SP details) or making deals with noncompliant unretrievable SPs, then the allocator needs to intervene and require client updates before more DataCap should be awarded.

Before we will submit a request for more DataCap to this allocator, please verify that you will instruct, support, and require your clients to work with retrievable storage providers.

@nicelove666 can you verify that you will enforce retrievability requirements, such as through Spark? Please reply here with acknowledgement and any additional details for our review.

nicelove666 commented 3 months ago

Dear Galen @galen-mcandrew ,

We confirm that we will guide, support, and require our clients to collaborate with retrievable SPs to ensure successful data storage and retrieval.

Our allocator supports three main types of applications: public datasets, enterprise clients, and individual clients. In the first round, we received applications from individuals and public datasets, and in the next round, we will support applications from enterprise clients.

Due to the early launch of our allocator, before Spark was introduced, we primarily used Boost and Lassie for data retrieval, and the retrieved data looked good at the time. However, after the emergence of Spark, we found that the success rate on this platform was relatively low. We have communicated with the Spark team (https://github.com/filecoin-station/spark/issues/74) and are actively seeking solutions.

Going forward, we will focus on data retrieval and require clients to provide more accurate transaction information. If the transaction information is inaccurate, we will intervene and request the client to update it. Only after the information is updated will we consider granting more DataCap.

I will closely monitor the SP's disclosures and retrieval situation. If you have any guidance, please contact us at any time.

nicelove666 commented 3 months ago
*(screenshot: WX20240611-094208@2x)*

galen-mcandrew commented 3 months ago

Thank you for the confirmation and update! Looking forward to seeing the continued diligence & onboarding through this pathway.

We will be requesting 10PiB of DataCap for this allocator to increase runway and scale.

nicelove666 commented 3 months ago

Hey, dear Galen, thanks for your reply, support, and help! Through our efforts, the number of SPs supporting Spark has increased by one; there are now two, f02951213 and f02894875. Moving forward, we will focus on SPs' retrieval performance on Spark, and I believe we can do a good job! Our communication with the Spark team will mainly be tracked at https://github.com/filecoin-station/spark/issues/74; if anything important comes up in phone communication, we will also synchronize it there.

filecoin-watchdog commented 3 months ago

@nicelove666 looks like you gave a brand-new GitHub ID 1.75 PiB over 4 days, with no retrievals on most SPs after each allocation. https://github.com/nicelove666/Allocator-Pathway-IPFSTT/issues/27

Can you explain why you keep giving DataCap to non-retrievable SPs? cc @galen-mcandrew

nicelove666 commented 3 months ago

@filecoin-watchdog It's been two years, and you've left messages on nearly all of our LDNs (over 20). It seems you are particularly concerned about us; thanks for your attention. Given the limitations of written expression, we can have an in-depth discussion at next week's meeting.

  1. The 1.75 PiB was released in three rounds: 256 TiB in the first round, 512 TiB in the second, and 1 PiB in the third. Below is the Allocation Tranche Schedule we submitted when applying, which we are strictly following.

     *(screenshot: WX20240620-175108@2x)*
  2. Before approving each next round, we reviewed the client's data, and everything looked good: the disclosed SPs match the final cooperating SPs, and no SP's share exceeds 25%. We manually checked the SPs' retrieval rates, and the data stands up to scrutiny. The SPs do not use VPNs and are distributed across two major continents.

  3. All SPs support Boost and Lassie retrieval. You are welcome to query our CIDs on filecoin.tools; retrievals from our cooperating SPs return correctly, proving that they all support retrieval.

  4. Some SPs support Spark retrieval, and you can see the data; the data for the other SPs is being improved. Regarding Spark, we have had detailed discussions on Slack and are awaiting your Spark data.

Finally, we are also actively following you. Of this round's 10 PiB, we used 1.75 PiB and you used 1.95 PiB. Could you please explain how your 1.95 PiB was allocated?

  1. We did not see an update on your 1.95 PiB at https://github.com/cryptowhizzard/Fil-A-2/issues. Can you explain it? @filecoin-watchdog @galen-mcandrew

     *(screenshot: WX20240620-181458@2x)*