
Community Diligence Review of RFfil Allocator #11

Closed filecoin-watchdog closed 4 months ago

filecoin-watchdog commented 6 months ago

Review of Top Value Allocations from @MikeH1999 Allocator Application: https://github.com/filecoin-project/notary-governance/issues/1054

First example: DataCap was given to: https://github.com/MikeH1999/RFfil/issues/3

1st point) 1 PiB of DataCap was given to the client, although the allocator stated in their application that the first allocation would be 50 TiB. The allocator claimed it was a mistake and asked for the DataCap to be removed. TBD: @galen-mcandrew can determine further.

It looks like most of the DataCap was used in deals: https://check.allocator.tech/report/MikeH1999/RFfil/issues/3/1715343842408.md

filecoin-watchdog commented 6 months ago

Second example: DataCap was given to: https://github.com/MikeH1999/RFfil/issues/6

1st point) The allocation schedule was 50 TiB, then 500 TiB for each of the 2nd, 3rd, 4th, and 5th allocations. The allocator followed the tranche schedule as stated.

2nd point) The allocator appears to have completed KYC or KYB of the client or dataset via email screenshots in the application.

3rd point) The client listed the following SPs but named no entity, and the allocator did not comment on this:
- f02519100 - Guangzhou
- f02215209 - California
- f02770938 - Hong Kong
- f02093396 - Singapore
- f02259777 - Singapore

Actual data storage report: https://check.allocator.tech/report/MikeH1999/RFfil/issues/6/1715342901387.md

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
| --- | --- | --- | --- | --- | --- |
| f02237006 | Singapore, Singapore, SG (Alibaba (US) Technology Co., Ltd.) | 414.44 TiB | 48.74% | 414.44 TiB | 0.00% |
| f03034045 | Tokyo, Tokyo, JP (MULTACOM CORPORATION) | 448.00 GiB | 0.05% | 448.00 GiB | 0.00% |
| f02215209 | Los Angeles, California, US (NetLab Global) | 434.66 TiB | 51.12% | 434.66 TiB | 0.00% |
| f02519100 | Hong Kong, Hong Kong, HK (OneAsia Network Limited) | 672.00 GiB | 0.08% | 672.00 GiB | 0.00% |
| f02770938 | Hong Kong, Hong Kong, HK (OneAsia Network Limited) | 160.00 GiB | 0.02% | 160.00 GiB | 0.00% |

Three of the SP IDs taking deals match the client's list per the report. The other SPs in the report were not disclosed, and their details are unknown. Additional diligence is needed to confirm the entity and the actual storage locations.

4th point) Five allocations were awarded to this client. However, per the Spark dashboard, all SPs are either not available or have 0% retrievability.

The allocator showed no sign of diligence after the 1st allocation and gave the client the 2nd, 3rd, 4th, and 5th allocations, totaling 2 PiB.

MikeH1999 commented 6 months ago

I'll explain briefly. As I understand it, the stfil issue left a large number of SPs unsupervised, so they terminated all sectors, e.g. f02519100, f02770938, and f03034045.

MikeH1999 commented 6 months ago

If you have any other questions, please leave a message

Kevin-FF-USA commented 5 months ago

Hi @MikeH1999

On the next Fil+ Allocator meeting we will be going over each refill application. Wanted to ensure you were tracking the review discussion taking place in https://github.com/filecoin-project/Allocator-Governance/issues/11

If your schedule allows, I recommend coming to the May 28th meeting to answer and discuss the issues raised about the recent distributions. This will let you address them faster; alternatively, the written discussion can continue in the Allocator Governance issue.

Warmly, -Kevin https://calendar.google.com/calendar/embed?src=c_k1gkfoom17g0j8c6bam6uf43j0%40group.calendar.google.com&ctz=America%2FLos_Angeles

MikeH1999 commented 5 months ago

Having reviewed the contents of the last meeting and the information I have received here, I would like to offer a brief explanation:

> However, per the Spark dashboard, all SPs are either not available or have 0% retrievability.

This is because the stfil team was arrested by the Chinese police, which left the large number of SPs managed by that team unmaintained. No suitable technical team could be found to take over on short notice, so many SP owners were forced to terminate their sectors, e.g. f02519100, f02770938, f03034045, and f02237006. Some did find a technical team to take over, but those teams use different technical solutions, which ultimately leaves the data unretrievable.

> The allocator showed no sign of diligence after the 1st allocation and gave the client the 2nd, 3rd, 4th, and 5th allocations, totaling 2 PiB.

Regarding this question: since the CID checker bot was still under development at that time, I used a third-party website to check the data allocation before assigning DataCap (see https://github.com/MikeH1999/RFfil/issues/6#issuecomment-2033390727 and https://github.com/MikeH1999/RFfil/issues/6#issuecomment-2041287854), and after assigning it I would retrieve the data randomly (see https://github.com/MikeH1999/RFfil/issues/6#issuecomment-2028622791).
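(For illustration, a minimal sketch of what such a random retrieval spot-check could look like; this is not the allocator's actual tool. The CID below is a placeholder for payload CIDs taken from the client's deal list, and a public IPFS gateway stands in for whatever retrieval path was actually used.)

```ts
// Hypothetical spot-check: pick a random payload CID from the client's
// deal list and ask a public IPFS gateway whether it can serve the data.
const payloadCids: string[] = [
  "bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi", // placeholder
];

const cid = payloadCids[Math.floor(Math.random() * payloadCids.length)];
const res = await fetch(`https://ipfs.io/ipfs/${cid}`, { method: "HEAD" });
console.log(cid, res.ok ? "retrievable" : `not retrievable (HTTP ${res.status})`);
```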

Since joining as an allocator, I've been actively following the meetings and the tooling. I have raised a number of bugs in the tooling, as follows:
- https://github.com/fidlabs/allocator-tooling/issues/54
- https://github.com/fidlabs/allocator-tooling/issues/50
- https://github.com/fidlabs/allocator-tooling/issues/47
- https://github.com/fidlabs/allocator-tooling/issues/24
- https://github.com/fidlabs/allocator-tooling/issues/23

If there's anything we're doing wrong or need to improve, feel free to leave a comment, thanks! @filecoin-watchdog @Kevin-FF-USA @galen-mcandrew

galen-mcandrew commented 5 months ago

Based on a further diligence review, this allocator pathway is partially in compliance with their application.

Specifically:

Given this mixed review, we are requesting that the allocator verify that they will uphold all aspects and requirements of their initial application. If so, we will request an additional 2.5 PiB of DataCap from RKH, to allow this allocator to show increased diligence and alignment.

@MikeH1999 can you verify that you will enforce program and allocator requirements? (for example: public diligence, tranche schedules, and public scale retrievability like Spark). Please reply here with acknowledgement and any additional details for our review.

MikeH1999 commented 5 months ago

@galen-mcandrew Thank you for reviewing

I've been following my plan for auditing as well as DataCap issuance; here are some of my notes.

Since the CID checker bot was still under development at that time, I used a third-party website to check the data allocation before assigning DataCap (see https://github.com/MikeH1999/RFfil/issues/6#issuecomment-2033390727 and https://github.com/MikeH1999/RFfil/issues/6#issuecomment-2041287854), and after assigning it I would retrieve the data randomly (see https://github.com/MikeH1999/RFfil/issues/6#issuecomment-2028622791).

I have also found that, due to missing documentation and functionality on the Spark technical team's side, many SPs show a 0% retrieval rate on the Spark dashboard. I see that there are teams working with Spark to deal with this problem.

If there are any other questions, please leave a message

MikeH1999 commented 5 months ago

https://github.com/MikeH1999/RFfil/issues/31

This is a new client of mine that has asked its SPs to support Spark before the review for the next round.

MikeH1999 commented 5 months ago

@galen-mcandrew Hi, are there any other questions?

MikeH1999 commented 5 months ago

> MikeH1999/RFfil#31
>
> This is a new client of mine that has asked its SPs to support Spark before the review for the next round.

This client is using the Venus team's tool, and the Venus team will update the tool to support Spark within one month: https://github.com/ipfs-force-community/droplet/issues/530

So for the next round, I'll wait to see the retrieval rate improve before approving it.

MikeH1999 commented 4 months ago

@galen-mcandrew Hi, here is a quick note on some issues I recently found while researching how to support Spark retrieval:

https://github.com/space-meridian/roadmap/issues/115

[screenshot]

https://filecoinproject.slack.com/archives/C03S6LXSRB8/p1719330127516309?thread_ts=1719325736.366669&cid=C03S6LXSRB8

[screenshot]

If that's the case, why do v5 approvals need to be based on retrieval data from Spark?

galen-mcandrew commented 4 months ago

It sounds like the major distinction here is whether the clients/data preparers and SPs are using the previous market actor (what you refer to as v4?) or the new/different DDO flow (v5?).

If there is a different scale tool being built that measures and reports retrieval for DDO deal-making, we would love to see it and work to include it in our compliance reviews!

MikeH1999 commented 4 months ago

@galen-mcandrew @Kevin-FF-USA One more thing. The Spark tech team has published a list of DataCap wallets that currently support Spark retrieval. I didn't find any LDN wallets belonging to any allocator in this list, which also suggests that retrieval under the allocator pathway is not supported for the time being.

Testing has shown that the wallets mentioned in the list above have stored some data. As of now, for f02822222, you can see successful retrievals. [screenshot]

Here's a transcript of the communication on Slack.

So if a new miner is sealing data under the allocator pathway, even though its Boost is configured correctly, successful retrievals are not currently visible on the Spark public dashboard.

Thanks for checking it out!

patrickwoodhead commented 4 months ago

To give more info about why it might take us some time to get Spark compatible with v5 allocator pathways and DDO:

Previously, we were able to get the list of FIL+ LDN DataCap clients from the datacapstats API and then use this list to filter all Storage Market deals in f05 down to the FIL+ LDN deals to test. We were then able to use the label field in these deals to get hold of the payload CIDs that Spark uses for its retrieval tests.
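(A rough sketch of that previous pipeline, for illustration only. The datacapstats endpoint and response shape are assumptions, and the empty `deals` array stands in for the f05 state returned by `Filecoin.StateMarketDeals`.)

```ts
// 1. Fetch the FIL+ LDN verified-client list (endpoint and shape assumed).
const res = await fetch("https://api.datacapstats.io/api/getVerifiedClients");
const body = await res.json();
const clients = new Set<string>(body.data.map((c: { address: string }) => c.address));

// 2. Filter f05 storage-market deals down to verified deals from those clients.
type MarketDeal = {
  Proposal: { Client: string; VerifiedDeal: boolean; Label?: string };
};
const deals: MarketDeal[] = []; // populate from Filecoin.StateMarketDeals

// 3. The deal Label carried the payload CID that Spark retrieved against.
const payloadCids = deals
  .filter((d) => d.Proposal.VerifiedDeal && clients.has(d.Proposal.Client))
  .map((d) => d.Proposal.Label)
  .filter((l): l is string => typeof l === "string");

console.log(`testable payload CIDs: ${payloadCids.length}`);
```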

With DDO, the label field is no longer giving us the payload CIDs that we can test.

Our current top priority is to work out how we can learn the payload CIDs inside a Piece, so that we can make retrievals for data in new deals. One way of doing this is to build a reverse IPNI lookup API that gives you the payload CIDs for a given Piece CID. This work item requires coordination across Boost, IPNI, and Spark and is in the planning and funding phase.
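(Since that API is still in planning, the following shape is purely invented for illustration; the host and route do not exist.)

```ts
// Hypothetical reverse IPNI lookup: map a piece CID to the payload CIDs
// it contains, which is what Spark needs to build retrieval tasks.
async function payloadCidsForPiece(pieceCid: string): Promise<string[]> {
  // Host and route invented for illustration only.
  const res = await fetch(`https://reverse-ipni.example/pieces/${pieceCid}/payload-cids`);
  if (!res.ok) throw new Error(`reverse lookup failed: HTTP ${res.status}`);
  return res.json();
}
```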

Another interim way that we are exploring to achieve this is for Spark to listen to the gossip of IPNI advertisements itself and pipe the relevant records into the Spark task DB.
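(A hedged sketch of that interim approach, assuming js-libp2p v1.x with the gossipsub service; option names vary between libp2p versions, and peer bootstrap plus fetching of the announced advertisement chains are omitted.)

```ts
import { createLibp2p } from "libp2p";
import { tcp } from "@libp2p/tcp";
import { noise } from "@chainsafe/libp2p-noise";
import { yamux } from "@chainsafe/libp2p-yamux";
import { gossipsub } from "@chainsafe/libp2p-gossipsub";

// "/indexer/ingest/mainnet" is the gossipsub topic IPNI announcements are
// published on; the rest of the config is minimal wiring.
const node = await createLibp2p({
  transports: [tcp()],
  connectionEncryption: [noise()],
  streamMuxers: [yamux()],
  services: { pubsub: gossipsub() },
});

node.services.pubsub.subscribe("/indexer/ingest/mainnet");
node.services.pubsub.addEventListener("message", (evt) => {
  // Each message announces a provider's new advertisement CID. A real
  // pipeline would fetch the advertisement chain and upsert the relevant
  // piece/payload records into the Spark task DB; here we just log it.
  console.log("IPNI announcement:", evt.detail.data.byteLength, "bytes");
});
// Note: without dialing indexer peers (bootstrap omitted), no messages arrive.
```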

We also need to set Spark up to listen to the DDO actor events. This will let Spark update in real time, rather than having new deals added to the task list by a batch process.
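(A hedged sketch, assuming a Lotus node v1.26+ with the actor-events APIs enabled; the height window is arbitrary and decoding of the CBOR-encoded event entries is left out.)

```ts
// Poll a Lotus node for raw actor events in a height window. Under DDO,
// deal activity surfaces as events such as "deal-activated" (market actor)
// and "claim" (verified registry actor); see FIP-0083.
const resp = await fetch("http://127.0.0.1:1234/rpc/v1", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    jsonrpc: "2.0",
    id: 1,
    method: "Filecoin.GetActorEventsRaw",
    params: [{ fromHeight: 4000000, toHeight: 4000120 }], // arbitrary window
  }),
});
const { result } = await resp.json();
for (const evt of result ?? []) {
  console.log(JSON.stringify(evt)); // emitter, height, raw entries, etc.
}
```

A production version would presumably use Lotus's streaming actor-events API instead of polling, so new events arrive as tipsets land.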

We will give clearer timelines once we have more clarity on the effort needed to get payload CIDs from new deals into the Spark tasking DB.