Lind111 closed this issue 1 month ago
@Kevin-FF-USA @galen-mcandrew Hello, can you see this?
Allocator report: https://compliance.allocator.tech/report/f03014490/1722216245/report.md
After discussing the reason for the low SP retrieval rate with Mike and Miroslav Bajtoš, the SP has been contacted to change the relevant configuration. Let's see how it goes.
2.5PiBs for second round allocation from gov team.
2 PiBs given to https://github.com/Lind111/EF/issues/14
5 SPs provided upfront https://github.com/Lind111/EF/issues/14#issuecomment-2136784704
Showing 10 SPs taking deals - client asked for 7 copies - why more and more SPs? https://check.allocator.tech/report/Lind111/EF/issues/14/1722850373132.md
Retrievals continue low.
@Lind111 there is no clear information about Data Preparation and retrievability in any applicant information. Can you push clients to provide details for more prep information from these questions?
What is the transformation from the files available for download to what will be stored on Filecoin?
When we sample your deals, how will we be able to confirm that the data has come from the dataset?
How is the data transformed into deals for Filecoin?
When a deal is sampled for verification, how will we be able to confirm that it is part of this dataset? (how is it chunked into car files?)
Given a 32GB payload, what steps can an independent entity take to confirm it comes from the relevant upstream dataset?
We have successfully seen datasets be prepared for filecoin, ranging from internet archive, to wikipedia, to the chain itself. You can either store the metadata structure above the individual file chunks as is done by e.g. web3 storage, or you can have a separate well-advertised metadata layer via e.g. a website.
We want to see how a client could be able to make use of this dataset. Can you share details?
this could be a client script for how to iterate through / process over the data
this could be a web site allowing browsing / identification of specific pieces of data from the dataset as stored
this could be identification of clients making use of the data
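As one illustration of the kind of answer these questions are looking for, a client could publish a manifest that maps each source file to its hash and to the piece it was packed into. The sketch below is hypothetical: the manifest layout, the field names (`path`, `sha256`, `piece_cid`) and the sample values are assumptions for illustration, not a format used by any applicant or tool mentioned in this thread. It shows how an independent party might check that a file extracted from a sampled deal matches the upstream dataset.

```python
import hashlib

# Hypothetical manifest published by the client: one entry per source file.
# Field names and values are illustrative assumptions, not a real format.
MANIFEST = [
    {
        "path": "dataset/part-0001.csv",
        "sha256": hashlib.sha256(b"example upstream bytes").hexdigest(),
        "piece_cid": "<piece CID of the CAR file holding this file>",
    },
]


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def find_in_manifest(manifest, retrieved_bytes, piece_cid):
    """Return the manifest entry matching the retrieved file, or None.

    A match requires both the content hash and the piece identifier to
    agree with what the client published.
    """
    digest = sha256_hex(retrieved_bytes)
    for entry in manifest:
        if entry["sha256"] == digest and entry["piece_cid"] == piece_cid:
            return entry
    return None
```

A reviewer would download the same file from the upstream source, hash it, and expect `find_in_manifest` to return the entry for the sampled piece. A `None` result means the sampled data cannot be tied to the dataset through the manifest.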
5 SPs provided upfront Lind111/EF#14 (comment)
Showing 10 SPs taking deals - client asked for 7 copies
I've asked the client, but he hasn't responded yet. https://github.com/Lind111/EF/issues/14#issuecomment-2273104076
why more and more SPs? https://check.allocator.tech/report/Lind111/EF/issues/14/1722850373132.md
In this issue I stated upfront that any new SP must be announced in advance. Below are the comments in which the client announced new SPs in advance.
Retrievals continue low.
Below is the client's response. I will keep an eye on this client's retrieval rate.
@Lind111 there is no clear information about Data Preparation and retrievability in any applicant information. Can you push clients to provide details for more prep information from these questions?
What is the transformation from the files available for download to what will be stored on Filecoin?
When we sample your deals, how will we be able to confirm that the data has come from the dataset?
How is the data transformed into deals for Filecoin?
When a deal is sampled for verification, how will we be able to confirm that it is part of this dataset? (how is it chunked into car files?)
Given a 32GB payload, what steps can an independent entity take to confirm it comes from the relevant upstream dataset?
We have successfully seen datasets be prepared for filecoin, ranging from internet archive, to wikipedia, to the chain itself. You can either store the metadata structure above the individual file chunks as is done by e.g. web3 storage, or you can have a separate well-advertised metadata layer via e.g. a website.
We want to see how a client could be able to make use of this dataset. Can you share details?
this could be a client script for how to iterate through / process over the data
this could be a web site allowing browsing / identification of specific pieces of data from the dataset as stored
this could be identification of clients making use of the data
This program was recently released and I will refer to it in future applications
Retrieval rates have been increasing between the data from two weeks ago and now:
https://compliance.allocator.tech/report/f03014490/1724284934/report.md
@Kevin-FF-USA @filecoin-watchdog @galen-mcandrew Hi, are there any other questions here?
Based on an additional compliance review, this allocator is attempting to work with public open dataset clients, and has 2 primary clients with large allocations (https://github.com/Lind111/EF/issues/9 & https://github.com/Lind111/EF/issues/14). However, the data associated with this pathway is not currently able to be retrieved at scale (starting at only around 6.8% and rising to 17% overall in July). One client especially remains very low, at only 2.43% retrieval for public open data.
There is evidence of the allocator asking additional questions of the clients, and also checking the compliance report before giving subsequent allocations. For example, the last allocation to client 14 was 6/25. What additional steps are you taking to enforce compliance requirements?
In the initial application, it was stated that you do not currently support clients and SPs using VPN services (question 28). Can you please provide more details about how you are investigating and verifying SP locations?
Can you provide any additional details about how you are selecting and supporting clients to onboard real data to Filecoin? For example, how are you working with existing or new clients regarding their data preparation, as mentioned above?
Based on the increased allocator diligence and retrieval rates (overall still low, but recent client interactions showing improvement), and contingent on replies to the above questions, we would like to request 5PiB of additional DataCap.
For example, the last allocation to client 14 was 6/25. What additional steps are you taking to enforce compliance requirements?
At that time, I retrieved data from some of the nodes for comparison. The last allocation was made only once the data was consistent with the application and the retrieval rate was continuing to increase: https://github.com/Lind111/EF/issues/14#issuecomment-2188442959
In the initial application, it was stated that you do not currently support clients and SPs using VPN services (https://github.com/filecoin-project/notary-governance/issues/1056#issuecomment-1890621214). Can you please provide more details about how you are investigating and verifying SP locations?
I check whether the IP posted by each SP is located in the stated area, and I ask for geolocation documentation from SPs whose location is not confirmed this way. Here are some comments on the issues:
https://github.com/Lind111/EF/issues/14#issuecomment-2136737687 https://github.com/Lind111/EF/issues/14#issuecomment-2165055305
https://github.com/Lind111/EF/issues/9#issuecomment-2031211929 https://github.com/Lind111/EF/issues/9#issuecomment-2199090184
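The IP check described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `resolve_country` stands in for whatever geo-IP lookup is actually used (none is named in this thread), and the SP identifiers and addresses in the usage example are placeholders from documentation ranges, not real providers.

```python
from typing import Callable, Dict, List


def mismatched_sps(
    declared: Dict[str, str],
    sp_ips: Dict[str, str],
    resolve_country: Callable[[str], str],
) -> List[str]:
    """Return SP IDs that need follow-up.

    An SP is flagged when it has posted no IP, or when its IP resolves
    to a country other than the one it declared.
    """
    flagged = []
    for sp_id, country in declared.items():
        ip = sp_ips.get(sp_id)
        if ip is None or resolve_country(ip).upper() != country.upper():
            flagged.append(sp_id)
    return flagged


# Usage with a stand-in lookup table instead of a real geo-IP service.
fake_geoip = {"192.0.2.10": "CN", "198.51.100.7": "US"}
declared = {"sp-a": "CN", "sp-b": "KR", "sp-c": "HK"}
sp_ips = {"sp-a": "192.0.2.10", "sp-b": "198.51.100.7"}
needs_documents = mismatched_sps(declared, sp_ips, lambda ip: fake_geoip[ip])
print(needs_documents)
```

A geo-IP lookup alone cannot rule out a VPN or proxy, so the flagged SPs are the ones that would then be asked for geolocation documentation.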
Can you provide any additional details about how you are selecting and supporting clients to onboard real data to Filecoin? For example, how are you working with existing or new clients regarding their data preparation, as mentioned https://github.com/filecoin-project/Allocator-Governance/issues/80#issuecomment-2293009995?
I found that these questions come from #125, and I have looked through some of the mixed reviews of them. I will take some of these questions and ask them of new clients. Older clients already have experience with data processing, and their results look good.
Record of last review
Below is the current DataCap Application and the latest data available.
https://github.com/Lind111/EF/issues/14
https://check.allocator.tech/report/Lind111/EF/issues/14/1720427926061.md
Completed KYC verification.
The distribution of data is reviewed at each follow-up.
When SPs appear that were not approved in advance, the client is asked to explain why: https://github.com/Lind111/EF/issues/14#issuecomment-2165002591
SPs disclosed by clients:
f02827151 Shenzhen, Guangdong, CN
f02827694 Shenzhen, Guangdong, CN
f02948413 Chengdu, Sichuan, CN
f02980903 Seoul, Seoul, KR
f02826234
f01926635
f02826123
f02956383
f02901026
f02224274
f03079511 HK
f02327534 US
f02894855 China
Actual sealing list: f02224274 f03079511 f02827151 f02827694 f02948413 f02980903 f02826234 f01926635 f02826123 f02956383
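The comparison between the two lists above can be reproduced with a simple set difference. The snippet below is a minimal sketch that uses only the SP IDs quoted in this comment; the variable names are illustrative.

```python
# SP IDs disclosed by the client (from the list above).
disclosed = {
    "f02827151", "f02827694", "f02948413", "f02980903", "f02826234",
    "f01926635", "f02826123", "f02956383", "f02901026", "f02224274",
    "f03079511", "f02327534", "f02894855",
}

# SP IDs that actually sealed deals (from the list above).
sealing = {
    "f02224274", "f03079511", "f02827151", "f02827694", "f02948413",
    "f02980903", "f02826234", "f01926635", "f02826123", "f02956383",
}

# SPs that sealed without being disclosed in advance.
undisclosed = sorted(sealing - disclosed)
# SPs that were disclosed but have not sealed any deals.
not_yet_used = sorted(disclosed - sealing)

print(undisclosed)   # []
print(not_yet_used)  # ['f02327534', 'f02894855', 'f02901026']
```

An empty `undisclosed` list means every SP that sealed deals was announced beforehand. The 10 sealing SPs match the count shown in the report discussed earlier in the thread.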
https://github.com/Lind111/EF/issues/9 https://check.allocator.tech/report/Lind111/EF/issues/9/1720427895973.md
This is the client affected by the STFIL incident. The relevant SP checks were carried out after the client found new partner SPs.
The latest data shows that all of the new SPs support retrieval, except for the SPs affected by the STFIL incident.
Issues from the previous round have been noted and corrected, so more DataCap is requested.