filecoin-station / spark

💥 Storage Provider Retrieval Checker as a Filecoin Station Module 🛰️
https://filspark.com

SP cannot receive the retrieval request #74

Open nicelove666 opened 1 month ago

nicelove666 commented 1 month ago

In this bot report, https://check.allocator.tech/report/nicelove666/Allocator-Pathway-IPFSTT/issues/10/1717038967330.md, only f02951213 (located in Singapore) is able to receive the retrieval requests from SPARK, showing a retrieval rate of 78.22%. Other SPs from mainland China and Hong Kong are unable to receive the requests.

However, when we retrieve with lassie, the client recommended by PL, retrieval works and the data is normal.

bafykbzacecmmbhohkhcpam6w4dyrh7z7zzvlzg6exkoikmdik6vjzkaqysmpq

WechatIMG6131

Retrieval with boost also works, and the data is normal.

boost retrieve -p=f02894875 --o=f02894875_ret.car bafykbzacea5a4inawemc7jme6ea2zr4lmrzl46q4hn2hmp7t6hhyuym5csbes
11353ff

boost retrieve -p=f03035656 --o=f03035656_ret.car bafykbzacecmx5slbkgbnbavvgwwdy3j6zd2lcxbxbeatgfw6xkhp2ukrjqrfw 4a4cefe4d27d47aed5f9dab957eeb041

boost retrieve -p=f01025366 --o=f01025366_ret.car bafykbzacebtqozypibrtu32pgl2sgwa62iypfixarpma74vv5hzlucsnt56ps c101810b468d7532319cdb9630b11821

boost retrieve -p=f02951213 --o=f02951213_ret.car bafykbzaceajt3kphqehmbibqtm3ympsyaadvxtabgwyqmo2utehja3n3hqu 71eb319ee50029f6cb4573b397f9932c

boost retrieve -p=f03074163 --o=f03074163_ret.car bafykbzaceay3a756pifhdrp26gf5a5al3omkzwscxg27apya23hwydk4er7oa 3aaa5d607f562bd5a72f4642609b608b

boost retrieve -p=f02200472 --o=f02200472_ret.car bafykbzacecvws5gq3mo7umdv5ckax5fexfhs4offotk55m6feqhn36mhshyww
cb9a0c3e53c0f2d25131af6779316e96

boost retrieve -p=f03086293 --o=f03086293_ret.car bafykbzacecfdjrheqyxdj2i5x4pwya7kjx5xslf76m4z5j4suat73wfj22wqu ffe87127918241b78c012a6482a459ae

Untitled file 172ccaef56b4925982ab9c35ca5deba4 fa81a5ac5508830374188eb6071c9f77

Is it because of network or regional restrictions that the other SPs cannot receive retrieval requests from Spark?

bajtos commented 1 month ago

I am not sure how exactly the allocator report calculates the Spark Retrieval Success Rate. I think it's using the overall success rate for all checks performed by Spark, not a rate for this specific organisation or client.
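To make the distinction concrete, here is a toy sketch of how an overall success rate can differ from a per-miner rate. The records and the `success` field are made up for illustration; they are not real Spark data, and this is not necessarily how the allocator report or Spark computes its numbers.

```python
from collections import defaultdict


def success_rates(measurements):
    """Return (overall_rate, {miner_id: rate}) for a list of check records.

    Each record is a dict with a "minerId" and a hypothetical boolean
    "success" field.
    """
    totals = defaultdict(int)
    successes = defaultdict(int)
    for m in measurements:
        totals[m["minerId"]] += 1
        if m["success"]:
            successes[m["minerId"]] += 1
    per_miner = {mid: successes[mid] / totals[mid] for mid in totals}
    total_checks = sum(totals.values())
    overall = sum(successes.values()) / total_checks if total_checks else 0.0
    return overall, per_miner


# Made-up sample: one miner passes every check, another fails its only check.
sample = [
    {"minerId": "f0AAA", "success": True},
    {"minerId": "f0AAA", "success": True},
    {"minerId": "f0AAA", "success": True},
    {"minerId": "f0BBB", "success": False},
]
overall, per_miner = success_rates(sample)
```

In this sample the overall rate is dominated by the miner with the most checks, so a report that shows only the overall figure would hide the miner that fails every check.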

We built a tool that allows anybody to run a Spark retrieval check locally from their machine, see https://filspark.com/troubleshooting-miner-score#block-7e10034eb61e417b9a69383b1e15018f

When I run the check for CID bafykbzacecmmbhohkhcpam6w4dyrh7z7zzvlzg6exkoikmdik6vjzkaqysmpq and miner f02984282, it fails because the CID is not advertised to IPNI.

Calling Filecoin JSON-RPC to get PeerId of miner f02984282
Found peer id: 12D3KooWJ29pkCodQFHyri5DPBtzjYGeD8hDSxQniyTbr8ZU3fTc
Querying IPNI to find retrieval providers for bafykbzacecmmbhohkhcpam6w4dyrh7z7zzvlzg6exkoikmdik6vjzkaqysmpq
IPNI query failed, HTTP response: 404 
Measurement: {
  cid: "bafykbzacecmmbhohkhcpam6w4dyrh7z7zzvlzg6exkoikmdik6vjzkaqysmpq",
  minerId: "f02984282",
  indexerResult: "ERROR_404",
  statusCode: null,
  byteLength: 0,
  providerId: "12D3KooWJ29pkCodQFHyri5DPBtzjYGeD8hDSxQniyTbr8ZU3fTc"
}

In our experience so far, when an SP has a 0% retrieval success rate, the most likely cause is misconfigured advertising to IPNI.
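The two steps visible in the log above (look up the miner's PeerId over Filecoin JSON-RPC, then ask IPNI who provides the CID) can be sketched roughly as follows. This is a minimal sketch, not Spark's actual code: the function names are hypothetical, the RPC endpoint is left to the caller, and of the result labels only `ERROR_404` appears in the log; `PROVIDER_NOT_LISTED` and the generic `ERROR_<status>` labels are assumptions made for this example.

```python
import json
import urllib.error
import urllib.request

IPNI_URL = "https://cid.contact"  # the indexer discussed in this thread


def build_miner_info_request(miner_id):
    """JSON-RPC payload for Filecoin.StateMinerInfo (null tipset key = chain head)."""
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "Filecoin.StateMinerInfo",
        "params": [miner_id, None],
    }


def extract_peer_id(rpc_response):
    """Pull the PeerId out of a decoded StateMinerInfo response."""
    return rpc_response["result"]["PeerId"]


def classify_ipni_response(status, body, peer_id):
    """Map an IPNI /cid/<cid> response to a result label.

    Only "ERROR_404" is taken from the log above; the other labels are
    hypothetical names chosen for this sketch.
    """
    if status == 404:
        return "ERROR_404"
    if status != 200:
        return f"ERROR_{status}"
    doc = json.loads(body)
    for mh in doc.get("MultihashResults") or []:
        for pr in mh.get("ProviderResults") or []:
            if (pr.get("Provider") or {}).get("ID") == peer_id:
                return "OK"
    # The CID is indexed, but not by the miner we are checking.
    return "PROVIDER_NOT_LISTED"


def query_ipni(cid, peer_id, base_url=IPNI_URL):
    """Perform the live lookup; requires network access."""
    try:
        with urllib.request.urlopen(f"{base_url}/cid/{cid}", timeout=30) as resp:
            return classify_ipni_response(resp.status, resp.read(), peer_id)
    except urllib.error.HTTPError as err:
        return classify_ipni_response(err.code, b"", peer_id)
```

A 404 from the indexer means the check stops before any retrieval request is sent to the SP, which would explain why an SP sees no incoming requests from Spark even though direct retrieval works.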

nicelove666 commented 4 weeks ago

Almost all of the SPs support direct retrieval using boost and lassie, except for a few that privately deleted their unsealed files (cooperation with them has stopped). Two SPs, f02951213 and f02894875, have retrieval statistics on Spark, and the Network Indexer shows normal index data for them. The other nodes can be retrieved from directly, but they have no index data on the Network Indexer. As shown in the figure, for these nodes the Network Indexer page of the Boost web UI displays "Error: Fetching provider info from cid.contact: Failed to fetch provider info for 12D3KooWQ46kGufuQz8QAotLg17jQ2TS5SkmEsQNR89mk2UErZYo with status code 404", and the corresponding SP data cannot be found at https://cid.contact/providers either. This also prevents Spark from collecting retrieval statistics through the Network Indexer. We're still trying to figure out why, because it's really weird to see different results for the same version.

WechatIMG1695 WechatIMG21587 WechatIMG21585
nicelove666 commented 4 weeks ago

If you make any progress, please let us know. We are actively testing multiple solutions. @bajtos

nicelove666 commented 3 weeks ago

We have observed that the SPs with a 0% retrieval success rate are not appearing in the list at https://cid.contact/providers. The Spark retrieval statistics bot requires fetching SP nodes and data indices from cid.contact before it can send retrieval requests to the SPs.
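A quick way to repeat this check for a set of SPs is sketched below. It assumes that https://cid.contact/providers returns a JSON array whose entries carry the peer ID under `AddrInfo.ID`; that shape, and the helper names, are assumptions of this sketch rather than something confirmed in this thread.

```python
import json
import urllib.request


def registered_peer_ids(providers_json):
    """Collect the peer IDs found in a /providers response body.

    Assumes each entry looks like {"AddrInfo": {"ID": "<peer id>", ...}, ...}.
    """
    entries = json.loads(providers_json)
    ids = {(entry.get("AddrInfo") or {}).get("ID") for entry in entries}
    ids.discard(None)
    return ids


def missing_from_indexer(sp_peer_ids, providers_json):
    """Return the {miner_id: peer_id} pairs that the indexer does not list."""
    known = registered_peer_ids(providers_json)
    return {miner: pid for miner, pid in sp_peer_ids.items() if pid not in known}


def fetch_providers(base_url="https://cid.contact"):
    """Download the provider list; requires network access."""
    with urllib.request.urlopen(f"{base_url}/providers", timeout=30) as resp:
        return resp.read()
```

For example, `missing_from_indexer({"f02984282": "12D3KooWJ29pkCodQFHyri5DPBtzjYGeD8hDSxQniyTbr8ZU3fTc"}, fetch_providers())` would return that pair if the peer ID is absent from the list and an empty dict otherwise.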

When reviewing the Boost source code, we did not find any behavior to register the SPs with cid.contact. The only action we identified was the invocation of the "https://cid.contact/ingest/announce" endpoint to publish the indices.

At the same time, when examining the storetheindex project's source code, we noted the presence of a register interface that does enroll the SPs with cid.contact.

Therefore, we hypothesize that the intended workflow is to actively scan the Filecoin network to identify the active SPs and then register them with cid.contact.

However, we have observed that many of the newer SPs are not being registered with cid.contact, resulting in Spark's inability to properly track their retrieval statistics.

Taking our own situation as an example: The majority of the SPs we collaborate with can be successfully queried through Boost and Lassie. However, Spark's retrieval statistics show a 0% success rate for all but a few SPs. When testing with Spark's retrieval checking tool, these SPs return the following error: "Querying IPNI to find retrieval providers for bafykbzaced43qccofqalrzhzdxhi4ok5nige7iyniwfjpukrs5pazfdcqkq32 IPNI query failed, HTTP response: 404".

Figure 1: The "https://cid.contact/providers" endpoint currently returns 289 SP entries, and only these SPs have the potential to be tracked by Spark. WechatIMG1714

Figure 2: Spark's retrieval statistics list shows a total of 63 SPs with a non-zero success rate. Furthermore, it appears that the SPs being tracked are predominantly older, as the newer SP addresses typically start with "f030". WechatIMG1715

Figure 3: In the storetheindex/server/ingest/handler.go source code, we do not see any automatic SP registration behavior when publishing to IPNI. There is only logic to check if the SP exists, and Boost only invokes the index publication endpoint. WechatIMG1716