filecoin-station / spark

💥 Storage Provider Retrieval Checker as a Filecoin Station Module 🛰️
https://filspark.com

Shorten Spark rounds to 20 minutes #50

Closed · bajtos closed this issue 5 months ago

bajtos commented 5 months ago
juliangruber commented 5 months ago

Updated the live contract

bajtos commented 5 months ago

There are two more parameters we need to review and tweak after changing the round length:

  • TASKS_PER_ROUND (currently 4000)
  • MAX_TASKS_PER_NODE (currently 60)

See https://github.com/filecoin-station/spark-api/blob/dbb6606d46dee2a12164f4b5cb901086dffdcdd5/lib/round-tracker.js#L11-L14

I decided to keep these values unchanged. The consequences:

  • We will test the same number of CIDs per round, which means we will test 3x more CIDs each hour/day/week.
  • We increase the load on SPs 3x: retrieval bandwidth per hour will grow from ~45MB to ~135MB and the number of requests from ~1.6m to ~5m per hour.

While 5m req/hour may seem high, the clients randomly choose one of 4000 CIDs to test, which means our network sends only ~0.34 req/sec/CID.
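[Editor's note] For illustration, here is the arithmetic behind those figures. The constant names mirror round-tracker.js; treating ~1.6m requests as the per-round volume (one round used to take an hour) is my own reading of the numbers above:

```js
// Parameters kept unchanged in spark-api's round-tracker.js:
const TASKS_PER_ROUND = 4000  // distinct CIDs tested per round
const MAX_TASKS_PER_NODE = 60 // tasks a single checker performs per round

// Rounds shrank from ~60 to ~20 minutes, so the same per-round volume now
// happens 3x per hour: ~1.6m requests/round -> ~4.8m (~5m) requests/hour.
const REQUESTS_PER_ROUND = 1_600_000
const requestsPerHour = REQUESTS_PER_ROUND * 3

// Spread across the 4000-CID task list, the per-CID rate stays modest:
const reqPerSecPerCid = requestsPerHour / 3600 / TASKS_PER_ROUND
console.log(reqPerSecPerCid.toFixed(2)) // ≈ 0.33, the ~0.34 req/sec/CID above
```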

juliangruber commented 5 months ago

0.34 req/sec/CID

This seems high. So in 3 seconds we will have tested the entire CID set?

bajtos commented 5 months ago

0.34 req/sec/CID

This seems high. So in 3 seconds we will have tested the entire CID set?

Sorry, now I see how my description is confusing.

We pick 4000 CIDs to test each round. Let's say you are a big SP with many deals and 100 of the CIDs you store end up on that list of 4000 CIDs.

34 req/sec (100 × 0.34) seems high, but if our random selection picks 100 of your CIDs for our list of 4000 CIDs, then it means you hold 100/4000 = 2.5% of FIL+ LDN storage deals.
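[Editor's note] A worked version of that arithmetic; the 100-CID SP is hypothetical, as above:

```js
// Network-wide rate derived earlier: ~0.34 requests/sec per listed CID.
const REQ_PER_SEC_PER_CID = 0.34
const TASKS_PER_ROUND = 4000

// Hypothetical large SP: 100 of its CIDs land on the 4000-CID task list.
const cidsOnList = 100
const dealShare = cidsOnList / TASKS_PER_ROUND     // 0.025 -> 2.5% of deals
const reqPerSec = cidsOnList * REQ_PER_SEC_PER_CID // ≈ 34 req/sec to this SP
console.log({ dealShare, reqPerSec })
```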

Anyhow, I don't have a strong opinion here. If you think we are making too many requests, then I am happy to tweak the parameters to reduce that number. What would be a reasonable target?

juliangruber commented 5 months ago

I don't have an informed opinion here. I'm happy to wait until someone complains, if that ever happens. Do you have a number how many req/sec the average storage provider will be receiving?

bajtos commented 5 months ago

I don't have an informed opinion here. I'm happy to wait until someone complains, if that ever happens.

👍🏻

Do you have a number how many req/sec the average storage provider will be receiving?

We are not tracking this information right now. We don't have SP<->retrieval association yet. But maybe we can use provider_address as the proxy value and calculate how many requests we send to the same provider_address. WDYT?
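[Editor's note] A sketch of what that aggregation could look like; the measurement shape and the provider_address values are illustrative, not the actual spark-api schema:

```js
// Count retrieval requests per provider_address as a proxy for per-SP load.
// `measurements` stands in for whatever record set spark-api stores per round.
function requestsPerProvider (measurements) {
  const counts = new Map()
  for (const { provider_address } of measurements) {
    counts.set(provider_address, (counts.get(provider_address) ?? 0) + 1)
  }
  return counts
}

// Illustrative data: two providers, three measurements.
console.log(requestsPerProvider([
  { provider_address: '/dns/sp-one.example/tcp/443' },
  { provider_address: '/dns/sp-one.example/tcp/443' },
  { provider_address: '/dns/sp-two.example/tcp/443' }
]))
// Map(2) { '/dns/sp-one.example/tcp/443' => 2, '/dns/sp-two.example/tcp/443' => 1 }
```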

juliangruber commented 5 months ago

That would be useful to know, yes!

bajtos commented 5 months ago

Created https://github.com/filecoin-station/roadmap/issues/64 and added it to M4.2.

bajtos commented 5 months ago

There is another constant affecting the load: APPROX_ROUND_LENGTH_IN_MS in the Spark checker source code. Because we kept it unchanged, the checker nodes used the same delay between tasks as before.

I opened a PR to change that constant to match the actual round length.
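[Editor's note] For context, a minimal sketch of how such a constant can drive the spacing between tasks; the even-spacing formula is my assumption, not the actual checker code:

```js
// With the old value (60 minutes) nodes spaced their tasks as if rounds were
// still hourly; the new value matches the 20-minute rounds.
const APPROX_ROUND_LENGTH_IN_MS = 20 * 60 * 1000 // updated round length
const MAX_TASKS_PER_NODE = 60

// Spreading a full task quota evenly over one round:
const delayBetweenTasks = APPROX_ROUND_LENGTH_IN_MS / MAX_TASKS_PER_NODE
console.log(delayBetweenTasks) // 20000 ms -> one task every ~20 seconds
```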