Initial brainstorm document: https://docs.google.com/document/d/1sAYCfuk23gWI93CaR0rLyhiufV3mdOkYPkVdNseXgKg/edit#
We need to push this item back to Sprint 12 because we have paused it in favor of finishing our invoice cycle on EU1. Michal, who is leading this project, will also be on PTO next week. The only item left is finishing the GC deployment to EU1.
As of 4/21/23, all customer-facing satellites are running on ranged loops! The one exception is that GC on EU1 and US1 is running on a single range due to an OOM issue.
@shaupt131 @iglesiasbrandon @mniewrzal we need to update this roadmap item on 4/24 and decide if we can officially close it out!
Analysis from @mniewrzal:
Regarding the question of whether we can scale 3.5x and whether the ranged loop will handle that: let's take US1 for analysis. Currently we have ~850M segments in the table. The ranged loop for US1 is processing 50k-60k segments per second using 3 ranges. If we take the lower value, it will finish in about 4.6h. If we had 3.5x the segments, that would be around 3B entries. Even at the same rate, finishing the loop with that many segments should take ~17h. Far from ideal, but we have had worse times in the past. However, if we just double the number of ranges to 6, we should be able to process everything in 8-9 hours. We still have some tickets to speed up performance without additional resources, but even now we should be able to scale just by adding more CPU and memory to the ranged loop instance. For example, we did that for EUN1, and there we are processing 100-110k segments per second.
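To make the arithmetic above easy to re-run, here is a minimal Go sketch that just divides the segment count by the aggregate throughput. The segment counts (~850M today, ~3B at 3.5x) and rates (~50k/s with 3 ranges, ~100k/s with 6) are the figures quoted above; the helper is made up for illustration, not code from the repo.

```go
package main

import (
	"fmt"
	"time"
)

// loopDuration estimates how long one full ranged-loop pass takes, given the
// total segment count and the aggregate processing rate (segments/second
// summed across all ranges).
func loopDuration(totalSegments, segmentsPerSecond float64) time.Duration {
	return time.Duration(totalSegments / segmentsPerSecond * float64(time.Second))
}

func main() {
	fmt.Println(loopDuration(850e6, 50e3)) // US1 today: ~850M segments, 3 ranges
	fmt.Println(loopDuration(3e9, 50e3))   // 3.5x growth, same 3 ranges
	fmt.Println(loopDuration(3e9, 100e3))  // 3.5x growth, 6 ranges
}
```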
We can mark this roadmap item as completed. Any small follow-up tasks will be done as one-offs. @shaupt131
Summary:
Right now, for many background tasks such as accounting, repair, auditing, garbage collection, etc., we process all objects sequentially, one by one, and as a result, the number of objects directly influences how long it takes to run one of these jobs.
Pain Point:
As the number of objects grows, the accounting granularity will get worse and worse, and eventually, we won't be able to do daily accounting rollups unless we intervene. This also impacts how quickly we can react in repair scenarios, and how frequently objects may get audited. This is a pressure cooker with no release valve. We need to build the release valve.
Intended Outcome:
We can get through each task that observes the metainfo loop (accounting, repair, auditing, GC, etc.) without requiring a single sequential sweep of all objects. We would like to be able to horizontally scale the metainfo loop so that we can have more subsections running in parallel.
This will probably require individual solutions for each of the existing metainfo observers - garbage collection will likely be handled differently than accounting or repair checking, for instance.
How will it work?
The broad approach is to have multiple cores running concurrently, processing portions of the metainfo table at the same time.
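As a rough illustration of that approach, here is a hedged Go sketch of a ranged loop: the key space is split into N disjoint ranges and each range is streamed to every observer concurrently, so one pass over the table no longer has to be a single sequential sweep. The `SegmentProvider` and `Observer` names here are hypothetical stand-ins, not the actual storj/storj API, and the real implementation will differ in details.

```go
package rangedloopsketch

import (
	"context"

	"golang.org/x/sync/errgroup"
)

// Segment stands in for one row of the segments table.
type Segment struct{ StreamID [16]byte }

// SegmentProvider streams every segment whose key falls into the given range
// (rangeIndex out of totalRanges) to the callback.
type SegmentProvider interface {
	Range(ctx context.Context, rangeIndex, totalRanges int, fn func(Segment) error) error
}

// Observer is a background task (accounting, repair checker, GC, ...) that
// consumes segments as the loop walks the table.
type Observer interface {
	Process(ctx context.Context, seg Segment) error
}

// RunRangedLoop processes the whole table once, with totalRanges workers
// walking disjoint portions of the key space in parallel.
func RunRangedLoop(ctx context.Context, provider SegmentProvider, observers []Observer, totalRanges int) error {
	group, ctx := errgroup.WithContext(ctx)
	for i := 0; i < totalRanges; i++ {
		i := i // capture loop variable for the goroutine
		group.Go(func() error {
			return provider.Range(ctx, i, totalRanges, func(seg Segment) error {
				for _, obs := range observers {
					if err := obs.Process(ctx, seg); err != nil {
						return err
					}
				}
				return nil
			})
		})
	}
	return group.Wait()
}
```

With this shape, scaling is a matter of raising `totalRanges` (and giving the instance more CPU/memory), which matches the EU1/US1 numbers discussed above.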
For metainfo observers where we can more easily eliminate the need for the metainfo loop altogether (for example, with efficient reverse indexes), we should spend some timeboxed research time evaluating that as well.
Milestone: https://github.com/storj/storj/milestone/25