stellar / go

Stellar's public monorepo of go code
https://stellar.org/developers
Apache License 2.0

services/horizon: Remove --paralleljobsize config parameter #5468

Open urvisavla opened 1 week ago

urvisavla commented 1 week ago

This is a breaking change so it would be good to include it in the upcoming protocol 22 release.


Original Description (Keeping it for context): TL;DR For reingestion using the BufferedStorageBackend, the ParallelJobSize parameter is currently set to a default of 100. This creates a conflict with the buffer_size parameter in the BufferedStorageBackend configuration. To fix this, we should either remove the ParallelJobSize parameter for reingestion using BufferedStorageBackend or set it to a larger value by default.

More info: The ParallelJobSize parameter determines how many ledgers are processed in each task during parallel ingestion. It is set to 100k by default for Captive Core ingestion. This number was determined based on how many ledgers Captive Core can efficiently replay at once. However, when using the BufferedStorageBackend for reingestion, ParallelJobSize doesn’t really matter because the BufferedStorageBackend is stateless and isn’t affected by the number of ledgers assigned to it.

The BufferedStorageBackend has a buffer_size config parameter, which determines how many files are kept in memory. By setting ParallelJobSize to 100, we limit the usefulness of buffer_size. For example, even if buffer_size is set to 1000, each task will still only process 100 ledgers due to the ParallelJobSize setting.

While ParallelJobSize doesn’t directly impact performance with the BufferedStorageBackend, it could be helpful to set it to a larger value like 100k for more efficient reingestion of large ledger ranges. When reingesting a larger ledger range, the older ledgers are smaller and faster to process. This means some workers might finish their tasks faster than others. By dividing the workload into smaller tasks, those faster workers can take on additional work, resulting in better efficiency.
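The interaction described above can be sketched as follows. This is a simplified illustration, not the actual Horizon code; the function name and the fixed job size are assumptions:

```go
package main

import "fmt"

// splitRange divides a ledger range [start, end] into contiguous jobs of at
// most jobSize ledgers each, mirroring (as a sketch) how parallel reingestion
// hands out work to tasks.
func splitRange(start, end, jobSize uint32) [][2]uint32 {
	var jobs [][2]uint32
	for lo := start; lo <= end; lo += jobSize {
		hi := lo + jobSize - 1
		if hi > end {
			hi = end
		}
		jobs = append(jobs, [2]uint32{lo, hi})
	}
	return jobs
}

func main() {
	// With ParallelJobSize=100, even a buffer_size of 1000 is moot: each
	// job spans only 100 ledgers, so the buffer never holds more than 100
	// ledgers' worth of files for that job.
	for _, j := range splitRange(1, 350, 100) {
		fmt.Printf("job: ledgers %d..%d (%d ledgers)\n", j[0], j[1], j[1]-j[0]+1)
	}
}
```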

tamirms commented 1 week ago

@urvisavla does the buffer_size config parameter have a default value?

You mentioned:

To fix this, we should either remove the ParallelJobSize parameter for reingestion using BufferedStorageBackend or set it to a larger value by default.

What about the option of overriding buffer_size so that it is equal to ParallelJobSize?

urvisavla commented 1 week ago

@urvisavla does the buffer_size config parameter have a default value?

Not currently but Shawn's PR adds default values based on the datastore schema https://github.com/stellar/go/pull/5462

You mentioned:

To fix this, we should either remove the ParallelJobSize parameter for reingestion using BufferedStorageBackend or set it to a larger value by default.

What about the option of overriding buffer_size so that it is equal to ParallelJobSize?

Currently, if buffer_size is set above 100, it is effectively overridden by ParallelJobSize (=100) because each task will process a range of 100 (ParallelJobSize) ledgers. The Ledger Buffer never queues ledgers past the end of the specified range.

Explaining the interaction between ParallelJobSize and buffer_size will be challenging, and most of the time reingestion performance using the BufferedStorageBackend will depend on setting the right buffer_size.

The best solution, imo, would be to deprecate ParallelJobSize entirely. I doubt many users know how to use it effectively even for Captive Core. We can set ParallelJobSize to 100,000 by default for both Captive Core and BufferedStorageBackend reingestion. This way, users only need to focus on tuning two things: 1) buffer_size and num_workers (download) based on their data lake schema, and 2) the number of parallel reingestion workers depending on their hardware; beefier machines can handle more parallel reingestion workers.

@sreuland @mollykarcher

tamirms commented 1 week ago

Currently, if buffer_size is set above 100, it is effectively overridden by ParallelJobSize (=100) because each task will process a range of 100 (ParallelJobSize) ledgers. The Ledger Buffer never queues ledgers past the end of the specified range.

ok, I understand now, thanks for the explanation!

We can set ParallelJobSize to 100,000 by default for both Captive Core and BufferedStorageBackend reingestion.

I looked through the code and it seems difficult to determine the effect of the ParallelJobSize parameter because it turns out the parameter is just a suggestion. We also have a MaxLedgerPerFlush configuration parameter which further complicates the picture. So, I think we need to rethink ParallelJobSize and try to simplify how the value is derived.

But, I don't think the buffer_size in the BufferedStorageBackend toml file should influence the batch size of each reingestion worker. IMO it makes more sense if the buffer_size in the BufferedStorageBackend config is set to the minimum between the value present in the toml file and the batch size of each reingestion worker.
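The suggestion above amounts to a simple clamp. A minimal sketch, with hypothetical names rather than the actual Horizon identifiers:

```go
package main

import "fmt"

// clampBufferSize sketches the proposal: the BufferedStorageBackend's
// effective buffer_size is the minimum of the value in the toml file and
// the batch size of each reingestion worker, since buffering past the end
// of a worker's range is wasted.
func clampBufferSize(tomlBufferSize, workerBatchSize uint32) uint32 {
	if workerBatchSize < tomlBufferSize {
		return workerBatchSize
	}
	return tomlBufferSize
}

func main() {
	// buffer_size=1000 in the toml, but each worker only processes a
	// 100-ledger batch, so the effective buffer size is 100.
	fmt.Println(clampBufferSize(1000, 100))
}
```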

urvisavla commented 6 days ago

I looked through the code and it seems difficult to determine the effect of the ParallelJobSize parameter because it turns out the parameter is just a suggestion. We also have a MaxLedgerPerFlush configuration parameter which further complicates the picture. So, I think we need to rethink ParallelJobSize and try to simplify how the value is derived.

Yes, I was suggesting that the parallel job size should not be user configurable; it's something we can set to a fixed value or as you mentioned calculate internally.

But, I don't think the buffer_size in the BufferedStorageBackend toml file should influence the batch size of each reingestion worker.

I'm not suggesting that the buffer_size should influence the batch size, but that the reverse is true.

IMO it makes more sense if the the buffer_size in the BufferedStorageBackend config is set to the minimum between the value present in the toml file and the batch size of each reingestion worker.

Agreed, this is essentially the implicit effect, but we can make it explicit.

tamirms commented 6 days ago

Yes, I was suggesting that the parallel job size should not be user configurable; it's something we can set to a fixed value or as you mentioned calculate internally.

sounds good! MaxLedgerPerFlush will ensure that ingestion does not consume too much memory, so it should be safe to derive ParallelJobSize from the size of the total ledger range and the number of ingestion workers. I am on board with your plan to deprecate the ParallelJobSize parameter.
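One way to derive the job size internally, as suggested above, is to split the total range evenly across workers. This is a hypothetical sketch, not the planned implementation; a real version might use smaller jobs than this to get the load-balancing benefit mentioned earlier:

```go
package main

import "fmt"

// deriveJobSize sketches computing the per-worker batch size internally
// instead of exposing --parallel-job-size: divide the total ledger range
// evenly across workers, rounding up so every ledger is covered.
func deriveJobSize(start, end, workers uint32) uint32 {
	total := end - start + 1
	return (total + workers - 1) / workers // ceiling division
}

func main() {
	// A 1,000,000-ledger range split across 8 reingestion workers.
	fmt.Println(deriveJobSize(1, 1_000_000, 8))
}
```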

sreuland commented 4 days ago

+1 to deprecate the --parallel-job-size cli param in favor of internal optimization, less external config, bonus!

urvisavla commented 8 hours ago

Thanks @tamirms @sreuland. I've updated the description and changed the scope from a bug to a feature.