ipni / storetheindex

A directory of CIDs

Investigate pebble-backed dhstore causing slow ingest rate #2206

Closed masih closed 12 months ago

masih commented 1 year ago

We have had two incidents that resulted in growing ingest lag across providers, both of which were fixed by switching to other dhstore backend instances. Both of these occurred while writing to qiu. During this time, both compaction debt and read amplification in pebble DB seemed normal, but ingest was clearly slow and remained slow despite indexer restarts until the dhstore backend was swapped.

This suggests that the root cause is in dhstore, and not in the indexers or in potential graphsync locking issues.

Review the metrics and investigate why this is happening.

gammazero commented 12 months ago

This was caused by the dhstore node being configured with too low a throughput limit. This caused a buildup of data waiting to be written. Compaction debt and read amplification seemed normal because the data was not getting far enough in the write cycle to affect those metrics. This was fixed by increasing the throughput setting for the dhstore node.
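To illustrate the mechanism described above, here is a minimal toy model of a write queue with a throughput cap. It is only a sketch: the function name, the rates, and the step-based model are assumptions made for illustration and are not taken from dhstore or its configuration. It shows that when data arrives faster than the cap allows it to be written, the backlog grows steadily, while the amount actually reaching the store per step never exceeds the cap, which is consistent with downstream metrics looking normal.

```python
def simulate_backlog(arrival_rate, throughput_limit, steps):
    """Return the backlog size after each step of a toy write queue.

    All parameters are hypothetical units of data per step; this is not
    how dhstore measures or configures throughput.
    """
    backlog = 0
    history = []
    for _ in range(steps):
        backlog += arrival_rate                    # new data arrives
        written = min(backlog, throughput_limit)   # writes are capped by the limit
        backlog -= written                         # only written data leaves the queue
        history.append(backlog)
    return history


# Limit below the arrival rate: the backlog grows every step.
print(simulate_backlog(arrival_rate=120, throughput_limit=100, steps=5))

# Limit above the arrival rate: the backlog stays empty.
print(simulate_backlog(arrival_rate=120, throughput_limit=150, steps=5))
```

In this model, raising the limit above the arrival rate is what drains the queue, which mirrors the fix of increasing the throughput setting.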

Added an alert that fires when the number of providers with a slow ingestion rate exceeds reasonable expectations.
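The alert condition can be sketched roughly as follows. The actual alert rule is not shown in this issue, so the function name, the rate units, and both thresholds below are hypothetical; this only illustrates the idea of alerting on the count of slow providers rather than on any single provider.

```python
def slow_provider_alert(ingest_rates, min_rate, max_slow):
    """Decide whether to alert based on how many providers ingest slowly.

    ingest_rates: mapping of provider ID to its ingest rate (hypothetical units).
    min_rate:     rates below this value count as slow (assumed threshold).
    max_slow:     number of slow providers tolerated before alerting (assumed).
    Returns (should_alert, sorted list of slow provider IDs).
    """
    slow = sorted(p for p, rate in ingest_rates.items() if rate < min_rate)
    return len(slow) > max_slow, slow


rates = {"provider-a": 5.0, "provider-b": 0.2, "provider-c": 0.1, "provider-d": 3.0}
should_alert, slow = slow_provider_alert(rates, min_rate=1.0, max_slow=1)
print(should_alert, slow)
```

Counting slow providers, instead of alerting per provider, targets the symptom seen in these incidents: lag growing across many providers at once points at a shared backend rather than at one provider.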