near / queryapi

Near Indexing as a Service

Dedicated control loops per Indexer #811

Open morgsmccauley opened 2 weeks ago

morgsmccauley commented 2 weeks ago

Our current Control Loops manage all Indexers. This is inconvenient for the following reasons:

  1. Individual Indexer tasks cannot block - Any blocking task would block all Indexers. This specifically affects de-/provisioning, where we must start the task and then poll it on every subsequent loop iteration.
  2. It's slow - Indexers are processed serially, so as the number of Indexers grows, the loop gets slower and slower.
  3. Managing Indexer Lifecycle is complicated - We currently have the New/Existing/Deleted states, but an Indexer has many more states than that, and it's hard to retrofit them in a non-blocking way.
  4. Error prone - All errors must be handled gracefully, otherwise we risk affecting the synchronisation of all other Indexers.

I feel we have outgrown the current control loop. Rather than have a single control loop for all Indexers, I'm thinking we can have dedicated loops for each of them. We could spawn a new task for each Indexer, which then manages its own lifecycle. Then each Indexer is free to wait for as long as it wants, without impacting other Indexers. This would allow us to handle the blocking provisioning step much more elegantly and also parallelise the computation.
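To make the proposal concrete, here is a minimal sketch of one task per Indexer, written in Python with `asyncio` purely for illustration. It is not the project's implementation: `provision`, `indexer_lifecycle`, and `supervise` are hypothetical names, and the sleep stands in for a real provisioning call. The point is that a slow or failing Indexer only affects its own task.

```python
import asyncio


async def provision(name: str, delay: float) -> None:
    # Hypothetical stand-in for a long-running provisioning step.
    # A negative delay simulates a provisioning failure.
    if delay < 0:
        raise ValueError("provisioning failed")
    await asyncio.sleep(delay)


async def indexer_lifecycle(name: str, delay: float, events: list[str]) -> None:
    # Each Indexer owns its lifecycle and may wait as long as it needs.
    events.append(f"{name}: provisioning")
    await provision(name, delay)
    events.append(f"{name}: running")


async def supervise(indexers: dict[str, float]) -> list[str]:
    # Spawn one dedicated task per Indexer instead of iterating serially.
    events: list[str] = []
    tasks = [
        asyncio.create_task(indexer_lifecycle(name, delay, events))
        for name, delay in indexers.items()
    ]
    # return_exceptions=True keeps one failure from cancelling the others.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    for name, result in zip(indexers, results):
        if isinstance(result, Exception):
            events.append(f"{name}: failed ({result})")
    return events


if __name__ == "__main__":
    for line in asyncio.run(supervise({"slow": 0.2, "fast": 0.0, "bad": -1.0})):
        print(line)
```

In this sketch the "fast" Indexer reaches its running state while "slow" is still provisioning, and the error from "bad" is reported without interrupting either of them.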

pkudinov commented 2 weeks ago

Add a Grafana metric for the time each control loop takes and, based on this, determine the priority of this ticket.
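As a rough illustration of that measurement, the sketch below times each loop iteration using only the Python standard library. The names `timed`, `loop_durations`, and `summarise` are hypothetical, and a real setup would presumably export the samples to whatever metrics backend Grafana reads from rather than keep them in memory.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# In-memory store of iteration durations in seconds, keyed by loop name.
loop_durations: dict[str, list[float]] = defaultdict(list)


@contextmanager
def timed(loop_name: str):
    # Record how long the wrapped block takes, even if it raises.
    start = time.perf_counter()
    try:
        yield
    finally:
        loop_durations[loop_name].append(time.perf_counter() - start)


def summarise(loop_name: str) -> dict[str, float]:
    # Basic statistics that could back a dashboard panel.
    samples = loop_durations[loop_name]
    return {
        "count": len(samples),
        "max": max(samples),
        "mean": sum(samples) / len(samples),
    }


if __name__ == "__main__":
    for _ in range(3):
        with timed("control_loop"):
            time.sleep(0.01)  # stand-in for one pass over all Indexers
    print(summarise("control_loop"))
```

Watching how the maximum and mean change as Indexers are added would show whether the serial loop is slowing down enough to justify prioritising this ticket.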