cloudfoundry / diego-notes

Diego Notes
Apache License 2.0
23 stars 7 forks source link

Converger should prune ActualLRPs/Tasks that aren't actually on their assigned cells #10

Closed onsi closed 9 years ago

onsi commented 9 years ago

Today the Rep is responsible for identifying ActualLRPs/Tasks that the BBS thinks are running on the Rep's Cell but aren't actually running on the Rep's Cell.

This has two deleterious effects:

  1. It complicates the Rep's code & responsibilities. Instead of strictly flowing information from the containers into the BBS it must also fetch entries in the BBS and then identify Tasks/ActualLRPs that do not correspond to containers.
  2. The Rep must periodically fetch all Tasks/ActualLRPs associated with it. This is full dump of the entire ActualLRP and Task table in etcd.

Meanwhile, the Converger is periodically fetching all the state in etcd and acting to ensure that it attains eventual consistency. One of the Converger's responsibilities is to prune ActualLRPs and Tasks that are associated with failed Cells.

It seems natural that the Converger should also reach out directly to the Cells to verify that they have containers for all ActualLRPs and Tasks that the BBS thinks they have.

This proposal captures this change.

onsi commented 9 years ago

Added a story here: https://www.pivotaltracker.com/story/show/85741714

fraenkel commented 9 years ago

This seems perfectly reasonable but has me raising other questions about the Rep and its true role. We have these processes with names that have a variety of capabilities. We have given these capabilities names more like Actor descriptions. If I look at the Rep, I don't quite see its real purpose as a separate process other than an intermediary that provides little direct value. Is the remaining function within the Rep providing value by being in a separate process?

onsi commented 9 years ago

@fraenkel the Rep's history is long and elaborate. It used to be in the executor. We pulled it out early into its own process and have been refining its role ever since. It might be worth considering pulling its (now fairly minimal) footprint into the executor again (and getting rid of the executor API completely?)

tedsuo commented 9 years ago

The only reason the rep is a separate process from the executor was to ensure the bbs/state-machine code didn't get glued too directly to the garden-wrapper code, so they we could work on the two pieces in parallel. But if the rep becomes small enough, and it doesn't over-complicate the executor, it's reasonable to fold it back in to the executor, and no longer have the executor API (only the rep api). So far we've had enough trouble modeling these domains that I've been happy that they're separate. But I hope we eventually arrive with a model that matches the domain and is easy to reason about.

Two other concerns, which are maybe related:

onsi commented 9 years ago

We discussed this at the IPM and then offline is smaller groups. At the heart behind this proposal are two concerns:

  1. code cleanliness
  2. performance (rep pulling in the entire table every 30s will not scale)

1 seems to have been largely resolved at this point. The best solution to 2 is to move away from etcd. We have performance analysis stories coming out that will show us how much we actually need to worry about this. Until then I'm iceboxing this story and reverting this commit.