Netflix / eureka

AWS Service registry for resilient mid-tier load balancing and failover.
Apache License 2.0

2.x State information propagation in the interest channel to the client #384

Closed tbak closed 9 years ago

tbak commented 9 years ago

This is a proposal for state information propagation in the interest channel to the client.

Problem to solve

As the interest stream subscription is reactive, there is no way for the client to know how much relevant data is available when the subscription starts. If the client is a load balancer, it is critical to wait until the available pool of servers is loaded before application requests are allowed, to avoid overloading the single server that happens to be first in the list. It may also be advantageous to know about bigger changes in the system topology (such as a scale-up), and to do system reconfiguration only when all or a majority of the new servers have been received via the change notification stream.

Proposed solution

Generate buffering sentinels after each last known item in the change notification stream:

Figure 1. Buffer sentinels in change notification stream

As buffer sentinels are not optimal from the transport perspective, the internal implementation uses batching markers, which are transmitted over the wire together with the regular change notification data. This concept is depicted in the figure below:

Figure 2. Batching markers in change notification stream

There are two possible sources of batching hints:

The concept is depicted in this figure:

Figure 3. Batching markers implementation

One more level of complexity is added by the way interests are handled in the transport channel and the index registry. For efficiency, to avoid sending and processing the same data when different subscribers ask for the same or overlapping data, interests are handled at an atomic level. For example:
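Here is a hedged illustration of the atomic handling (hypothetical types and names, not the real interest API): a composite interest is decomposed into one atomic interest per application, so two subscribers whose interests overlap share the same atomic streams, and buffering markers are tracked per atomic interest.

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class AtomicInterestExample {
    public static void main(String[] args) {
        // Composite interests expressed as sets of application names (hypothetical model).
        Set<String> subscriberOne = new HashSet<>(Arrays.asList("A", "B")); // atomic interests {A}, {B}
        Set<String> subscriberTwo = new HashSet<>(Arrays.asList("B", "C")); // atomic interests {B}, {C}

        // Application B is requested by both subscribers, so its change stream (and its
        // buffering markers) is produced once and shared, rather than duplicated per subscriber.
        Set<String> shared = new HashSet<>(subscriberOne);
        shared.retainAll(subscriberTwo);
        System.out.println("Atomic interests shared by both subscribers: " + shared); // prints [B]
    }
}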

Batch hints are implemented as new kinds of ChangeNotification:

public class ChangeNotification<T> {
    // Buffer and BufferingSentinel are the new batch-hint kinds introduced by this proposal.
    public enum Kind {Add, Delete, Modify, Buffer, BufferingSentinel}
    ...
}

Internally, a derived class StreamStateNotification is used to carry the additional information; it is not, however, visible to the client.
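For illustration, a subscriber might treat the two new kinds purely as batch delimiters, roughly as follows (a sketch only: the getKind() accessor and the flush behaviour are assumptions, not part of this proposal):

import java.util.ArrayList;
import java.util.List;

// Sketch of a client-side subscriber that buffers data notifications and applies
// them as one atomic batch when the BufferingSentinel arrives.
class BatchingSubscriber<T> {
    private final List<ChangeNotification<T>> pendingBatch = new ArrayList<>();

    void onNotification(ChangeNotification<T> notification) {
        switch (notification.getKind()) {            // getKind() assumed for this example
            case Buffer:
                // A batch is in progress; keep accumulating, nothing to apply yet.
                break;
            case BufferingSentinel:
                flush();                              // the batch is complete; apply it atomically
                break;
            default:                                  // Add, Delete, Modify carry instance data
                pendingBatch.add(notification);
                break;
        }
    }

    private void flush() {
        // E.g. refresh the load balancer's server pool with the whole batch at once.
        System.out.println("Applying batch of " + pendingBatch.size() + " changes");
        pendingBatch.clear();
    }
}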

qiangdavidliu commented 9 years ago

For the proposed hint markers, it seems the Buffer hint is always sent in all of the cases described. From a behaviour point of view, what does the Buffer hint offer? It seems that consuming clients and/or operators only need to listen for the finishBuffer hint for an optimised buffering experience, regardless of whether there are prior Buffer hints. E.g. the consumer should be able to apply a collection operator to the stream that emits a new List each time it sees a finishBuffer hint, and possibly timeout otherwise.
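A rough sketch of such a collection operator, assuming RxJava and stand-in notification types rather than the actual Eureka interest API:

import java.util.List;
import java.util.concurrent.TimeUnit;

import rx.Observable;
import rx.subjects.PublishSubject;

public class BufferPerFinishHint {

    // Minimal stand-ins for the real notification types, for illustration only.
    enum Kind { Add, Delete, Modify, BufferingSentinel }

    static class Notification {
        final Kind kind;
        final String instanceId;                  // null for the sentinel marker
        Notification(Kind kind, String instanceId) { this.kind = kind; this.instanceId = instanceId; }
    }

    public static void main(String[] args) {
        // A hot stream standing in for the interest channel.
        PublishSubject<Notification> interestStream = PublishSubject.create();

        // Close a buffer on every finishBuffer/sentinel hint, or after 30s of silence.
        Observable<Notification> batchBoundary = Observable.merge(
                interestStream.filter(n -> n.kind == Kind.BufferingSentinel),
                interestStream.debounce(30, TimeUnit.SECONDS));

        // Emit a new List of data notifications each time a boundary is seen.
        Observable<List<Notification>> batches = interestStream
                .filter(n -> n.kind != Kind.BufferingSentinel)
                .buffer(batchBoundary);

        batches.subscribe(batch -> System.out.println("New snapshot: " + batch.size() + " changes"));

        interestStream.onNext(new Notification(Kind.Add, "server-1"));
        interestStream.onNext(new Notification(Kind.Add, "server-2"));
        interestStream.onNext(new Notification(Kind.BufferingSentinel, null));   // -> List of 2
        interestStream.onNext(new Notification(Kind.Add, "server-3"));
        interestStream.onNext(new Notification(Kind.BufferingSentinel, null));   // -> List of 1
    }
}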

On the client side, it seems to make more sense that the source of the hints is only the registry, as it should be the source of truth for all data. On the server side this is naturally the case, and on the client side the hints should be a merge of the local registry hints plus the server-side hints, if any are available. If we make sure the hints are only generated by the registry, we should then be able to merge the multiple hints emitted by atomic interests for composite forInterests, so that clients only receive a single finishBatching marker (the logic would be that a finishBuffer is emitted once all atomic FinishBuffers are received at the merge point).
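A minimal sketch of that merge-point logic, assuming RxJava and purely hypothetical per-atomic-interest signals:

import rx.Observable;
import rx.subjects.PublishSubject;

public class CompositeFinishMarker {
    public static void main(String[] args) {
        // Hypothetical finishBuffer signals of the atomic interests making up a composite interest.
        PublishSubject<String> finishInterestA = PublishSubject.create();
        PublishSubject<String> finishInterestB = PublishSubject.create();

        // Emit a single composite finishBuffer only after every atomic interest has reported one.
        Observable<String> compositeFinish = Observable.zip(
                finishInterestA.first(),
                finishInterestB.first(),
                (a, b) -> "finishBuffer");

        compositeFinish.subscribe(marker -> System.out.println("composite " + marker));

        finishInterestA.onNext("finishBuffer(A)");   // nothing emitted yet
        finishInterestB.onNext("finishBuffer(B)");   // composite marker emitted here
    }
}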

NiteshKant commented 9 years ago

I think it will be useful to provide code samples showing how a client will consume this API. There are a lot of complex constructs, such as the conditional batching and non-batching modes, and I would like to see how this manifests on the consumer end.

A few implementation-related questions:

I am not really convinced about the need for the "Batch" hint; however, I can see what you are trying to achieve, i.e. the ability for the same client to switch between batching and non-batching modes. Whether the complexity is worth it would be clearer to see in a code example.

@qiangdavidliu

the consumer should be able to apply a collection operator to the stream that emits a new List each time it sees a finishBuffer hint, and possibly timeout otherwise.

I think there is value in having an API where batch or non-batch is not a choice that the client has to make. In this model, the client will always batch, and non-batch interaction will be timeout-based.
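For example, a minimal sketch of that timeout-based mode, assuming RxJava and a placeholder stream that carries no buffer hints at all:

import java.util.List;
import java.util.concurrent.TimeUnit;

import rx.Observable;
import rx.subjects.PublishSubject;

public class TimeoutOnlyBatching {
    public static void main(String[] args) throws InterruptedException {
        // A stream with no buffer hints at all (e.g. talking to an older server).
        PublishSubject<String> changeNotifications = PublishSubject.create();

        // The client still consumes batches; they are simply closed on a fixed time
        // window instead of explicit markers.
        Observable<List<String>> batches = changeNotifications
                .buffer(200, TimeUnit.MILLISECONDS)
                .filter(batch -> !batch.isEmpty());

        batches.subscribe(batch -> System.out.println("Batch: " + batch));

        changeNotifications.onNext("Add server-1");
        changeNotifications.onNext("Add server-2");
        Thread.sleep(500);                        // let the time window close -> one batch of 2
    }
}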

tbak commented 9 years ago

Implemented by PR #403