Closed chrisdennis closed 8 years ago
Depends on #1077
As commented on PR #1120, there is still some cleanup to do: the clear message introduced a ServerStoreOp that does not require a key, yet for the moment the design forces it to specify one.
My previous comment is on the wrong issue - the clear message was introduced by #1075 and so the fix needs to happen under that issue.
Design Notes on Cache Invalidation
There are four main problem areas to consider when broadcasting or multicasting a change (such as an invalidation) to clients:
Problem 1: Which clients to notify of the change?
Should we broadcast the change to all clients connected to the store, or should we multicast only to those clients that have expressed an interest in that particular mapping? A client's interest becomes known to the server when the client performs a get on the store for a given hash.
For multicast, we will need to track the client list for a given hash. Several space-efficient techniques are possible, including techniques that may result in some false positives. Some of these techniques are considered in these design notes.
Problem 2: Cache consistency configuration
For strong consistency, the client that issued the change must be notified after all the other clients have acknowledged the change notification to the server. The mutation API on the client (such as put) must wait for this notification before returning an acknowledgement for the mutation to its caller. For eventual consistency, the client issuing the change does not have to wait. Also, for eventual consistency there is no need for the server to collect acknowledgements from the clients: it can be a pure 'fire and forget' interface, as the underlying transport offers guaranteed delivery as long as the client is up and connected. If the client is disconnected, it will invalidate all its mappings anyway.
Because of the above, the server side must know whether the store is configured for strong or eventual consistency and act accordingly. Also, since the server side deals only with blobs, we need to decide what data to send when the change is broadcast. Should we send just the hash of the key and expect the client to somehow derive the key(s) from the hash? Should we extract the key from the blob and send that? Or should we send the entire blob when broadcasting the change?
All these questions are answered in the design notes after making some assumptions.
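To make the strong-consistency flow concrete, here is a minimal sketch of the server-side acknowledgement tracking described above. The names (`Consistency`, `InvalidationTracker`) and the use of plain strings for client identifiers are illustrative assumptions, not types from the Ehcache codebase: for eventual consistency the message would simply be fired and forgotten, while for strong consistency the server records which clients still owe an ack and only unblocks the originating client once that set drains.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: the store's configured consistency mode.
enum Consistency { STRONG, EVENTUAL }

/**
 * Hypothetical tracker for strong-consistency invalidations. The server
 * registers the set of clients that were sent the invalidation; as acks
 * arrive, the set shrinks, and the last ack signals that the originating
 * client's mutation can be acknowledged.
 */
final class InvalidationTracker {
  // invalidation id -> clients that have not yet acknowledged
  private final ConcurrentHashMap<Integer, Set<String>> pending = new ConcurrentHashMap<>();

  /** Start tracking a strong invalidation sent to the given clients. */
  void track(int invalidationId, Set<String> clients) {
    Set<String> waiting = ConcurrentHashMap.newKeySet();
    waiting.addAll(clients);
    pending.put(invalidationId, waiting);
  }

  /** Returns true when the last outstanding ack arrives. */
  boolean acknowledge(int invalidationId, String client) {
    Set<String> waiting = pending.get(invalidationId);
    if (waiting == null) {
      return false;
    }
    waiting.remove(client);
    if (waiting.isEmpty()) {
      pending.remove(invalidationId);
      return true;  // originating client can now be unblocked
    }
    return false;
  }
}
```

Nothing is tracked at all in the eventual case, which is what makes it a pure fire-and-forget path.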
Problem 3: Constrained resource usage on the Server
The change broadcast communication from the server to the clients must be completely asynchronous, to avoid holding costly server stage threads waiting for responses from the clients. Voltron mechanisms are available to achieve this asynchronous communication.
Voltron dictates that all communication from the server MUST be asynchronous. This avoids server threads waiting and consuming the resources of the stage, which would affect the overall scalability and responsiveness of the system.
Problem 4: Active/passive failover impact on the invalidations
How does an active/passive failover impact the invalidations? This is still a work in progress.
Considering the effort required to handle all the above, the work will be broken into two iterations, Iteration 1 and Iteration 2.
Iteration 1 Scope:
Handle strong and eventual consistency, but broadcast the change to all 'live' clients that are connected to the store. With this approach there is no need to track clients per hash; we can reuse the client tracking done in EhcacheActiveEntity, which maps all the clients connected to a given store on the server side.
Iteration 2 Scope (May be de-prioritized and taken up much later):
Track client activity per hash on the server so that a change to a mapping can be multicast to only those clients that have the mapping.
Iteration 1 Design and Implementation Notes
The following are the assumptions made for this design. Changes to these assumptions will change the design approach accordingly.
The following sequence diagrams are an addendum to Clifford's diagram of the EhcacheActiveEntity that handles the cache invalidations. This diagram shows the cache invalidation protocol for eventual consistency.
The following diagram shows the cache invalidation protocol for strong consistency.
For the first iteration, no new classes need to be added on the server side; a single private method in EhcacheActiveEntity should do the job. It will interact with Voltron's ClientCommunicatorServerManager API. On the client side, EhcacheClientEntity will also need some changes to connect to the client side of Voltron's ClientCommunicator support.
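The shape of that single server-side method might look like the sketch below. `Communicator` here is a stand-in interface, not the real Voltron communicator API, and client descriptors are modelled as plain strings; the point is only to show a fire-and-forget broadcast to every client attached to the store, skipping the originator.

```java
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

/**
 * Hypothetical sketch of the iteration-1 broadcast method. All names
 * are illustrative; in the real implementation the client set would be
 * the tracking already done in EhcacheActiveEntity, and sends would go
 * through Voltron's client communicator support.
 */
final class InvalidationBroadcaster {
  /** Stand-in for Voltron's asynchronous, no-response send. */
  interface Communicator {
    void sendNoResponse(String clientDescriptor, byte[] message);
  }

  private final Set<String> attachedClients = new CopyOnWriteArraySet<>();
  private final Communicator communicator;

  InvalidationBroadcaster(Communicator communicator) {
    this.communicator = communicator;
  }

  void clientAttached(String client) { attachedClients.add(client); }
  void clientDetached(String client) { attachedClients.remove(client); }

  /** Broadcast an invalidation message to every attached client except the originator. */
  void broadcastInvalidation(String originator, byte[] message) {
    for (String client : attachedClients) {
      if (!client.equals(originator)) {
        communicator.sendNoResponse(client, message);
      }
    }
  }
}
```

Because the send is no-response, the server stage thread never blocks here, matching the asynchrony constraint from Problem 3.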
Issues to be sorted out pre-implementation for iteration 1.
Iteration 2 Design and Implementation Notes
In iteration 2, we can add the multicasting feature to ensure that a change is broadcast only to the clients that have cached the mapping in their caching tiers. The following sequence diagram illustrates the interactions between classes to achieve this multicasting.
< TODO: Work In Progress. This diagram is still work in progress and may be taken up for further work only if Iteration 2 is prioritized at some point >
Bloom Filters for tracking clients per hash
A space-efficient way to track clients per mapping is to use Bloom filters. When using Bloom filters we need to consider the following complexity:
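A minimal sketch of the idea, with hypothetical names (nothing here is from the Ehcache codebase): a small Bloom filter per hash records which client ids may hold the mapping. False positives only cause a redundant invalidation, which is safe; false negatives cannot occur, so no client misses an invalidation it needs.

```java
import java.util.BitSet;

/**
 * Illustrative per-hash Bloom filter over numeric client ids, using
 * simple double hashing (h1 + i * h2) to derive the bit positions.
 */
final class ClientBloomFilter {
  private final BitSet bits;
  private final int size;
  private final int hashCount;

  ClientBloomFilter(int size, int hashCount) {
    this.bits = new BitSet(size);
    this.size = size;
    this.hashCount = hashCount;
  }

  private int index(long clientId, int i) {
    int h1 = Long.hashCode(clientId);
    int h2 = Long.hashCode(Long.rotateLeft(clientId, 31)) | 1; // force odd step
    return Math.floorMod(h1 + i * h2, size);
  }

  /** Record that this client has fetched the mapping. */
  void add(long clientId) {
    for (int i = 0; i < hashCount; i++) {
      bits.set(index(clientId, i));
    }
  }

  /** May return true for a client that was never added (false positive). */
  boolean mightContain(long clientId) {
    for (int i = 0; i < hashCount; i++) {
      if (!bits.get(index(clientId, i))) {
        return false;
      }
    }
    return true;
  }
}
```

The open question this leaves, and part of the complexity alluded to above, is removal: standard Bloom filters cannot delete entries when a client evicts or disconnects, so the filter would need periodic rebuilding or a counting variant.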
Note: The above is a work in progress.
Using a client bit map per hash
If we can safely assume that there will not be many clients per mapping (although there may be thousands of clients connected to a store), we can keep a bit map of clients per hash. As the number of clients increases, the size of the bit map will increase, but by using a combination of a single-byte number and an 8-bit bit map, we can reduce the space requirement per hash for this client bit map. With this we can track a maximum of 65536 clients per mapping in a structure that grows dynamically (one short at a time per mapping). If there are more than 65536 clients, we can safely broadcast to the clients above this number, so this approach does not limit the number of clients that can connect to a store in any way.
The above requires that when a client connects, its client descriptor is mapped to an integer that can be used as an index into a bit map. When a client disconnects, the index slot is freed. This allows for two things:
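The descriptor-to-index mapping this relies on can be sketched as follows. The class name and the use of `Object` for the client descriptor are illustrative assumptions; the essential behaviour is that the lowest free index is handed out on connect and recycled on disconnect, which keeps the per-hash bit maps dense.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical registry mapping client descriptors to small integer
 * indices. A per-hash BitSet of these indices then records which
 * clients hold a given mapping, growing only as high as the largest
 * index actually in use.
 */
final class ClientIndexRegistry {
  private final Map<Object, Integer> indexByClient = new HashMap<>();
  private final BitSet usedSlots = new BitSet(); // set bit = index in use

  /** Called when a client connects: assign the lowest free index. */
  synchronized int register(Object clientDescriptor) {
    int index = usedSlots.nextClearBit(0);
    usedSlots.set(index);
    indexByClient.put(clientDescriptor, index);
    return index;
  }

  /** Called when a client disconnects: free its slot for reuse. */
  synchronized void unregister(Object clientDescriptor) {
    Integer index = indexByClient.remove(clientDescriptor);
    if (index != null) {
      usedSlots.clear(index);
    }
  }

  /** Returns the client's index, or null if it is not registered. */
  synchronized Integer indexOf(Object clientDescriptor) {
    return indexByClient.get(clientDescriptor);
  }
}
```

Reusing freed slots is what keeps the indices, and therefore the per-mapping bit maps, compact even as clients churn.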
TODO: Create diagrams and descriptions to explain the two approaches above, and describe the pros and cons of each. This is still a work in progress.