Open chetanmeh opened 5 years ago
Nice writeup, thanks! Can we potentially normalize between CouchDB and CosmosDB here by listening to the CouchDB changes feed as well for cache invalidation?
I'm not well versed in CouchDB's global replication support. Does it support replication across regions? Or would we read from the primary and handle just the cache invalidation part using its changes feed?
For OpenWhisk setups using CosmosDB it is possible to deploy multiple OpenWhisk clusters in different regions using a multi-region CosmosDB setup. The purpose of this issue is to track the high-level changes required to support such a deployment.
Metadata and CosmosDB
CosmosDB supports multi-region deployments with transparent replication of data across regions. It offers two modes:
Multi Master
- In this mode multiple regions can handle writes locally. When used with OpenWhisk this mode can be enabled for the `activations` collection, as objects in this collection are immutable and hence no conflict resolution needs to be performed.
Single Master
- In this mode only one region handles writes while reads are served locally in each region. This mode would be used for the `whisks` and `subjects` collections.
Separate Account for Collections
As multi-master is needed only for activations, we need to support configuring separate connection details on a per-collection basis. This is now possible via #4198.
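A per-collection configuration could look roughly like the following HOCON sketch. The key names and endpoints here are illustrative assumptions, not the exact layout; see #4198 for the actual configuration:

```hocon
whisk {
  cosmosdb {
    # Default (single-master) account used for whisks and subjects
    endpoint = "https://example-single.documents.azure.com:443/"
    key = ${?COSMOSDB_KEY}
    db = "openwhisk"

    collections {
      # Activations use a separate multi-master account
      WhiskActivation {
        endpoint = "https://example-multi.documents.azure.com:443/"
        key = ${?COSMOSDB_ACTIVATIONS_KEY}
        db = "openwhisk"
      }
    }
  }
}
```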
Cache Invalidation
An OpenWhisk cluster uses a Kafka topic `cacheInvalidation` to communicate changes to any cached entity. Messages on this topic carry the cache key of the changed entity along with the instanceId of the component that made the change.
When deploying multiple separate OpenWhisk clusters which do not share the same Kafka instance, we need a way to propagate cache change events across clusters. For CosmosDB-based setups this can be done using CosmosDB change feed support, which enables reading the changes made to any specific collection.
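The shape of such a message can be sketched as below. The field names (`instanceId`, `key`, `mainId`) are illustrative assumptions based on the description above, not the exact wire format used by the Scala code base:

```python
import json

def cache_invalidation_message(instance_id: str, main_id: str) -> str:
    """Build a cacheInvalidation payload.

    instance_id identifies the component that made the change and
    main_id is the cache key of the changed entity (e.g. "ns/action").
    Field names here are illustrative, not the exact wire format.
    """
    return json.dumps({"instanceId": instance_id, "key": {"mainId": main_id}})

# usage sketch
msg = cache_invalidation_message("controller0", "guest/hello")
parsed = json.loads(msg)
```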
Cache Invalidation Service (#4314)
To propagate changes across clusters we can implement a new service which uses the changefeedprocessor library to listen to changes happening in the `whisks` and `subjects` collections and convert them into Kafka messages sent to the `cacheInvalidation` topic. Each cluster would run its own cache invalidation service, which ensures that changes made by other clusters are picked up and routed to the local cluster's `cacheInvalidation` topic.
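The service's core loop can be sketched as follows; `changes` and `publish` stand in for the change feed processor library and a Kafka producer, and all names here are hypothetical, not the actual API:

```python
import json

def bridge_changes(changes, publish, local_topic="cacheInvalidation"):
    """Convert CosmosDB change feed documents into cache invalidation
    messages and publish them to the local Kafka topic.

    changes is an iterable of changed documents (dicts) from the
    whisks/subjects collections; publish(topic, payload) is a stand-in
    for a Kafka producer. Names are illustrative, not the actual API.
    """
    count = 0
    for doc in changes:
        payload = json.dumps({
            "instanceId": "cacheInvalidationService",  # the service's own id
            "key": {"mainId": doc["_id"]},             # entity cache key
        })
        publish(local_topic, payload)
        count += 1
    return count

# usage sketch: capture published messages in a list instead of Kafka
sent = []
n = bridge_changes([{"_id": "guest/pkg/action"}],
                   lambda topic, payload: sent.append((topic, payload)))
```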
Handling Deletes (#4339)
Per the change feed support docs, there is no direct support for listening to deletes. So we would need to implement such support in our `CosmosDBArtifactStore`, where a `_deleted` flag marks removed documents so that deletes also show up on the change feed.
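One way to surface deletes through the change feed is a soft delete: instead of removing the document, the store updates it with the `_deleted` marker, which the change feed then delivers like any other update. A minimal sketch of this idea, with an in-memory dict standing in for the store:

```python
def soft_delete(store: dict, doc_id: str) -> None:
    """Mark a document as deleted instead of removing it, so the change
    feed (which only observes inserts/updates) still sees the delete."""
    store[doc_id]["_deleted"] = True  # marker checked by feed consumers

def classify_change(doc: dict) -> str:
    """Change feed consumers treat marked documents as delete events."""
    return "delete" if doc.get("_deleted") else "upsert"

# usage sketch
store = {"guest/hello": {"_id": "guest/hello"}}
soft_delete(store, "guest/hello")
```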
Identifying Cluster making the change (#4312)
Controllers pick up changes from the `cacheInvalidation` topic and invalidate a cached entry if the instanceId of the message differs from their own. When propagating changes from other clusters we need to determine whether a change picked up via the CosmosDB change feed originated in the current cluster or in another cluster. For this purpose we need to add, as part of every CRUD operation, some metadata which identifies the current cluster.
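The scheme above can be sketched as stamping every write with the originating cluster's id and dropping change feed events that carry the local id. The `_clusterId` field and the cluster names are hypothetical:

```python
LOCAL_CLUSTER_ID = "us-east"  # assumed per-cluster configuration value

def stamp_cluster_id(doc: dict, cluster_id: str = LOCAL_CLUSTER_ID) -> dict:
    """Add originating-cluster metadata on every CRUD operation."""
    stamped = dict(doc)
    stamped["_clusterId"] = cluster_id
    return stamped

def should_propagate(doc: dict, local_id: str = LOCAL_CLUSTER_ID) -> bool:
    """Forward only change feed events from other clusters; local changes
    already reached the local cacheInvalidation topic via Kafka."""
    return doc.get("_clusterId") != local_id

# usage sketch
local_doc = stamp_cluster_id({"_id": "guest/hello"})
remote_doc = {"_id": "guest/hello", "_clusterId": "eu-west"}
```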
Code attachments and S3 (#4392)
S3 supports replication of a bucket to only one other region. For multi-region setups we can use a CDN like CloudFront as a local cache, which reduces the latency of reading code binaries in other regions.