apache / openwhisk

Apache OpenWhisk is an open source serverless cloud platform
https://openwhisk.apache.org/
Apache License 2.0

Multi Region OpenWhisk deployment using CosmosDB #4277

Open chetanmeh opened 5 years ago

chetanmeh commented 5 years ago

For OpenWhisk setups using CosmosDB it's possible to deploy multiple OpenWhisk clusters in different regions using a multi-region CosmosDB setup. The purpose of this issue is to track the high-level changes required to support such a deployment.

Metadata and CosmosDB

CosmosDB supports multi-region deployments with transparent replication of data across regions. It supports two modes:

  1. Multi Master - In this mode multiple regions can handle writes locally. When used with OpenWhisk this mode can be enabled for the activations collection, as objects in this collection are immutable and hence no conflict resolution needs to be performed
  2. Single Master - In this mode only one region handles writes, while reads are served locally in each region. This mode would be used for the whisks and subjects collections

Separate Account for Collections

As multi-master is needed only for activations, we need to support configuring separate connection details on a per-collection basis.

This is now possible via #4198
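With per-collection configuration in place, activations can point to a separate multi-master account while whisks and subjects stay on a single-master one. A hypothetical HOCON sketch of what such a configuration might look like (the key names here are illustrative, not the actual schema from #4198):

```hocon
whisk {
  cosmosdb {
    # Single-master account used for mutable entities (whisks, subjects)
    endpoint = "https://ow-meta.documents.azure.com:443/"
    key      = "..."
    db       = "openwhisk"

    collections {
      # Activations use a separate multi-master account: records are
      # immutable, so no conflict resolution is needed on concurrent writes
      WhiskActivation {
        endpoint = "https://ow-activations.documents.azure.com:443/"
        key      = "..."
        db       = "openwhisk"
      }
    }
  }
}
```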

Cache Invalidation

An OpenWhisk cluster uses a Kafka topic, cacheInvalidation, to communicate changes to any cached entity. Messages on this topic are of the form

{"instanceId":"controller0","key":{"mainId":"guest/hello"}}

When deploying multiple separate OpenWhisk clusters which do not share the same Kafka instance, we need a way to propagate cache-change events across clusters. For CosmosDB-based setups this can be done using CosmosDB ChangeFeed support, which enables reading the changes made to any specific collection.

Cache Invalidation Service (#4314)

To propagate changes across clusters we can implement a new service which uses the changefeedprocessor library to listen to changes happening in the whisks and subjects collections, and then converts them into Kafka message events sent to the cacheInvalidation topic.

Each cluster would then run its own cache invalidation service, ensuring that changes made by other clusters are picked up and routed to the local cluster's cacheInvalidation topic.
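The translation step of this service can be sketched as follows. This is purely illustrative, not the implementation from #4314: the service id and document fields are assumptions, with the document `id` taken to encode the entity's `namespace/name` as in the message example above.

```python
import json

# Hypothetical sketch: convert a CosmosDB change-feed document from the
# whisks/subjects collection into a cacheInvalidation Kafka message body.
# The service id "cache-invalidation-service" is an assumed placeholder.

def to_cache_invalidation_message(doc: dict,
                                  service_id: str = "cache-invalidation-service") -> str:
    """Build the message body for a changed document.

    `doc["id"]` is assumed to be the CosmosDB document id, which for
    OpenWhisk entities encodes "namespace/name".
    """
    return json.dumps({
        "instanceId": service_id,
        "key": {"mainId": doc["id"]},
    })

# Example change-feed document for an updated action:
changed = {"id": "guest/hello", "_ts": 1554100000}
print(to_cache_invalidation_message(changed))
```

Controllers already consume this topic, so no change is needed on the consuming side beyond the cluster-identity handling discussed below.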

Handling Deletes (#4339)

Per the change feed support docs there is no direct support for listening to deletes:

The change feed includes inserts and update operations made to documents within the collection. You can capture deletes by setting a "soft-delete" flag within your documents in place of deletes

So we would need to implement such support in our CosmosDBArtifactStore, where we:

  1. Upon delete, set a _deleted flag instead of removing the document
  2. Filter such soft-deleted entries out of query results
  3. Use this flag in the cache invalidation service to detect deletes
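The three steps above can be sketched with a few helpers. This is a hypothetical illustration of the soft-delete behavior, not the actual CosmosDBArtifactStore change from #4339:

```python
# Illustrative soft-delete helpers; the real change lives in the Scala
# CosmosDBArtifactStore. Only the _deleted flag name comes from the
# discussion above.

def soft_delete(doc: dict) -> dict:
    """Step 1: instead of removing the document, mark it with a _deleted
    flag so the deletion still shows up on the change feed as an update."""
    return {**doc, "_deleted": True}

def visible(docs):
    """Step 2: filter soft-deleted entries out of query results."""
    return [d for d in docs if not d.get("_deleted", False)]

def is_delete_event(doc: dict) -> bool:
    """Step 3: used by the cache invalidation service to detect deletes
    among change-feed events."""
    return doc.get("_deleted", False)

docs = [{"id": "guest/hello"}, soft_delete({"id": "guest/old"})]
print(visible(docs))             # only guest/hello remains
print(is_delete_event(docs[1]))  # True
```

One consequence of this design is that soft-deleted documents remain in the collection, so a periodic cleanup (or CosmosDB TTL) would eventually be needed to reclaim storage.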

Identifying Cluster making the change (#4312)

Controllers pick up changes from the cacheInvalidation topic and invalidate the cached entry if the instanceId of the message differs from their own. When propagating changes from other clusters we need to determine whether a change picked up via the CosmosDB change feed originated from the current cluster or from another cluster. For this purpose we would add some metadata to every CRUD operation that identifies the current cluster.
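A minimal sketch of this idea, assuming a hypothetical `_clusterId` field written on every CRUD operation (the actual metadata is defined in #4312):

```python
# Hypothetical sketch: tag every write with the originating cluster, and have
# the cache invalidation service skip change-feed events that originated
# locally (those were already announced on the local Kafka topic).
# "_clusterId" and the region names are illustrative assumptions.

LOCAL_CLUSTER = "us-east"

def with_cluster_metadata(doc: dict, cluster_id: str = LOCAL_CLUSTER) -> dict:
    """Attach the originating cluster id on every CRUD operation."""
    return {**doc, "_clusterId": cluster_id}

def originated_locally(feed_doc: dict) -> bool:
    """Decide whether a change-feed event came from this cluster."""
    return feed_doc.get("_clusterId") == LOCAL_CLUSTER

remote = with_cluster_metadata({"id": "guest/hello"}, "eu-west")
local = with_cluster_metadata({"id": "guest/hi"})
print(originated_locally(remote))  # False -> propagate to local cacheInvalidation topic
print(originated_locally(local))   # True  -> skip, local controllers already saw it
```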

Code attachments and S3 (#4392)

S3 supports replicating a bucket to only one other region. For multi-region setups we can leverage a CDN like CloudFront to act as a local cache, which reduces the latency of reading code binaries in other regions.

markusthoemmes commented 5 years ago

Nice writeup, thanks! Can we potentially normalize between CouchDB and CosmosDB here by listening to the CouchDB changes feed as well for cache invalidation?

chetanmeh commented 5 years ago

I'm not well versed with CouchDB's global replication support. Does it support replication across regions? Or would we read from the primary and just take care of the cache invalidation part using its change feed?