mulesoft-catalyst / canary-policy-mule-4

The Unlicense

Canary Policy for Mule 4

A custom policy to perform canary releases by intercepting incoming calls and deciding which implementation URL each call is routed to. By applying this policy to your API or proxy, you will be able to:

Why?

A canary release helps organizations reduce the risk of introducing a new version of a piece of software by incrementally shifting traffic to the new version, which improves observability and limits the impact of the new components on the existing service.

How?

For a canary release to exist, the following elements should be present:

Deployment Architecture

This policy imposes no limitations on the deployment architecture, except that two artifacts (applications), one for each version, must be deployed in Anypoint Runtime Manager. However, the following topology is recommended because it offers more flexibility for deprecation and retirement and also improves observability:

In the recommended topology, a proxy is deployed on top of both versions (original and canary) to centralize communication. It provides an abstraction layer and makes networking and traffic management easier to understand. Please see "Limitations".

If you want to skip the extra layer added by the proxy, you can always apply the policy on top of the original (baseline) application:

This option requires setting the additional flag "appliedOnApi". Please see "Usage". Note that when this option is enabled, the collection of metrics does not work.

However, this approach may become a management nightmare in which deprecating and retiring APIs is an almost impossible task. See the "Deprecation and retirement" section.

Deprecation and retirement

Ask yourself: how do I want to discontinue the original version of my API once the canary version has been tested and is ready to be used as the current version? Here is a series of strategies to that end (when routing by weight is enabled):

Usage

After publishing to Exchange, follow these steps to apply the policy to an existing managed API (or proxy):

| Parameter | Purpose |
| --- | --- |
| Canary Routing Type | Choose one of the available routing types for the canary. Routing "By header" routes every request that contains the configured header to the specified canary. If the request does not contain the header, it is routed to the original (base) API |
| Canary Header | Header used to route the traffic to the canary. The absence of this header routes the traffic to the base (original) API. The value of the header is not considered |
| Host (Original) | Host of the original version. This should be the same as the one set on the implementation URL if using a proxy |
| Port (Original) | Port of the original version |
| Protocol (Original) | Protocol of the original version |
| Path (Original) | Path of the original version |
| Weight (Original) | (Only applicable when routing type is "By Weight") Weight of the original version. Represents a percentage that is calculated over a sample of 10 requests. For example, 50 indicates that 5 requests out of 10 will be routed to this endpoint |
| Host (Canary) | Host of the canary version |
| Port (Canary) | Port of the canary version |
| Protocol (Canary) | Protocol of the canary version |
| Path (Canary) | Path of the canary version |
| Weight (Canary) | (Only applicable when routing type is "By Weight") Weight of the canary version. Represents a percentage that is calculated over a sample of 10 requests. For example, 50 indicates that 5 requests out of 10 will be routed to this endpoint |
| appliedOnApi | Select this option only if the policy is applied on the base API (original) instead of on a proxy. This forces the directive. See https://docs.mulesoft.com/api-manager/2.x/custom-policy-4-reference#basic-xml-structure for further details. When this option is checked, metrics gathering is disabled |
| Override Object Store Settings? | Select this option to override the default Object Store. The default is false |
| Is the Object Store persistent? | If checked, uses a persistent Object Store |
| Object Store's entry TTL | The entry timeout. Default value is 1 (hour) |
| Object Store's entry TTL unit | The time unit. Default value is "HOURS" |
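
The two routing types described in the table can be illustrated with a minimal Python sketch. This is not the policy's actual implementation: the `Target` class, the function names, the example hosts, the requirement that both weights add up to 100, and the order in which the 10-request window is filled are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Target:
    """Hypothetical holder for the Protocol/Host/Port/Path/Weight parameters."""
    protocol: str
    host: str
    port: int
    path: str
    weight: int = 0  # percentage, only relevant for "By Weight" routing

    def url(self) -> str:
        # Build the implementation URL from its four parts.
        return f"{self.protocol.lower()}://{self.host}:{self.port}{self.path}"


def route_by_header(headers: Mapping[str, str], canary_header: str,
                    original: Target, canary: Target) -> Target:
    """Pick the canary when the header is present; its value is ignored."""
    # HTTP header names are case-insensitive, so compare in lower case.
    present = any(name.lower() == canary_header.lower() for name in headers)
    return canary if present else original


def route_by_weight(request_index: int, original: Target, canary: Target) -> Target:
    """Pick a target for the n-th request using a repeating window of 10.

    Assumed scheme: the first weight/10 slots of each window go to the
    original and the remaining slots go to the canary.
    """
    if original.weight + canary.weight != 100:
        raise ValueError("this sketch assumes the weights add up to 100")
    slot = request_index % 10
    return original if slot < original.weight // 10 else canary


if __name__ == "__main__":
    original = Target("HTTP", "orig.example.com", 8081, "/api", weight=70)
    canary = Target("HTTP", "canary.example.com", 8081, "/api", weight=30)
    for i in range(10):
        print(i, route_by_weight(i, original, canary).url())
```

With weights of 70 and 30, this sketch sends 7 of every 10 requests to the original and 3 to the canary, matching the "sample of 10 requests" rule in the Weight rows.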

Development

The following commands are required during the development phase:

| Task | Command |
| --- | --- |
| Package policy | mvn clean install |
| Publish to Exchange (make sure to update the pom.xml file with your org ID) | mvn deploy |

Debugging

The following package can be added to the log4j2.xml configuration: com.mule.policies.canary. In DEBUG mode, it prints the following checkpoints:

Metrics

The policy can collect usage metrics for later use in a canary analysis process. This option is only available if the policy is applied on top of a proxy.

To enable the Metrics

| Parameter (Internal Name) | UI Name | Purpose |
| --- | --- | --- |
| metricsEndpoint | Metrics Endpoint | Endpoint used to expose the collected metrics. Must start with a / |
| metricsOsIndex | Index used to store the metrics | Index used to store the metrics. By default, a unique ID per event is used |
| metricsOsPersistent | Is the Metrics Object Store persistent? | Flag to configure a persistent Object Store |
| metricsOsTtl | Metrics Object Store entry TTL | Time to live for Object Store entries (applies to both in-memory and persistent Object Stores) |
| metricsOsTtlUnit | Metrics Object Store entry TTL unit | Time unit for the above TTL |

To capture the stored Metrics

Send a GET request to the endpoint assigned to the metrics. By default, this is /metrics.
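
As a rough illustration of how the collected metrics might feed a canary analysis, the Python sketch below fetches the events and groups them by their "canary" flag. It relies only on the event fields listed under "Metrics Structure". The function names are hypothetical, and it assumes the endpoint returns a JSON array of events, which this document does not confirm.

```python
import json
from collections import defaultdict
from urllib.request import urlopen


def fetch_events(base_url: str, metrics_path: str = "/metrics") -> list:
    """GET the metrics endpoint. Assumption: the body is a JSON array of events."""
    with urlopen(base_url.rstrip("/") + metrics_path) as response:
        return json.load(response)


def summarize(events) -> dict:
    """Group events by their "canary" flag and compute simple statistics."""
    groups = defaultdict(list)
    for event in events:
        groups[str(event["canary"])].append(event)

    summary = {}
    for flag, items in groups.items():
        # Count server-side failures (HTTP 5xx) as errors.
        errors = sum(1 for e in items if int(e["statusCode"]) >= 500)
        summary[flag] = {
            "requests": len(items),
            "meanResponseTimeMs": sum(e["responseTimeMs"] for e in items) / len(items),
            "errorRate": errors / len(items),
        }
    return summary
```

Comparing the mean response time and error rate of the two groups gives a first indication of whether the canary behaves at least as well as the original version.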

Metrics Structure

The provided metrics follow a java formatted event:

"canary": "0",
"correlationId": "3eb2ee40-5c22-11ec-bf1d-0284a1451db2",
"responseTimeMs": 1191,
"statusCode": 200

where:

IMPORTANT

Limitations

NOTE: The recommended approach to managing canary releases is to use a third-party component, outside Anypoint Platform, that is specifically designed for this kind of need. For instance, nginx provides a module called split clients, which is useful for assigning percentages of traffic to be redirected to defined clients (hosts). The solution provided here is a custom solution, provided by Professional Services, and may not have official product support.

If you want to use this solution anyway, be aware that this approach leads to the following problems (which may or may not apply to your organization):

Contribution

Want to contribute? Great!