IBMStreams / administration

Umbrella project for the IBMStreams organization. This project will be used for the management of the individual projects within the IBMStreams organization.

Request to create a new streamsx.crossdc-failover repository #138

Closed by nysenthil 5 years ago

nysenthil commented 5 years ago

Our large enterprise customers run their IBM Streams applications across multiple, geographically separated data centers for business-critical reasons such as load balancing, redundancy, resiliency, high availability and operational continuity. These customers invariably ask for a way to protect their Streams applications from data center outages, both planned and unplanned. During such an outage, they want the Streams applications to fail over safely and gracefully to the data center that is still active. Several very high-profile Streams customers have already asked for this (a major financial transaction company, a major retail bank, a major airline, a major healthcare company, etc.). Such requests will keep increasing as IBM Streams makes inroads into other big companies that are currently evaluating the product.

We need a generic and robust toolkit on which customer applications can piggyback to achieve cross data center failover. In collaboration with a large banking customer over the past three months, I have implemented an SPL/Java/C++ based toolkit to aid in crossDC failover. It contains 100% home-grown code. We are currently testing it at the bank, and it will be deployed in production soon. It provides simple hooks via composite operators and stream connections so that any application can seamlessly achieve the following:

1) Periodically get notified about the UP or DOWN status of the application running in the remote DC.
2) Periodically replicate the in-memory state of any customer-written operator in an application to the remote DC.
3) When the application running in the remote DC becomes inactive, take over its operation by owning its in-memory state that was replicated regularly at the local DC.
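To make the three activities a bit more concrete, here is a minimal Python sketch of how they could fit together. It is not the toolkit's code (the toolkit is SPL/Java/C++ and its design is not described in this issue). Every name in it (`FailoverNode`, `on_heartbeat`, `on_replica`, `check_failover`, the 5 second timeout, the `orders_seen` counter) is a hypothetical choice made for illustration, and timestamps are passed in explicitly so that the example runs deterministically.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class FailoverNode:
    """One DC's view of its remote peer. All names here are hypothetical."""
    name: str
    heartbeat_timeout: float = 5.0          # assumed: seconds of silence before the peer counts as DOWN
    last_heartbeat: Optional[float] = None  # time the most recent peer heartbeat arrived
    local_state: Dict[str, int] = field(default_factory=dict)    # this DC's own operator state
    replica_state: Dict[str, int] = field(default_factory=dict)  # latest snapshot received from the peer
    adopted_state: Dict[str, int] = field(default_factory=dict)  # peer state owned after a takeover
    owns_peer_work: bool = False

    def on_heartbeat(self, now: float) -> None:
        # Activity 1: record that the remote application reported itself UP.
        self.last_heartbeat = now

    def peer_is_up(self, now: float) -> bool:
        # The peer is UP if a heartbeat arrived within the timeout window.
        return (self.last_heartbeat is not None
                and now - self.last_heartbeat <= self.heartbeat_timeout)

    def on_replica(self, snapshot: Dict[str, int]) -> None:
        # Activity 2: keep a copy of the peer's replicated in-memory state.
        self.replica_state = dict(snapshot)

    def check_failover(self, now: float) -> bool:
        # Activity 3: if a previously seen peer has gone silent, take
        # ownership of its last replicated state. Runs at most once.
        if self.owns_peer_work or self.last_heartbeat is None or self.peer_is_up(now):
            return self.owns_peer_work
        self.adopted_state = dict(self.replica_state)
        self.owns_peer_work = True
        return True


# Normal operation at t=0: heartbeats and replication flow in both directions.
dc1 = FailoverNode("DC1")
dc2 = FailoverNode("DC2")
dc1.local_state["orders_seen"] = 10
dc2.local_state["orders_seen"] = 7
dc1.on_heartbeat(now=0.0)
dc2.on_heartbeat(now=0.0)
dc1.on_replica(dc2.local_state)
dc2.on_replica(dc1.local_state)

# At t=3 nothing has timed out yet, so DC2 does not take over.
dc2_took_over_early = dc2.check_failover(now=3.0)

# DC2 then goes silent; at t=10 DC1 notices and adopts DC2's replicated state.
dc1_took_over = dc1.check_failover(now=10.0)
```

Each node runs the same logic, which is one simple way to get the symmetric, bidirectional behavior. A real deployment would also have to deal with issues this sketch ignores, such as both DCs believing the other is down when only the link between them has failed.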

It is important to note that, under normal working conditions, the three activities above happen bidirectionally: in the local DC as well as in the remote DC. The toolkit will also provide an easy-to-understand example that showcases a close-to-real-life scenario, with clear directions for demonstrating the local DC/remote DC setup with data replication, the abrupt failure of either DC and operational continuity at the surviving DC. The toolkit will serve both as an educational asset and as a reasonable template for any Streams practitioner who has to implement a crossDC application solution for interested customers.

So, I hereby make a request for a new streamsx.crossdc-failover repository to be created.

Thank You.

schubon commented 5 years ago

+1

schubon commented 5 years ago

Done: https://github.com/IBMStreams/streamsx.crossdc-failover

mikespicer commented 5 years ago

While I don't disagree with the value of this toolkit and I support the creation of a repository, we should allow more time for all to comment. A repo should not be created before there has been time for objections and concerns to be raised and discussed by more interested parties.

This kind of failover functionality has to deal with many complexities, and I would be interested in seeing more detail on the approach taken and the design. This could be done as a design document in the repo. Having these details will also be important for users of the toolkit, so they can understand how it works and evaluate its suitability for their situations.

+1 with the expectation that a design will be posted and discussed in the repo.

nysenthil commented 5 years ago

Hi Mike, thank you. I can certainly do that. With the amount of customer (technical) work lined up for the next several weeks, I will be severely time constrained for design discussions here, so there will be delays on my side.

ddebrunner commented 5 years ago

Repo has been created.