IBMStreams / administration

Umbrella project for the IBMStreams organization. This project will be used for the management of the individual projects within the IBMStreams organization.

Proposal: Separate Kafka operators from Messaging toolkit #111

Closed cancilla closed 7 years ago

cancilla commented 7 years ago

Proposal

I would like to propose that we split the Kafka operators from the Messaging toolkit and move them into their own toolkit in a new repository. As the Bluemix Streaming Analytics service continues to grow in popularity, the need to exchange data between the Streaming Analytics service and other services/systems is becoming more important. At the moment, the best way to do this is via the MessageHub service, which is backed by Kafka. The current approach of maintaining the Kafka operators in the Messaging toolkit is problematic for the following reasons:

Naming

I would like to propose the following names:

Initial Contribution

The initial toolkit will be created by copying the Kafka operators from the Messaging toolkit with minimal code changes. This will allow existing applications to quickly migrate to the Kafka toolkit without having to perform any major rewrites. This will also serve as a baseline for future development work on the operators.

Migration Process

I propose the following process for migrating the Kafka operators to a new toolkit:

  1. Copy current Kafka operators into a new toolkit
  2. Move Kafka-related issues from the Messaging repo to the Kafka repo
  3. Create a release of the Kafka toolkit (baseline release)
  4. Upon releasing the first version of the Kafka toolkit:
     a. Mark the operators in the Messaging toolkit as deprecated and point users to the Kafka toolkit.
     b. Update Messaging documentation to reflect that the Kafka operators in the Messaging toolkit are deprecated and that the Kafka toolkit should be used instead.

mikespicer commented 7 years ago

+1

chanskw commented 7 years ago

+1. I suggest that the toolkit and its namespace be named com.ibm.streamsx.messaging.kafka to ease the migration effort for existing customers.

cancilla commented 7 years ago

If I set the namespace for the operators to com.ibm.streamsx.messaging.kafka, will there be a conflict if an application is also using the Messaging toolkit, since that toolkit will have Kafka operators in the same namespace?
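
To make the concern concrete, here is a minimal sketch of the collision being asked about. It is a toy model, not the SPL compiler or any Streams API: the `NamespaceConflict` class and the toolkit name `com.ibm.streamsx.kafka` are hypothetical, and how the real compiler resolves a duplicate qualified name is not stated in this thread. The sketch only shows that two toolkits on the same toolkit path would offer the same fully qualified operator names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: this is NOT the SPL compiler's lookup logic.
final class NamespaceConflict {

    // Given toolkit name -> qualified operator names it provides, return the
    // qualified names that are provided by more than one toolkit.
    static Map<String, List<String>> conflicts(Map<String, List<String>> toolkits) {
        Map<String, List<String>> providers = new TreeMap<>();
        for (Map.Entry<String, List<String>> toolkit : toolkits.entrySet()) {
            for (String operator : toolkit.getValue()) {
                providers.computeIfAbsent(operator, k -> new ArrayList<>())
                         .add(toolkit.getKey());
            }
        }
        // Keep only names with two or more providers.
        providers.values().removeIf(list -> list.size() < 2);
        return providers;
    }

    public static void main(String[] args) {
        Map<String, List<String>> toolkitPath = new LinkedHashMap<>();
        // Existing Messaging toolkit with its Kafka operators.
        toolkitPath.put("com.ibm.streamsx.messaging", List.of(
            "com.ibm.streamsx.messaging.kafka::KafkaConsumer",
            "com.ibm.streamsx.messaging.kafka::KafkaProducer"));
        // Hypothetical new toolkit that reuses the same namespace.
        toolkitPath.put("com.ibm.streamsx.kafka", List.of(
            "com.ibm.streamsx.messaging.kafka::KafkaConsumer",
            "com.ibm.streamsx.messaging.kafka::KafkaProducer"));

        conflicts(toolkitPath).forEach((operator, owners) ->
            System.out.println(operator + " is provided by " + owners));
    }
}
```

If the new toolkit used a different namespace, the same check would return an empty map, at the cost of applications having to update their operator references.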

chanskw commented 7 years ago

I am wondering if removing Kafka operators from the messaging toolkit is a better option, rather than leaving them in the toolkit and marking them as deprecated.

By removing them, customers will get a compile error saying that the operators no longer exist. They will then be forced to migrate to depend on the new toolkit. Because the namespaces don't change, it's a matter of updating the application dependency and build script.

If we leave them behind, would it be more confusing, since we would then have two sets of operators and customers would not get compile errors for using the old ones?

cancilla commented 7 years ago

I am not against removing them from the toolkit. However, I am worried about users who are using the messaging toolkit that is packaged with the product. Regardless of what we do with the toolkit on GitHub, the product will still have Kafka operators in the same namespace, which may introduce conflicts.

chanskw commented 7 years ago

Another way is to update the existing Kafka operators to produce a compile warning when they are used and point to the new operators. Would that make it easier to understand?

cancilla commented 7 years ago

I should have been more explicit when I said "mark as deprecated". What I meant by this was that we would throw a warning at compile-time indicating that the operators have been deprecated. This does 2 things:

  1. Clearly indicates that the operators are deprecated and should not be used. We can even reference the new toolkit in this warning.
  2. Allows applications that are not ready/able to introduce a new toolkit to continue to function. I think this is important because adding a new toolkit to an existing application stack may require updates to build scripts, approvals, additional testing, etc. By allowing the Messaging toolkit to continue to function, it gives developers time to plan their migration to the new toolkit.

I agree that having 2 versions of the same operators can be confusing, but I think we are stuck between a rock and a hard place. My suggestion is to start with the least destructive approach. If we see a lot of cases where users are getting confused, we can reevaluate then.
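
As a rough illustration of the "warn but keep working" behavior described above, the sketch below models a compile-time check that reports deprecated operators without failing the build. It is a toy model under stated assumptions: `DeprecationCheck`, the message wording, and the `my.app::Enrich` operator are hypothetical, and nothing here uses the actual Streams Java Operator API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy model only: this does not use the Streams compiler or operator API.
final class DeprecationCheck {

    // Deprecated operator -> where users should go instead (wording is illustrative).
    static final Map<String, String> DEPRECATED = Map.of(
        "com.ibm.streamsx.messaging.kafka::KafkaConsumer", "Kafka toolkit",
        "com.ibm.streamsx.messaging.kafka::KafkaProducer", "Kafka toolkit");

    // Returns one warning per deprecated operator invocation. It never throws,
    // so an application that is not ready to migrate still compiles.
    static List<String> warningsFor(List<String> operatorsUsed) {
        List<String> warnings = new ArrayList<>();
        for (String operator : operatorsUsed) {
            String replacement = DEPRECATED.get(operator);
            if (replacement != null) {
                warnings.add("WARNING: " + operator
                    + " is deprecated; use the " + replacement + " instead.");
            }
        }
        return warnings;
    }

    public static void main(String[] args) {
        // "my.app::Enrich" is a hypothetical application operator.
        List<String> used = List.of(
            "com.ibm.streamsx.messaging.kafka::KafkaConsumer",
            "my.app::Enrich");
        warningsFor(used).forEach(System.out::println);
    }
}
```

Returning warnings instead of raising an error is the design choice under discussion: removal would correspond to failing on the first deprecated operator found.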

chanskw commented 7 years ago

Thanks for the explanation. I agree!

schubon commented 7 years ago

There was already an issue for this under streamsx.messaging: https://github.com/IBMStreams/streamsx.messaging/issues/246.

To drive discussion for the product, I had to create an RTC work item for this and added the participants above as subscribers. It is not as well articulated as the discussion and proposal in this issue, though. Please check for your approval requests and comment in the RTC work item, as we'd like to have this split supported in the next product release, too.

Thanks.

ddebrunner commented 7 years ago

  1. Copy current Kafka operators into a new toolkit
  2. Create a release of the Kafka toolkit (baseline release)

I'm not sure we would want to rush into a release. Since these would be "new" operators, we have a chance to "fix" any issues with the operator parameters, etc. Are there any outstanding changes folks know of?

chanskw commented 7 years ago

@ddebrunner I think you have a great point here.

chanskw commented 7 years ago

streamsx.kafka created. Please continue discussion there or in the messaging toolkit.