Closed miguno closed 10 years ago
Hi Michael,
Thanks a lot for your email. I have been following your blog posts on Storm and Kafka very regularly, and they have been excellent learning material for me. It is great to see your email and your help in guiding me on this license issue.
Yes, you are correct, the original code is similar to the Kafka Storm Spout. The original Storm Spout code has two parts. The first is the fault-tolerant Kafka connectors (the ZKCoordinator, DynamicBrokersReader, DynamicPartitionConnection etc.), whose logic is almost the same in my Spark connector (with a few minor modifications to detect Spark Driver/Executor failures, plus replay logic). But a good amount of modification was done in the Storm part of the code (in particular, there is no KafkaSpout), where I used KafkaConsumer, and the PartitionManager logic has changed to fit Spark, etc. There is no Storm-specific ack-related code, no metrics are kept, etc.
Can you please guide me on how to proceed in this case? I am new to this process, sorry about this.
I can see the following points for redistribution...
a. You must give any other recipients of the Work or Derivative Works a copy of this License; and
If I include the Apache V2 license, will this point be covered?
b. You must cause any modified files to carry prominent notices stating that You changed the files; and
Shall I mention in the .java / README files that this is modified from the Storm Kafka Spout? Will that help?
C. You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and
What do I need to do here?
D. If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License.
This work does not include any NOTICE text. So do I need to do anything here?
Regards, Dibyendu
On Friday, 26 September 2014 2:08 PM, Michael G. Noll notifications@github.com wrote:
Dibyendu,
first, thanks for your work on providing an improved Kafka consumer for Spark Streaming. Much appreciated!
I have been playing around with Kafka and Spark Streaming myself, and stumbled upon your project in the spark-user thread where you announced it last month. Since there are apparently still a couple of issues (including Spark issues) to be ironed out, I began reading your source code for further details on the current status of Kafka support in Spark Streaming -- actually because I thought that "Hey, the Apache Storm project has a reasonable Kafka spout/connector, maybe that code would help the Spark project to improve their own variant."
While reading your source code I noticed that apparently most of the code is a verbatim copy of the Kafka spout of the Apache Storm project, which was originally created by wurstmeister. In both cases the code is licensed under the Apache License v2.0, which means you can't just copy the code -- there are some rules you must follow. (And both Apache Spark and Apache Storm, as ASF projects, are using the very same license, which also means it's easy to share code amongst the projects.) Notably, "you must give any other recipients of derivative work a copy of that license, you must cause any modified files to carry prominent notices stating that you changed the files, and you must retain, in the source form of any derivative works that you distribute, all copyright, patent, trademark, and attribution notices from the source form of the work, excluding those notices that do not pertain to any part of the derivative works". See Apache License v2.0 for details of what you would have to do/change/add/etc. to be license compliant.
I am sure you have done this in good faith, and I am making you aware of this issue primarily to help you.
Best wishes, Michael
Hi Mike, I have added the LICENSE file and also included comments in every Java file stating that the code has been taken from the Storm Kafka Spout and modified for Spark Streaming. Let me know if this is fine now. Do I need to add anything in LICENSE file for copyright section ?
Dibyendu
(Disclaimer: I'm not a licensing expert either.)
I think you can shorten the following sentence in the newly added licensing headers.
Kafka Spark Consumer code is taken from Kafka spout of
the Apache Storm project (https://github.com/apache/storm/tree/master/external/storm-kafka)
which was originally created by wurstmeister (https://github.com/wurstmeister/storm-kafka-0.8-plus).
This file has been modified to work with Spark Streaming.
to just
This file is based on the source code of the Kafka spout of the Apache Storm project.
(https://github.com/apache/storm/tree/master/external/storm-kafka)
This file has been modified to work with Spark Streaming.
Again, I'm not an expert either. :-)
Best, Michael
PS: I noticed that the commit in which you added the license headers also includes functional changes, see this example.
- ssc.checkpoint(checkpointDirectory);
+ // ssc.checkpoint(checkpointDirectory);
Was this intentional? There were a couple of such functional changes, which seem to have been conflated with the licensing changes.
Do I need to add anything in LICENSE file for copyright section ?
This is up to you.
You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License.
Thanks again. I added a NOTICE file and changed the newly added license header to what you suggested. The functional changes that went in along with the license changes are okay. I have not added any copyright section to the LICENSE file.
Hi Michael,
Let me know whether I can close this licensing issue if everything looks okay.
Regards, Dibyendu
I think you can close it, Dibyendu.
--Michael
Thanks Michael.