OryxProject / oryx

Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning
http://oryx.io
Apache License 2.0

SpeedLayer read kafka offset #349

Closed: KnifeFly closed this issue 6 years ago

KnifeFly commented 6 years ago

Hi srowen, I am new to this project. I checked out the code from master and found that SpeedLayer.java writes the Kafka offsets to the path /consumers in ZooKeeper:

kafkaDStream.foreachRDD(new UpdateOffsetsFn<>(getGroupID(), getInputTopicLockMaster()));

The master branch uses Kafka's auto commit feature to save the offsets to Kafka. So is there no need to use UpdateOffsetsFn to write the offsets to ZooKeeper at the end of the Spark task?
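To make the two mechanisms in the question concrete, here is a minimal sketch rather than Oryx's actual code. It assumes the node layout used by Kafka's older ZooKeeper-based consumer, /consumers/<group>/offsets/<topic>/<partition>, and the standard consumer setting enable.auto.commit. The class name, helper methods, group ID, and topic name are all hypothetical and chosen only for illustration.

```java
import java.util.Properties;

public class OffsetPathSketch {

    // ZooKeeper node where Kafka's older ZooKeeper-based consumer keeps the
    // committed offset for a single partition of a topic.
    static String offsetPath(String groupId, String topic, int partition) {
        return "/consumers/" + groupId + "/offsets/" + topic + "/" + partition;
    }

    // Consumer settings an application might use if it wants to record offsets
    // itself instead of relying on Kafka's periodic auto commit.
    // Assumption: whether Oryx sets this flag is not stated in this thread.
    static Properties manualCommitConfig(String groupId) {
        Properties props = new Properties();
        props.setProperty("group.id", groupId);
        props.setProperty("enable.auto.commit", "false");
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical group and topic names.
        String group = "ExampleGroup";
        String topic = "ExampleInput";
        System.out.println(offsetPath(group, topic, 0));
        System.out.println(manualCommitConfig(group).getProperty("enable.auto.commit"));
    }
}
```

If auto commit is enabled, the consumer commits offsets on a timer regardless of whether the Spark task has finished processing the batch; writing offsets explicitly after the batch is the usual alternative when that timing matters.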

srowen commented 6 years ago

This should be on the mailing list: https://groups.google.com/a/cloudera.org/forum/#!forum/oryx-user. Which auto commit are you talking about?