Open creese opened 5 years ago
:exclamation: No coverage uploaded for pull request base (`master@cee3aba`). The diff coverage is `15.62%`.
@@ Coverage Diff @@
## master #200 +/- ##
=========================================
Coverage ? 78.18%
=========================================
Files ? 43
Lines ? 2530
Branches ? 151
=========================================
Hits ? 1978
Misses ? 401
Partials ? 151
Impacted Files | Coverage Δ |
---|---|
src/jackdaw/streams/xform.clj | 11.53% <11.53%> (ø) |
src/jackdaw/streams/xform/fakes.clj | 33.33% <33.33%> (ø) |
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cee3aba...9efe7b7.
Looking good! I would add examples of unit tests of the actual word count transducer. Also what happens to the simple ledger tests?
I spoke to @blak3mill3r (the author of Noah) yesterday about how he's implemented stateful transducers in Noah. He came up with a broadly similar solution to what we have here; it seems like the use of `volatile!` within the transducer code is a real sticking point. The main difference between our solutions is that in Noah, all the transducers use a single state store rather than each transducer having its own. We weren't sure what the performance implications of that would be, but it's worth bearing in mind in case we run into perf issues in future.
We also discussed starting a shared library in which all the core transducers are rewritten to support persisting their state, so that they can be used with both Jackdaw and Noah. Blake's going to set this up, and then I thought we could potentially pull it into Jackdaw. There are some open questions around this though, such as what we do about other popular transducer libraries like xforms.
Here is that shared library, which reimplements (the transducer arity of) all of the functions in `clojure.core` that return a stateful transducer:
https://github.com/blak3mill3r/coddled-super-centaurs
That function is then bound twice by `noah` to instrument the transducer state and tie it into a `StateStore`:
https://github.com/blak3mill3r/noah/blob/5803dd5/src/noah/transduce.clj#L34-L35
https://github.com/blak3mill3r/noah/blob/5803dd5/src/noah/transduce.clj#L85-L86
Also, @DaveWM ... I checked, and as far as I can tell, there aren't any stateful transducers in xforms or kixi.stats. They have interesting higher-order transducers and reducing fns, and (I think...) these should work fine composed with these instrumented stateful transducers.
Also I want to clarify regarding: "all the transducers use a single state store rather than each transducer having its own"
Each time you transduce a KStream, if that transduction needs state, you must provide a store. That transducer can of course be a composition of several transducers, any of which can be stateful, and all of the states for these composed transducers will be stored together in a Clojure vector as the record in the state store. To transduce multiple KStreams, you would use multiple state stores.
Plz merge this already 😛 !!
Bump!
This has been a long time coming but I think we’re finally here. This proposal is composable with the existing Jackdaw Streams DSL. Just define your transducers and use `transduce-kstream`.
It turns out that `KStream::transform` followed by `KStream::flatMap` is equivalent to `transduce` with `concat`. We can use the latter to test our business logic with pure Clojure (no Kafka Streams). This approach was pioneered by Matthias a while ago. The difference is that now we're adding state.
Here is how to test your transducers:
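
As a minimal sketch of such a test (a simplified stand-in, not the code from this PR): the `xf-running-balances` below is a hypothetical version that assumes each input is an `[account amount]` pair and that the store is a plain map of account to balance. It only illustrates the shape of a stateful transducer that is parameterised by a store and a `swap!`-like function, and how `transduce` with `concat` exercises it without Kafka Streams.

```clojure
;; Hypothetical, simplified stand-in for the PR's transducer.
;; Assumption: inputs are [account amount] pairs; the store holds a
;; map of account -> balance.
(defn xf-running-balances
  "Returns a transducer that emits [account balance] after each entry.
  `state` is any store; `swap-fn` must behave like clojure.core/swap!."
  [state swap-fn]
  (fn [rf]
    (fn
      ([] (rf))
      ([result] (rf result))
      ([result [account amount]]
       (let [balances (swap-fn state update account (fnil + 0) amount)]
         ;; Each step emits a collection of key-value pairs; using
         ;; `concat` as the reducing function flattens them, which
         ;; mirrors the flatMap stage.
         (rf result [[account (get balances account)]]))))))

(def entries [["alice" 10] ["bob" 5] ["alice" -3]])

;; In a unit test the store is just an atom and the swap function is swap!.
(def balances
  (transduce (xf-running-balances (atom {}) swap!) concat [] entries))
;; balances => (["alice" 10] ["bob" 5] ["alice" 7])
```

Because the state and the swap function are arguments, the same transducer body can be reused unchanged when a Kafka Streams state store is supplied instead of the atom.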
The function `xf-running-balances` takes two arguments, a "store" and a function that "behaves like `clojure.core/swap!`", and returns a transducer. When developing your transducers, you can use an atom and `swap!`. When using your transducers from Kafka Streams, no changes are needed; you just supply different arguments. The examples show how to provide a state store and a helper function defined in `jackdaw.streams.xform`. However, if this doesn't work for you, you can write your own.
Here is the topology:
This PR contains examples for Word Count and the Simple Ledger.
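
For illustration only, a word count transducer in the same style might be unit-tested as sketched below. The name `xf-word-count`, the `[key line]` input shape, and the tokenisation on `\W+` are all assumptions made for this sketch and are not taken from the examples in the PR.

```clojure
(require '[clojure.string :as str])

;; Hypothetical sketch, not the PR's example code.
;; Assumption: inputs are [key line] pairs; the store holds a map of
;; word -> count.
(defn xf-word-count
  "Returns a transducer that emits [word count] for every word seen.
  `state` is any store; `swap-fn` must behave like clojure.core/swap!."
  [state swap-fn]
  (fn [rf]
    (fn
      ([] (rf))
      ([result] (rf result))
      ([result [_ line]]
       (let [words   (remove str/blank?
                             (str/split (str/lower-case line) #"\W+"))
             ;; mapv is eager, so the store is updated in word order.
             emitted (mapv (fn [word]
                             (let [counts (swap-fn state update word (fnil inc 0))]
                               [word (get counts word)]))
                           words)]
         (rf result emitted))))))

(def word-counts
  (transduce (xf-word-count (atom {}) swap!) concat []
             [[nil "the quick fox"] [nil "The fox"]]))
;; word-counts => (["the" 1] ["quick" 1] ["fox" 1] ["the" 2] ["fox" 2])
```

Each input line emits one updated count per word, so the output is a changelog of counts rather than a final table; the atom holds the latest totals.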