Open kovasb opened 8 years ago
We'll take a look at this. Thanks for the idea! :)
Cool. It seems to have huge momentum and many different kinds of use cases.
For instance, you can put the RocksDB data on there and have it transparently flush cold data to different storage tiers, or move it around from machine to machine.
My use case is more for batch data processing, where workers can write data to Tachyon and let it deal with moving data from machine to machine (and persisting it to permanent storage), with Onyx orchestrating work unit distribution.
It's going to be a while before we can look at it for that kind of use case (Feb-March), as we have other things that are higher priority at the moment. I'll keep it in the back of my mind going forward though. It would be nice to get a plugin to read from Tachyon as a generic input/output stream. That could happen sooner.
I'd read about Tachyon a while back and had it on my list of things to check again later. I'm definitely interested, though, as Michael says, it may take some time.
Just adding some ammunition here - I built a very fast and very successful distributed computational pipeline that heavily used Tachyon. I think getting the most out of Tachyon might involve some rearchitecting of Onyx.
Thanks @ohpauleez. We're unlikely to make a major architectural pivot as the streaming engine is performing well (and is a large investment), so we appreciate the data point.
We could probably do something similar to Flink and provide a Tachyon input and output plugin, and/or useful lifecycle calls that would allow peers to load data from Tachyon as part of the usual task lifecycle.
With our upcoming scheduler, we could probably get even greater improvements by scheduling tasks near where the data they require is stored in Tachyon, which would give us some nice data locality properties.
This is still not a priority for us, and we haven't seen the demand yet, but if anyone is interested enough, I'd be happy to devote time to answering questions and helping where I can.
I would be pretty interested in some built-in support for this. http://tachyon-project.org/