maki-nage / makinage

Stream Processing Made Easy
https://www.makinage.org
MIT License

ML - online and offline modes #11

Open j7zAhU opened 1 year ago

j7zAhU commented 1 year ago

Hello,

I have been looking into MN to see whether it is appropriate for my use case.

I have microsecond log data that will be used as input to an ML classifier. I would like to use the same code when batch processing historical data as I do when the classifier is running live. The event stream system in use is proprietary.

Is MN suitable? Many thanks :)

MainRo commented 1 year ago

Yes, one of the main goals of Maki-Nage is to share as much code as possible between stream and batch processing (and this is how we use it). You may have seen that, for now, the Maki-Nage package focuses mainly on the streaming use case, and more precisely on Kafka. However, the connector API can be used to plug in virtually any data source.
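
As a rough illustration of the idea of sharing code between the two modes, here is a minimal sketch in plain Python. It does not use the Maki-Nage connector API or rxsci; every name in it (`LogEvent`, `classify`, `batch_source`, `live_source`, the `poll` callback, the threshold rule) is hypothetical and only stands in for a real classifier and a real proprietary feed.

```python
from typing import Callable, Iterable, Iterator, List, NamedTuple, Optional, Tuple


class LogEvent(NamedTuple):
    ts_us: int    # event timestamp in microseconds (hypothetical field)
    value: float  # feature extracted from the log record (hypothetical field)


def classify(events: Iterable[LogEvent], threshold: float = 0.5) -> Iterator[Tuple[int, bool]]:
    """Shared processing logic: consumes events lazily, one at a time."""
    for event in events:
        # Stand-in for a real ML classifier call (assumption).
        yield event.ts_us, event.value > threshold


def batch_source(rows: List[Tuple[int, float]]) -> Iterator[LogEvent]:
    """Offline mode: replay historical records."""
    for ts_us, value in rows:
        yield LogEvent(ts_us, value)


def live_source(poll: Callable[[], Optional[LogEvent]]) -> Iterator[LogEvent]:
    """Online mode: wrap an event feed; stops when poll() returns None."""
    while True:
        event = poll()
        if event is None:
            return
        yield event


# Offline: the pipeline runs over historical data.
history = [(1_000_001, 0.2), (1_000_002, 0.9)]
offline = list(classify(batch_source(history)))

# Online: the same pipeline runs over a (simulated) live feed.
feed = iter([LogEvent(2_000_001, 0.7), LogEvent(2_000_002, 0.1)])
online = list(classify(live_source(lambda: next(feed, None))))
```

Only the source differs between the two runs; `classify` never needs to know whether its input is a replayed file or a live feed.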

That being said, Maki-Nage is still at an early stage, and you should be aware of this before using it in a production use case:

However, I obviously encourage you to give it a try and see whether it fits your needs. We are interested in any feedback. We typically use it as Kafka microservices and Kubeflow pipeline components.

Also, if you are ready to use the foundation of Maki-Nage, you can write your own application or library directly with rxsci. The advantage is that, for batch processing, you can parallelize your processing with Ray (see rxray). The aim is to integrate Ray seamlessly into Maki-Nage, but we are still far from it.

If you need an already mature solution, then Apache Beam is certainly one to consider.

j7zAhU commented 1 year ago

Thank you kindly. I will investigate these suggestions.