Open pavolloffay opened 5 years ago
As a first step, we should gather people interested in this to drive the right decisions. cc @jaegertracing/jaeger-maintainers
Secondly, we should start working on the integration to make it easy to start writing models. There seem to be two main web-based platforms: Jupyter and Zeppelin. Both have pros and cons:
Jupyter
+ bigger community, older project
+ direct integration with Spark
+ supports multiple kernels, so code can be written in multiple languages
- no direct integration with Flink; a possible workaround is a Java/Scala kernel https://github.com/jupyter/jupyter/wiki/Jupyter-kernels https://groups.google.com/forum/#!topic/jupyter/hcibYIVbmuk
Zeppelin
+ direct integration with Flink
+ better support for Scala/Java
- smaller community, fewer integrations https://flink.apache.org/ecosystem.html
If you are interested, comment on this issue or send me a PM on Gitter and I will add you to https://github.com/orgs/jaegertracing/teams/data-analytics
Our next steps could be to try Jupyter with a Java/Scala kernel and connect it to our DB/Kafka.
After a discussion with Pavol, we have decided to work top-down by first compiling a list of high level objectives that we want to achieve using the AI/ML analysis. We could then gauge interest in the community about the most helpful features, and prioritise accordingly.
Once our targets are clear, not only will it help us define a clear path for development but also encourage contribution from folks with more knowledge on building data analysis models.
cc @jaegertracing/data-analytics
Anybody is welcome to propose/upvote any feature which would help us with this initiative.
The objectives from the initial comments still hold. First, we would like to build a community of people who want to contribute (models, integrations) and validate models. Second, we want to provide AI/ML integration as part of the upstream project. This should ultimately result in new features added to the main Jaeger distributions and the UI.
To be able to start working on the models we should provide an environment to do that. Specifically I am talking about Jupyter notebook integration with Jaeger. Provide a notebook with spark/flink connected to Jaeger data storages.
@yurishkuro also proposed creating a graph query language (similar to Canopy's capabilities) that would allow defining graph-related queries.
https://research.fb.com/publications/canopy-end-to-end-performance-tracing-at-scale
I would like to hear @jaegertracing/data-analytics opinion on which language they would like to use for data analytics with Jaeger. Would it be Java or python?
I would start with Python, it is the de-facto DS/ML language. We can later extend it to Java if necessary.
I would like to hear @jaegertracing/data-analytics opinion on which language they would like to use for data analytics with Jaeger. Would it be Java or python?
I prefer Python because it's easier to get off the ground, also as @yurishkuro mentioned, it provides good libraries for DS/ML purpose.
To be able to start working on the models we should provide an environment to do that. Specifically I am talking about Jupyter notebook integration with Jaeger. Provide a notebook with spark/flink connected to Jaeger data storages.
I would be interested in starting with the Jupyter integration with Jaeger. Maybe once we have this in place, gathering requirements for building models could be easier. What do you suggest @pavolloffay?
Ack for python, so let's start this :).
@Talina06 this is great. I will try to summarize requirements I can think of:
- run the jupyter/jupyterlab as docker container
- have a notebook file with basic connector to the storage - e.g. Elasticsearch or do streaming with Kafka.
- the connector might depend on the framework we choose - we could start with spark or flink (both support python). I would like to also hear what people prefer here.
Ack for python, so let's start this :).
@Talina06 this is great. I will try to summarize requirements I can think of:
- run the jupyter/jupyterlab as docker container
- have a notebook file with basic connector to the storage - e.g. Elasticsearch or do streaming with Kafka.
- the connector might depend on the framework we choose - we could start with spark or flink (both support python). I would like to also hear what people prefer here.
Sounds good. Let me get started with Spark in the meantime and share an update here.
- the connector might depend on the framework we choose - we could start with spark or flink (both support python). I would like to also hear what people prefer here.
@pavolloffay IMO, we should start working on the connector for Spark. Spark has a larger community and is a more widely used tool for data analysis.
I would say we should start with the library/DSL for writing query and analysis, not with Spark integration. A library is useful on its own as there could be many different sources of traces it would work with in Jupyter, like loading from a file or from query-service.
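As a rough illustration of the library-first idea, a source-agnostic loader could look like the sketch below. It assumes the trace JSON has a top-level `data` list whose entries carry a `spans` list with `operationName` and `duration` fields, as in the JSON served by the query-service. The names `load_traces` and `operation_durations` and the sample data are hypothetical and not part of any Jaeger package.

```python
import json
from collections import defaultdict
from typing import Any, Dict, List


def load_traces(raw: str) -> List[Dict[str, Any]]:
    """Parse trace JSON (assumed layout: {"data": [trace, ...]}).

    `raw` can come from a file or from an HTTP response body, so the
    analysis code below does not care where the traces were stored.
    """
    payload = json.loads(raw)
    return payload.get("data", [])


def operation_durations(traces: List[Dict[str, Any]]) -> Dict[str, int]:
    """Sum span durations per operation name across all traces."""
    totals: Dict[str, int] = defaultdict(int)
    for trace in traces:
        for span in trace.get("spans", []):
            totals[span["operationName"]] += span["duration"]
    return dict(totals)


# Hypothetical sample data in the assumed layout.
SAMPLE = """
{"data": [{"traceID": "t1", "spans": [
  {"spanID": "a", "operationName": "HTTP GET", "duration": 120},
  {"spanID": "b", "operationName": "db.query", "duration": 40},
  {"spanID": "c", "operationName": "db.query", "duration": 35}
]}]}
"""

if __name__ == "__main__":
    print(operation_durations(load_traces(SAMPLE)))
```

In a notebook, the same two functions could be fed from `open(path).read()` or from an HTTP client; only the code producing `raw` would change.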
We can start simultaneously with both. Both could be useful for different use-cases.
I have created a separate issue for DSL https://github.com/jaegertracing/jaeger/issues/1811.
Issue for jupyter notebook https://github.com/jaegertracing/jaeger/issues/1813 - cc) @Talina06
I replied in https://github.com/jaegertracing/jaeger/issues/1811#issuecomment-534848073
We don't need a full blown DSL, just a data model of a trace as a graph. Once we have that, people can start writing jupyter scripts.
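To make "a data model of a trace as a graph" concrete, here is a minimal sketch in Python. Spans are nodes and parent-to-child references are edges. The class and field names (`Span`, `TraceGraph`, `parent_id`) are illustrative assumptions, not an existing Jaeger API, and the model assumes each span has at most one parent and that references contain no cycles.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Span:
    span_id: str
    operation: str
    parent_id: Optional[str] = None  # None marks a root span
    children: List["Span"] = field(default_factory=list)


class TraceGraph:
    """A trace as a directed graph: spans are nodes, parent->child are edges."""

    def __init__(self, spans: List[Span]) -> None:
        self.spans: Dict[str, Span] = {s.span_id: s for s in spans}
        self.roots: List[Span] = []
        for span in spans:
            parent = self.spans.get(span.parent_id) if span.parent_id else None
            if parent is None:
                # No parent, or the parent is missing from the trace.
                self.roots.append(span)
            else:
                parent.children.append(span)

    def depth(self) -> int:
        """Number of spans on the longest root-to-leaf path."""
        def walk(span: Span) -> int:
            return 1 + max((walk(c) for c in span.children), default=0)
        return max((walk(r) for r in self.roots), default=0)

    def descendants(self, span_id: str) -> List[str]:
        """IDs of all spans below the given span (order not guaranteed)."""
        stack = list(self.spans[span_id].children)
        found: List[str] = []
        while stack:
            span = stack.pop()
            found.append(span.span_id)
            stack.extend(span.children)
        return found


if __name__ == "__main__":
    trace = TraceGraph([
        Span("a", "HTTP GET /checkout"),
        Span("b", "auth.verify", parent_id="a"),
        Span("c", "db.query", parent_id="b"),
        Span("d", "cache.get", parent_id="a"),
    ])
    print(trace.depth(), sorted(trace.descendants("a")))
```

With a model like this in place, notebook scripts can ask graph questions (depth, fan-out, descendants of a slow span) without a dedicated query language.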
We should move our protos to the IDL repository and allow building them for other languages: https://github.com/jaegertracing/jaeger/issues/1213. That will be required to consume data from Kafka.
It would also help if the compiled model classes were published as artifacts. Right now, anybody who wants to consume data from Kafka has to do a lot of additional work.
I have moved my POC of the trace DSL, built with Gremlin and packaged in a Jupyter notebook, to https://github.com/jaegertracing/jaeger-analytics-java.
I think I can work on a MongoDB backend and apply some AI with scikit-learn or TensorFlow to the collected data.
Summary
At the moment doing ML/AI analysis with Jaeger is hard. There is no direct integration with ML/AI platforms and we do not have much knowledge on what models we could build.
Proposal
Placeholder issue for any discussion related to ML/AI integration with Jaeger. In recent Jaeger bi-weekly meetings we have talked about doing ML/AI on tracing data (also in combination with other telemetry data such as metrics and logs).
For completeness, I will list existing ML/post-processing integrations:
cc) @annanay25