Open HolzmanoLagrene opened 1 year ago
Looks like an interesting script! Just in case you hadn't seen it, we have something similar in https://github.com/google/turbinia/blob/master/tools/turbinia_job_graph.py
Is this something that would be helpful to be in the API server, or is having the job graph script enough, or are there things we could add to that rather than adding it into the API server?
Another somewhat related feature request is to get this same graph for a given request after it has completed, which would require tracking the same flow so that it is easier to understand how things were processed.
Regarding getting the task config for each task given the evidence type: I think the part that is missing in order to do that is the Job -> Task mapping, which is currently done in each Job's `create_tasks` method, so there is no static mapping for the Task types. We could potentially add another attribute similar to `evidence_input` and `evidence_output` though, and potentially even refactor out most of the `create_tasks` methods altogether. That being said, it should be easy to enumerate all tasks and their task config variables if that would be useful.
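To make the idea of a static Job -> Task mapping concrete, here is a minimal sketch. It does not use the real Turbinia classes: `JobSpec`, `task_classes`, `TASK_CONFIG`, the stand-in task classes and their config keys are all hypothetical names chosen for illustration. Only `evidence_input` and `evidence_output` are attribute names taken from the discussion above.

```python
from dataclasses import dataclass, field


class PlasoTask:
    """Stand-in task; the config keys are made up for illustration."""
    TASK_CONFIG = {"status_view": "none", "hashers": "all"}


class GrepTask:
    """Stand-in task; the config keys are made up for illustration."""
    TASK_CONFIG = {"filter_patterns": []}


@dataclass
class JobSpec:
    """Hypothetical job description with a static task mapping."""
    name: str
    evidence_input: list
    evidence_output: list
    # Assumed new attribute: the task classes this job would create.
    task_classes: list = field(default_factory=list)


# Illustrative definitions only, not the actual Turbinia job definitions.
JOBS = [
    JobSpec("PlasoJob", ["RawDisk"], ["PlasoFile"], [PlasoTask]),
    JobSpec("GrepJob", ["PlasoFile"], ["FilteredTextFile"], [GrepTask]),
]


def task_configs(jobs):
    """Map each job name to {task class name: copy of its default config}."""
    return {
        job.name: {t.__name__: dict(t.TASK_CONFIG) for t in job.task_classes}
        for job in jobs
    }
```

With a mapping like this, the task config for every job can be enumerated without instantiating any job or calling `create_tasks`.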
Yes, the possibility to get a graph of the Jobs that are going to be run would indeed be very interesting to me. Maybe it'll help if I describe the intended use case:
My idea is to get the `Jobs` that could possibly be run based on the `Evidence`-Type. As `Jobs` trigger other `Jobs` based on their `Output`-Types, it is not always clear from the beginning what can be done in the first place. E.g. if I want to run a `Grep`-Job, this can only be done if a `PlasoFile`-Output is generated. This type, however, is only created if I run the `Plaso`-Job in the first place... If I know beforehand which Jobs will be run, I can provide them with the appropriate parameters to do what I want.
To do this, getting a graph that shows me the dependencies between Evidence-Types, Jobs and Tasks is the first step. The second step would be to get the parameters for each Task.
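As a rough sketch of that first step, the following walks from a starting evidence type through every job whose input matches, then follows each job's output types in turn (a breadth-first search). The `JOB_IO` table is an illustrative assumption, not the real Turbinia job definitions, and `reachable_jobs` is a hypothetical helper name.

```python
from collections import deque

# Illustrative only: job name -> (input evidence types, output evidence types).
JOB_IO = {
    "PlasoJob": ({"RawDisk"}, {"PlasoFile"}),
    "GrepJob": ({"PlasoFile"}, {"FilteredTextFile"}),
    "StringsJob": ({"RawDisk"}, {"TextFile"}),
}


def reachable_jobs(start_type, job_io):
    """Return (evidence type, job) edges reachable from start_type."""
    seen_types = {start_type}
    queue = deque([start_type])
    edges = []
    while queue:
        ev_type = queue.popleft()
        # Sort for a deterministic edge order.
        for job in sorted(job_io):
            inputs, outputs = job_io[job]
            if ev_type not in inputs:
                continue
            edges.append((ev_type, job))
            # Each new output type may trigger further jobs.
            for out in sorted(outputs):
                if out not in seen_types:
                    seen_types.add(out)
                    queue.append(out)
    return edges


if __name__ == "__main__":
    for ev_type, job in reachable_jobs("RawDisk", JOB_IO):
        print(f"{ev_type} -> {job}")
```

In this toy table, starting from `RawDisk` reaches `GrepJob` only through the `PlasoFile` produced by `PlasoJob`, which is the kind of dependency described above.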
So in a nutshell I would love to have accessible through the API:
How should we proceed regarding both ideas?
Would a static but regenerable representation of this data be OK instead of putting this into the API server? I.e., if we were to update https://github.com/google/turbinia/blob/master/tools/turbinia_job_graph.py to include .json output and the task configs, would that be good enough to meet your needs for this?
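For illustration, a JSON export along these lines might look like the sketch below. This is not the output format of `turbinia_job_graph.py`: the `JOBS` table, the `task_config` key, the edge layout and the `export_job_graph` helper are all assumptions made up for this example.

```python
import json

# Illustrative only: not the real Turbinia job definitions or task configs.
JOBS = {
    "PlasoJob": {
        "evidence_input": ["RawDisk"],
        "evidence_output": ["PlasoFile"],
        "task_config": {"PlasoTask": {"status_view": "none"}},
    },
    "GrepJob": {
        "evidence_input": ["PlasoFile"],
        "evidence_output": ["FilteredTextFile"],
        "task_config": {"GrepTask": {"filter_patterns": []}},
    },
}


def export_job_graph(jobs):
    """Serialize jobs plus evidence/job edges as a JSON string."""
    edges = []
    for name in sorted(jobs):
        spec = jobs[name]
        # Evidence types feeding into the job.
        for ev_type in spec["evidence_input"]:
            edges.append({"from": ev_type, "to": name})
        # Evidence types the job produces.
        for ev_type in spec["evidence_output"]:
            edges.append({"from": name, "to": ev_type})
    return json.dumps({"jobs": jobs, "edges": edges}, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(export_job_graph(JOBS))
```

A file like this could be regenerated whenever the job definitions change and consumed by a client without any API server support.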
It would be more than I was hoping for 🎉☺️
What is the status of this? Has anyone had time to look into it yet?
@HolzmanoLagrene Sorry, I haven't gotten a chance to do that yet, but I'll try to carve out some time sometime soon.
I have two feature requests for the API:
Allow fetching a graph of how the Jobs and Evidence types are connected.
At the moment I have no clue what kind of Jobs will actually get started if I process a piece of evidence with a set of Jobs. It would be really cool to be able to know beforehand which Jobs will get triggered based on the Evidence-Type and the initial Jobs selected.
For now I do this with some sort of twisted reflection and create a graph to get an impression of what is going to happen:
A visual representation looks something like this:
Allow fetching the Task Config for each Task.
Since it is already possible to fetch the required and optional parameters for each evidence type, it would be amazing to be able to fetch the possible task parameters for each Task as well.