Open Anton-4 opened 5 years ago
@Anton-4 this looks really interesting, I'd like to know more. Feel free to contact me via email
Great, I will get back to you later this week.
@Anton-4 sorry, I was kind of busy these last weeks, I moved and started a new job. I'd still be interested in learning more about your visualization, if you find the time for a chat about it, let me know.
Sorry for the late response, I've been very busy as well. In the coming weeks I will try to set up some code in a repository so you can see how I currently do things for the visualization. Then we can discuss the code and move on from there.
Super cool! Thanks @Anton-4
I have currently uploaded the code to a private repository. Even though I wrote the code, it is legally owned by the company I work at so I just need to ask for permission to open source it. I think it is highly unlikely they will say no. When I get the ok, I will give you access as well.
I got permission to open source the code. I am still working on a nice example graph that is not based on code we used for a customer. I expect to be done with this some time next week.
Amazing! Thanks a lot @Anton-4
I am nearly done with the example. I could have easily created a mockup, but I think a real and fully interactive example where the node outputs can be examined truly shows the benefits of the visualization.
Hi @Anton-4 , this looks quite amazing, was wondering how we could potentially move forward with this cause I'm planning to spend some time on flowpipe over the next few weeks. No rush though, just wanted to get the conversation started again
Hi Paul, I wanted to add an actual Gaussian Process model to the graph, then I can upload all the visualization code. I wanted to use a GP because the graph is more complex in that case whereas a Random Forest for example would just need a waterfall pipeline, not a real graph. I'll try to see if I can find some time this month to finish this up.
Sounds amazing, thanks @Anton-4 !!
Hi @Anton-4, can you explain how you developed the graphs shown in your previous comments? I would really like to make such a graph for my code.
@Anton-4, any update on a demo? We currently have such enormous graphs that they don't fit into a console window. I'm eager to create more readable graphs like the ones you showed.
@Onandon11 today I sent an email to my previous employer asking their permission. As soon as I get the ok I will get to work on making the code available.
I got the green light to open source the code. It does have to be a repository under the ml2grow username with the Apache 2.0 license.
I have finished the demo and will look into deploying it. I will also make a proper readme and then I am done I think. I will post a link here to the repo once it is up.
Amazing! Thanks @Anton-4
I am looking forward to it, too. This will make communicating what our processing flow looks like much easier -- no more fiddling the graph together with inkscape :+1:
I've asked my former boss for a review, afterwards the code will be available :) . I have not deployed the demo yet, I would like to use github pages but for that I will need to make a static version of the demo which will require some extra work.
The code is available here.
Graphs are imported using json, the current format is different from what is produced with the to_json defined in flowpipe. I created this format to give me more freedom for visualization.
I would like to change the visualization graph format to something more commonly used, I really like JGF as a first step. It is also very similar to the current format.
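For readers unfamiliar with JGF (JSON Graph Format), here is a minimal sketch of what a visualization-only export could look like. It loosely follows the JGF v2 layout as I understand it (a top-level "graph" object with "nodes" keyed by id and a list of "edges"). The function name `to_jgf` and the plain lists it takes are hypothetical; this is neither flowpipe's API nor the format used in the repository above.

```python
import json


def to_jgf(nodes, edges, label="example"):
    """Build a JGF-style dict from node names and (source, target) pairs.

    Hypothetical helper for illustration only; it does not read a
    flowpipe Graph object.
    """
    return {
        "graph": {
            "directed": True,
            "label": label,
            # Nodes are keyed by their id; metadata could hold display hints
            "nodes": {name: {"label": name} for name in nodes},
            "edges": [{"source": s, "target": t} for s, t in edges],
        }
    }


# Made-up three-node pipeline
jgf = to_jgf(
    ["load", "train", "report"],
    [("load", "train"), ("train", "report")],
)
print(json.dumps(jgf, indent=2))
```

Since the result is a plain dict, any per-node display data would fit under each node entry without affecting how the graph is stored or restored.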
GraphML is maybe the most popular. With #32 we set out to serialize to and from GraphML. Maybe we should only add a simplified export to GraphML and no import? Full serialization to GraphML does not seem that wise (also discussed in #80) and would also be a lot of work if we truly want to comply with the standard.
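To make "simplified export, no import" concrete, here is a rough sketch using only the standard library. It writes just nodes and directed edges and ignores GraphML features such as keys, data attributes, ports and nested graphs. The function name `to_graphml` and its list-based inputs are assumptions for illustration, not part of flowpipe.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def to_graphml(nodes, edges):
    """Return a minimal GraphML string for node names and (source, target) pairs."""
    # Emit the GraphML namespace as the default namespace (no prefix)
    ET.register_namespace("", GRAPHML_NS)
    root = ET.Element("{%s}graphml" % GRAPHML_NS)
    graph = ET.SubElement(
        root, "{%s}graph" % GRAPHML_NS, id="G", edgedefault="directed"
    )
    for name in nodes:
        ET.SubElement(graph, "{%s}node" % GRAPHML_NS, id=name)
    for index, (source, target) in enumerate(edges):
        ET.SubElement(
            graph,
            "{%s}edge" % GRAPHML_NS,
            id="e%d" % index,
            source=source,
            target=target,
        )
    return ET.tostring(root, encoding="unicode")


# Made-up two-node graph
xml_text = to_graphml(["load", "train"], [("load", "train")])
print(xml_text)
```

An export-only function like this never has to parse arbitrary GraphML, which is where most of the effort for standard compliance would go.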
Let me know what you think.
@Anton-4 I agree that it would make sense to distinguish formats for storing and restoring the graph, such as the pickle and json formats we have so far, and formats for visualization. They are very different tasks, and finding or constructing a data structure that allows for both uses will be harder than to find a good format for each task in isolation.
Is GraphML still used? The official website lists 2007 as news.
@Anton-4 , I gave the visualization a try yesterday, very nice work, this is indeed quite helpful! As soon as time allows I am going to delve into it deeper. Thanks again for making it public and putting in the effort.
graphml has the advantage that it can already be read by existing graph visualizers (yEd for example). Also @Anton-4 already put in effort which we can build upon.
That being said I think we should implement some form of visualization interface that allows users to develop any kind of visualization, like @Anton-4's solution.
Something along the lines of this (naive) demo code:
flowpipe/visualizer.py:
class IVisualizer():
    """Interface that every visualizer implements."""

    def visualize(self, graph):
        """Do 'something' to instantly visualize the given graph."""
        raise NotImplementedError

    def export_for_visualizer(self, graph, path):
        """Export the graph for an external visualizer."""
        raise NotImplementedError

# Registered visualizer instances, keyed by class name
VISUALIZERS = {}

def register_visualizer(visualizer):
    VISUALIZERS[visualizer.__class__.__name__] = visualizer

class AsciiVisualizer(IVisualizer):
    def visualize(self, graph):
        print(graph)

register_visualizer(AsciiVisualizer())
graphml_visualizer.py:
import subprocess

from flowpipe.visualizer import IVisualizer, register_visualizer

class GraphmlVisualizer(IVisualizer):
    def visualize(self, graph):
        tmp = "/tmp/flowpipe.graphml"
        self.export_for_visualizer(graph, tmp)
        # Pass an argument list so the command is not treated as one program name
        subprocess.call(["an_existing-graph-application", tmp])

    def export_for_visualizer(self, graph, path):
        """Convert graph to graphml file."""

register_visualizer(GraphmlVisualizer())
This could then be used like this:
g = MyGraph()
# Just prints the graph
VISUALIZERS["AsciiVisualizer"].visualize(g)
# Saves the graph for further debugging
VISUALIZERS["GraphmlVisualizer"].export_for_visualizer(g, "/desktop/my-debug-file.graphml")
We could implement some common visualizers directly in flowpipe while referencing other available visualizers in the readme/docs.
What do you think?
TBH, I would prefer a simple and clean export function to commonly used formats. Adding support for registering and calling visualizers seems a little much - I expect at most one visualizer actually being used per application.
I agree with @neuneck here. I am totally up for doing the implementation.
Ok, great, I agree, just having the export functionality is enough, after all, we want to keep flowpipe focussed and clean, thanks for your opinions! @Anton-4 , thanks for offering your help, please go ahead with the implementation.
So just to get everyone on the same page, we're going to continue on this ticket on graphml serialization, right?: https://github.com/PaulSchweizer/flowpipe/issues/32 And we ignore my proposal on the visualizer interface.
We @ml2grow have created a graph visualization and inspection webapp for flowpipe graphs: Node names were hidden for legal reasons.
Node outputs can be inspected:
Some useful info is shown in the edge label for every output. Edge labels can be clicked to view the computed output. Subgraphs are labeled. Graph structure is saved in json format, outputs are stored as markdown files for easy viewing and rendering in the browser. All this personally helped me a lot when debugging.
Let me know if you ever want to discuss integrating this visualization into flowpipe itself.