kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0
9.49k stars 874 forks

Option to include visualized pipelines in the generated document #56

Closed Minyus closed 4 years ago

Minyus commented 4 years ago

Description

An option to include an image of the visualized pipelines in the Sphinx document generated by the kedro build-docs command.

Context

kedro-viz offers the kedro viz command, which generates interactive pipeline visualizations.

This visualization is very useful for explaining pipelines to stakeholders, and it would be even nicer to automate the manual steps: running the kedro viz command, accessing the URL, taking a screenshot, and pasting it into the document.

![image](https://user-images.githubusercontent.com/33908456/61184262-542f8000-a67e-11e9-872e-3fa40e31b81c.png)

Possible Implementation

Programmatically communicate with the kedro_viz.server.

Possible Alternatives

Use a graph visualization tool such as graphviz.

Pet3ris commented 4 years ago

Hi @Minyus, what sort of document would you like the image to be pasted in? Currently Kedro doesn't really produce any soft deliverables apart from perhaps the automated documentation. But providing the visualization there feels like a special case.

Minyus commented 4 years ago

Hi @Pet3ris, it would be awesome if a visualized pipeline image file could be included in the HTML document automatically generated by the kedro build-docs command.

Alternatively, would it be possible to output the pipeline graph data used by Kedro-Viz so that users can visualize the pipeline using common graph visualization tools such as GraphViz and Gephi?

idanov commented 4 years ago

Hi @Minyus, this sounds like a very interesting feature and I agree that it can be useful to add the pipeline graph to the docs. Since this is more of a request for the kedro-viz plugin than for Kedro itself, maybe the feature request could be posted at https://github.com/quantumblacklabs/kedro-viz instead.

The implementation of this feature may prove to be quite involved, since it will require running the kedro-viz frontend in a headless browser and then taking a screenshot there, which could be used for the docs. Given that this feature is nice to have and not very important for the user's workflow, but quite high on effort, I will put it as low priority.

Pet3ris commented 4 years ago

@Minyus you can actually access the graph representation. Kedro uses a hidden folder called .log or similar when sharing graph outputs with the graphviz, so to answer your second question, that's definitely possible.

Thoughts @idanov?

Minyus commented 4 years ago

@Pet3ris Where can we find the hidden folder? Can we use it without running kedro viz command?

tolomea commented 4 years ago

@Pet3ris that doesn't exist anymore.

@Minyus if you'd like to see how Viz extracts the graph info from Kedro, you can look here: https://github.com/quantumblacklabs/kedro-viz/blob/develop/package/kedro_viz/server.py#L53

Minyus commented 4 years ago

@tolomea Noted. Thank you.

lorenabalan commented 4 years ago

Hi @Minyus, I am going to close this issue as I have moved it under the kedro-viz umbrella - please see https://github.com/quantumblacklabs/kedro-viz/issues/31.