PrefectHQ / prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
https://prefect.io
Apache License 2.0
15.82k stars 1.55k forks source link

Allow flow.visualise to other file formats besides PDF #2447

Closed chuehsien closed 4 years ago

chuehsien commented 4 years ago

Current behavior

Currently, flow.visualise can be used to see how the graph is constructed (super helpful). I also use it to look at where errors occur. However, there is no option to save in another format other than PDF.

Proposed behavior

To add an argument that allows saving to any formats allowable by graphviz.

Example

flow.visualize(filename="myflow", format="jpeg")

chuehsien commented 4 years ago

Hi all, I'm trying to write tests for the PR, what is the recommended approach to determining the set of allowable formats?

I used graphviz.FORMATS but there are invalid formats inside. The FORMATS in python-graphviz differ from the valid formats in the graphviz 2.38 windows binary, e.g. jp2 is in graphviz.FORMATS but is rejected by the native binary.

Should i store the intersection of the native binary and the python library in a constant somewhere? Doesn't feel right to do so..

joshmeek commented 4 years ago

@chuehsien Hm interesting. Personally any of the "standard" image formats would be enough for me (png, jpg, etc.). When you say you have invalid formats what exactly does that mean? Are there errors thrown when you try to write to certain file formats?

chuehsien commented 4 years ago

graphviz.FORMATS has the following: ['bmp', 'canon', 'cgimage', 'cmap', 'cmapx', 'cmapx_np', 'dot', 'dot_json', 'eps', 'exr', 'fig', 'gd', 'gd2', 'gif', 'gtk', 'gv', 'ico', 'imap', 'imap_np', 'ismap', 'jp2', 'jpe', 'jpeg', 'jpg', 'json', 'json0', 'pct', 'pdf', 'pic', 'pict', 'plain', 'plain-ext', 'png', 'pov', 'ps', 'ps2', 'psd', 'sgi', 'svg', 'svgz', 'tga', 'tif', 'tiff', 'tk', 'vml', 'vmlz', 'vrml', 'wbmp', 'webp', 'x11', 'xdot', 'xdot1.2', 'xdot1.4', 'xdot_json', 'xlib']

Many of the formats cause errors in the underlying graphviz native library: Command '['dot', '-Tcgimage', '-O', 'viz']' returned non-zero exit status 1. [stderr: b'Format: "cgimage" not recognized. Use one of: bmp canon cmap cmapx cmapx_np dot emf emfplus eps fig gd gd2 gif gv imap imap_np ismap jpe jpeg jpg metafile pdf pic plain plain-ext png pov ps ps2 svg svgz tif tiff tk vml vmlz vrml wbmp xdot xdot1.2 xdot1.4\r\n'] Note that the list returned in the exception is shorter than the full list provided in graphviz.FORMATS.

Only the following list runs without any exceptions: ['vml', 'png', 'tk', 'cmapx', 'xdot1.4', 'gv', 'cmapx_np', 'wbmp', 'jpg', 'jpeg', 'jpe', 'ismap', 'canon', 'ps2', 'vmlz', 'xdot1.2', 'gd', 'dot', 'xdot', 'pov', 'tif', 'imap', 'ps', 'eps', 'tiff', 'fig', 'svg', 'gd2', 'bmp', 'pic', 'cmap', 'gif', 'vrml', 'imap_np', 'pdf', 'plain', 'plain-ext', 'svgz']

And the remaining fail (all with the same exception): ['exr', 'dot_json', 'gtk', 'x11', 'psd', 'webp', 'json0', 'cgimage', 'xdot_json', 'ico', 'tga', 'jp2', 'json', 'pct', 'pict', 'xlib', 'sgi']

So where should we define the list of formats that have been tested and working?

joshmeek commented 4 years ago

Hey @chuehsien I'm wondering if we actually need to store this list anywhere outside of in the tests themselves as I don't see this being a responsibility of the .visualize command. Personally I would say a few tests using "standard" image formats would suffice.