gunrock / io

Input (scripts, etc.) and output (scripts, performance results, etc.) for Gunrock and other graph engines
10 stars 6 forks source link

Generating vega-spec bar graphs #1

Closed ffarhour closed 8 years ago

ffarhour commented 8 years ago

This is my work so far (mostly done in the past week). @jowens @yzhwang @huanzhang12 I wanted to get some feedback on the quality of my code, and any suggestions on how to improve.

The main file to look at is json2vega.py which takes in json outputs in a specific folder and spits out a graph based on your preferences.

Right now I haven't finished the commandline arguments, but have hardcoded a test in the main function.

Summary of what the code does and how it is structured:

Thanks in advance for your feedback!

jowens commented 8 years ago

You do well to write documentation for what you've implemented. You would do even better if you marked up that documentation in the canonical documentation format. :) docstrings and pydoc are what you want to look at. Just use Google's style guide:

https://google.github.io/styleguide/pyguide.html#Comments

jowens commented 8 years ago

I think I'd like to see three things before I can fully evaluate this:

Everything I see looks good, but I will actually be able to evaluate it if I look at how I as a programmer will call it to make a graph, and that's not clear in this checkin.

ffarhour commented 8 years ago

Cool, thanks :) I will get on those right away

ffarhour commented 8 years ago

@jowens I've completed most of the things you asked:

I have not yet had the time to finish the complete implementation of all plot types, but once I do I will create a standalone script that builds all the graphs your code made.

jowens commented 8 years ago

Spaces. Only spaces. No tabs. Never tabs. http://legacy.python.org/dev/peps/pep-0008/#tabs-or-spaces

ffarhour commented 8 years ago

oh, haha! had no idea. All fixed now

jowens commented 8 years ago

OK. Tried this this morning. FWIW, I had to set an environment variable to run test_json2vega.py so that it could find vega-lite: NODE_PATH="/Users/jowens/Documents/working/vega-lite" python test_json2vega.py. (I'm just writing it here for posterity.)

When I run this, I don't know what it does. It should probably say what it does, e.g. Creating output/_g_BFS_0.json. Instead it prints out a bunch of stuff that is basically comments. Don't print comments. Don't print any of that unless you run it with -h.

OK, so then I have a .json file. How do I view it? Might be useful to also write a .html file along with it, or at least make that a command-line option. (But I pasted it into the vega editor so I could see it.)

The graph looks nice. However, weirdly, it has a lot of duplicate entries in the vega source. That's probably not good. Perhaps have the script take the latest-date value, rather than all values?

Notes on json2vega.py:

This is great overall. I hope these comments are helpful. I'm gonna merge it, and you can either keep filing merge requests (and I will review them) or just push directly into the repo, your call. @yzhwang @sgpyc I'd like your eyes on this too, please.

ffarhour commented 8 years ago

@jowens thanks for the detailed comments and suggestions. I will spend the weekend working on these.

yzhwang commented 8 years ago

Sorry for the delay @ffarhour , I will try this today and provide my comments.

yzhwang commented 8 years ago

@ffarhour I don't have anything to add to John's comments. I think the most important thing is the flexibility to use latest-date value, or specify values to use. Also, I like John's heatmap for DOBFS a lot, does the current version support that too? (If not it's OK, not a huge priority)

BTW, when I run test_json2vega.py on luigi, I got an error: Traceback (most recent call last): File "test_json2vega.py", line 85, in main(sys.argv) File "test_json2vega.py", line 80, in main bar = bar1.parse_jsons() File "/home/yzhwang/Projects/io/scripts/json2vega.py", line 159, in parse_jsons bar["data"] = {"values":df_restricted.to_dict(orient='records')} TypeError: to_dict() got an unexpected keyword argument 'orient'

Is this because I missed some dependencies? It would also be nice to specify all the dependencies it needs.

Anyway, thank you for doing this @ffarhour . Good job! Looking forward to seeing the final version of this and trying it out on some of our performance characterization works!

jowens commented 8 years ago

@yzhwang check your python-pandas version for the to_dict call. I've got 0.18 on my machine, which appears to work.

yzhwang commented 8 years ago

Thanks @jowens ! After I upgraded pandas on luigi, it could run. But it took me longer than I expect to finally get the result since I had to go through a lot of efforts to install node.js, d3 and vega-lite. @ffarhour I think you should imply these as dependencies for running gunrock.io. Also, as @jowens said, the output is a json file, which as a user who never tried vega-lite before as me, I don't know how to visualize it. So probably make the output as a vector graph in pdf or svg format?

jowens commented 8 years ago

Agreed on listing the dependencies.

I think the best thing the script should do to visualize the output is also to output an html file that wraps the JSON file (as my old script did). @yzhwang in the meantime you can paste the JSON into the vega-lite editor (that's what I did).

ffarhour commented 8 years ago

Hi all. Sorry for the late reply, I think I'm coming down with a cold. Anyways, thanks for the comments. I will work on these and report back soon. @yzhwang at the moment the code doesn't support the heatmap, but I'll integrate it once I fix everything else. @jowens @yzhwang what's the best way to list dependencies? I know that I can use setuptools' "install_requires", but I'm not sure if that will be helpful for non-python packages such as nodejs. Also, do you think it would be worth having a function to automatically generate the png files using vega-lite?

jowens commented 8 years ago

Just put the list of dependences in the readme (and maybe in the usage() call in the script).

I think generating a PDF is much more useful than a PNG, so I would be in favor of PDF output.

ffarhour commented 8 years ago

Thanks :) Of course, PDF would be much better, will look into that.

ffarhour commented 8 years ago

I've fixed most of the things you asked. I am planning to focus on adding more plot types next. Changes made:

Once I have more implemented more plot types I will submit another pull request

jowens commented 8 years ago

Great! Looking forward to reading it. These are all good improvements.