Closed ffarhour closed 8 years ago
You do well to write documentation for what you've implemented. You would do even better if you marked up that documentation in the canonical documentation format. :) docstrings and pydoc are what you want to look at. Just use Google's style guide:
I think I'd like to see three things before I can fully evaluate this:
readme.md
that says how to use this scriptjson2vega.py
that produces a graph. readme.md
presumably would reference this script as its example. json2vega.py
. Everything I see looks good, but I will actually be able to evaluate it if I look at how I as a programmer will call it to make a graph, and that's not clear in this checkin.
Cool, thanks :) I will get on those right away
@jowens I've completed most of the things you asked:
I have not yet had the time to finish the complete implementation of all plot types, but once I do I will create a standalone script that builds all the graphs your code made.
Spaces. Only spaces. No tabs. Never tabs. http://legacy.python.org/dev/peps/pep-0008/#tabs-or-spaces
oh, haha! had no idea. All fixed now
OK. Tried this this morning. FWIW, I had to set an environment variable to run test_json2vega.py
so that it could find vega-lite: NODE_PATH="/Users/jowens/Documents/working/vega-lite" python test_json2vega.py
. (I'm just writing it here for posterity.)
When I run this, I don't know what it does. It should probably say what it does, e.g. Creating output/_g_BFS_0.json
. Instead it prints out a bunch of stuff that is basically comments. Don't print comments. Don't print any of that unless you run it with -h
.
OK, so then I have a .json file. How do I view it? Might be useful to also write a .html file along with it, or at least make that a command-line option. (But I pasted it into the vega editor so I could see it.)
The graph looks nice. However, weirdly, it has a lot of duplicate entries in the vega source. That's probably not good. Perhaps have the script take the latest-date value, rather than all values?
Notes on json2vega.py:
# Create required arguments ...
.# Create required arguments ...
-- comment liberally. For this, as an example, make it clear what's going on.bar1
and a bar
variable separately? Will the reader always call read_json()
then parse_jsons()
then write_json()
in succession exactly in that way? If so, maybe wrap that all into a function. What do you think?VegaGraphBar
), python lets you use named arguments. It is not clear when I look at the VegaGraphBar
call what those arguments mean. But when you say VegaGraphBar(output_path=args.o, input_path=args.i, config_path="config_files/", ...)
, it's a lot clearer. config_path
be config_dir
and have no slash? config_dir='config_files'
?x
and y
. The other is to set the actual text that is shown on the axis label (e.g., x is 'Dataset', y is 'MTEPS'). By default the second should be set to the first, but it needs to be possible to set them separately. (This is particularly important when the desired axis label has spaces in the name, since that's not going to be in the field names in the JSON.)This is great overall. I hope these comments are helpful. I'm gonna merge it, and you can either keep filing merge requests (and I will review them) or just push directly into the repo, your call. @yzhwang @sgpyc I'd like your eyes on this too, please.
@jowens thanks for the detailed comments and suggestions. I will spend the weekend working on these.
Sorry for the delay @ffarhour , I will try this today and provide my comments.
@ffarhour I don't have anything to add to John's comments. I think the most important thing is the flexibility to use latest-date value, or specify values to use. Also, I like John's heatmap for DOBFS a lot, does the current version support that too? (If not it's OK, not a huge priority)
BTW, when I run test_json2vega.py on luigi, I got an error:
Traceback (most recent call last):
File "test_json2vega.py", line 85, in
Is this because I missed some dependencies? It would also be nice to specify all the dependencies it needs.
Anyway, thank you for doing this @ffarhour . Good job! Looking forward to seeing the final version of this and trying it out on some of our performance characterization works!
@yzhwang check your python-pandas version for the to_dict call. I've got 0.18 on my machine, which appears to work.
Thanks @jowens ! After I upgraded pandas on luigi, it could run. But it took me longer than I expect to finally get the result since I had to go through a lot of efforts to install node.js, d3 and vega-lite. @ffarhour I think you should imply these as dependencies for running gunrock.io. Also, as @jowens said, the output is a json file, which as a user who never tried vega-lite before as me, I don't know how to visualize it. So probably make the output as a vector graph in pdf or svg format?
Agreed on listing the dependencies.
I think the best thing the script should do to visualize the output is also to output an html file that wraps the JSON file (as my old script did). @yzhwang in the meantime you can paste the JSON into the vega-lite editor (that's what I did).
Hi all. Sorry for the late reply, I think I'm coming down with a cold. Anyways, thanks for the comments. I will work on these and report back soon. @yzhwang at the moment the code doesn't support the heatmap, but I'll integrate it once I fix everything else. @jowens @yzhwang what's the best way to list dependencies? I know that I can use setuptools' "install_requires", but I'm not sure if that will be helpful for non-python packages such as nodejs. Also, do you think it would be worth having a function to automatically generate the png files using vega-lite?
Just put the list of dependences in the readme (and maybe in the usage() call in the script).
I think generating a PDF is much more useful than a PNG, so I would be in favor of PDF output.
Thanks :) Of course, PDF would be much better, will look into that.
I've fixed most of the things you asked. I am planning to focus on adding more plot types next. Changes made:
run
function implemented to do the read, parse, write automatically, without the need for user interferenceOnce I have more implemented more plot types I will submit another pull request
Great! Looking forward to reading it. These are all good improvements.
This is my work so far (mostly done in the past week). @jowens @yzhwang @huanzhang12 I wanted to get some feedback on the quality of my code, and any suggestions on how to improve.
The main file to look at is json2vega.py which takes in json outputs in a specific folder and spits out a graph based on your preferences.
Right now I haven't finished the commandline arguments, but have hardcoded a test in the main function.
Summary of what the code does and how it is structured:
Thanks in advance for your feedback!