microsoft / ADBench

Benchmarking various AD tools.
MIT License
107 stars 39 forks source link

Collecting plot data [part of #180] #183

Closed mikhailnikolaev closed 4 years ago

mikhailnikolaev commented 4 years ago

Plot script now collects plot data and then uses it for creating graphs. Also, it saves plot data to the JSON file.

This PR contains a part of changes from the closed PR #180. It is the second part of the list, described here https://github.com/awf/ADBench/pull/180#issuecomment-569066919

mikhailnikolaev commented 4 years ago

This looks great, but is there any way that the same data structure can be used for both serialisation and generating the graph? Is it possible to use a singled named tuple and write that out as JSON?

Will it be more convenient if the plot_data,json file has the following form:

[
    {
        "figure_info": {
            "build": build,
            "objective": objective,
            "function_type": function_type,
            "test_size": test_size
        },
        "values": [ ... ]
    }
]

Thus the script can use figure_info subdictionary instead of named tuple

tomjaguarpaw commented 4 years ago

Yes, I think that would clear up the biggest differences. It would be good if we could avoid preprocessing sorted_vals_by_tool too.

mikhailnikolaev commented 4 years ago

Yes, I think that would clear up the biggest differences. It would be good if we could avoid preprocessing sorted_vals_by_tool too.

Actually, I have the idea to use only items of the plot_data array everywhere in the script. I mean, we can even avoid such a term as sorted_vals_by_tool and use only dictionaries of the form

{ "figure_info": { ... }, "values": { ... } }

where "values" is a kind of sorted_vals_by_tools and 'figure_info" is figure_info. Thus, we can avoid any additional serialization actions except json.dump. But I don't know is it a good idea or there is a more pythonic way. What do you think?

tomjaguarpaw commented 4 years ago

Ideally the form that we store as JSON would be as close as possible to the form that is used by generate_graph. It seems like it would be nice not to unwrap and rewrap sorted_vals_by_tool. What do you think?

tomjaguarpaw commented 4 years ago

OK, I think your idea sounds good. Let's give it a go!

mikhailnikolaev commented 4 years ago

Ideally the form that we store as JSON would be as close as possible to the form that is used by generate_graph. It seems like it would be nice not to unwrap and rewrap sorted_vals_by_tool. What do you think?

This will be one of the positive corollaries of using dictionary everywhere instead of sorted_vals_by_tools

mikhailnikolaev commented 4 years ago

OK, I think your idea sounds good. Let's give it a go!

Should I do another PR, or I can add commits into this one?

tomjaguarpaw commented 4 years ago

Should I do another PR, or I can add commits into this one?

I think best to add commits into this one and see what it looks like. Then we can decide what to do.

tomjaguarpaw commented 4 years ago

I think best to add commits into this one and see what it looks like. Then we can decide what to do.

In fact, you can do whichever you prefer.

tomjaguarpaw commented 4 years ago

This is improving a lot.