Discussion about summary and proposed more wide-sweeping changes.

Mark-Simulacrum commented 8 years ago

I've given this a little more thought, and I think it may be a good idea to change our approach to summarizing/displaying data; below is what I believe we should have.

Summarization is week-based
We diff the first run of each week with the last available run for that week.
- This way, updated data should appear about 6 hours after a change to the benchmarks is introduced.
Add a configuration/log file to rustc-benchmarks, in which any change to the benchmarks that affects compile times is documented.
- Render this file on perf.rlo
- Use a JSON-like syntax in which a log of events can be recorded; each event has a name. For example, when updating a crate, we would add { "event": "update", "from": "crate-name-v0.0.1", "to": "crate-name-v0.0.2" }. The backend would parse this to mean that the v0.0.1 crate is deprecated and shouldn't be shown anywhere we list crates.
- More events could be added in the future.
- Once a crate has been deprecated, files in processed/{deprecated crate id} would no longer be loaded.
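
A minimal sketch of the week-based summarization described in the first item above, assuming each run carries a day number (days since some arbitrary epoch) and a single timing. The `Run` and `WeekSummary` types and the `summarize_by_week` function are hypothetical names for illustration, not part of the existing backend.

```rust
use std::collections::BTreeMap;

/// One benchmark run. `day` counts days since an arbitrary epoch
/// (an assumption made here to avoid pulling in a date library).
#[derive(Debug, Clone, PartialEq)]
pub struct Run {
    pub day: u32,
    pub time_secs: f64,
}

/// Comparison of the first and last available run within one week.
#[derive(Debug, Clone, PartialEq)]
pub struct WeekSummary {
    pub week: u32,
    pub first: f64,
    pub last: f64,
    pub percent_change: f64,
}

/// Groups runs into 7-day buckets and diffs the first run of each
/// bucket against the last run currently available for that bucket.
pub fn summarize_by_week(runs: &[Run]) -> Vec<WeekSummary> {
    // week index -> (earliest run, latest run) seen so far
    let mut weeks: BTreeMap<u32, (&Run, &Run)> = BTreeMap::new();
    for run in runs {
        let entry = weeks.entry(run.day / 7).or_insert((run, run));
        if run.day < entry.0.day {
            entry.0 = run;
        }
        if run.day >= entry.1.day {
            entry.1 = run;
        }
    }
    weeks
        .into_iter()
        .map(|(week, (first, last))| WeekSummary {
            week,
            first: first.time_secs,
            last: last.time_secs,
            // Not guarded against a zero `first` timing; a real
            // implementation would need to decide how to handle that.
            percent_change: (last.time_secs - first.time_secs) / first.time_secs * 100.0,
        })
        .collect()
}
```

Because the "last" run is simply the latest one seen so far, the summary for the current week changes as soon as a new run is recorded, which is what lets a benchmark change show up without waiting for the week to end.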

The above would allow graceful removal of crates. More events could be added, though perhaps it would be easier to have only two events (add and remove) with an additional field "reason" that could then link to the issue and concisely discuss why the change was made.
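
The two-event variant could be modelled roughly as follows. This is only a sketch: the `Event` struct, its `benchmark` and `reason` fields, and `active_benchmarks` are assumed names, and parsing the log file itself is left out.

```rust
use std::collections::BTreeSet;

/// The two event kinds suggested above.
#[derive(Debug, Clone, PartialEq)]
pub enum EventKind {
    Add,
    Remove,
}

/// One entry in the benchmark change log (hypothetical layout).
#[derive(Debug, Clone, PartialEq)]
pub struct Event {
    pub kind: EventKind,
    /// Benchmark identifier, e.g. "crate-name-v0.0.1".
    pub benchmark: String,
    /// Short explanation, ideally linking to the relevant issue.
    pub reason: String,
}

/// Replays the log in order and returns the benchmarks that are
/// still active, i.e. added and not subsequently removed.
pub fn active_benchmarks(log: &[Event]) -> BTreeSet<String> {
    let mut active = BTreeSet::new();
    for event in log {
        match event.kind {
            EventKind::Add => {
                active.insert(event.benchmark.clone());
            }
            EventKind::Remove => {
                active.remove(&event.benchmark);
            }
        }
    }
    active
}
```

Under this model, a crate update is just a "remove" of the old version followed by an "add" of the new one, so no dedicated "update" event is needed.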

Once this system is in place, the server would log any problems during loading (for example, empty files) for crates that are not marked as removed. I also propose that every June we move all data from the previous year to the archive. This would avoid having too little data while keeping the overall data size small and therefore faster to load. June is chosen only because it is the middle of the year; it ultimately doesn't matter when archiving happens, or even whether it's regular.
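
The loading check could look something like the sketch below. It works on an in-memory description of the data files rather than the real directory layout; `DataFile` and `check_files` are hypothetical names, and "empty" is the only problem it detects.

```rust
use std::collections::BTreeSet;

/// Minimal description of one data file (assumed shape).
#[derive(Debug, Clone, PartialEq)]
pub struct DataFile {
    pub benchmark: String,
    pub len_bytes: u64,
}

/// Splits files into those worth loading and a list of warnings.
/// Files for removed benchmarks are skipped silently; empty files
/// for benchmarks that are still active produce a warning.
pub fn check_files<'a>(
    files: &'a [DataFile],
    removed: &BTreeSet<String>,
) -> (Vec<&'a DataFile>, Vec<String>) {
    let mut to_load = Vec::new();
    let mut warnings = Vec::new();
    for file in files {
        if removed.contains(&file.benchmark) {
            // Removed benchmark: not loaded, and not worth reporting.
            continue;
        }
        if file.len_bytes == 0 {
            warnings.push(format!("empty data file for {}", file.benchmark));
        } else {
            to_load.push(file);
        }
    }
    (to_load, warnings)
}
```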

@nrc, @nikomatsakis, @nnethercote: What do you think? If I'm given the go-ahead I can start implementing this.

nikomatsakis commented 8 years ago

Seems reasonable to me.

nrc commented 8 years ago

SGTM too

nnethercote commented 8 years ago

We discussed this on IRC. It sounds good to me, and "add" and "remove" seem sufficient. Any change to a benchmark -- either its contents or the way it is measured -- would be treated as removing the old benchmark and adding a new one. We could add numeric suffixes for small changes. For example, in https://github.com/rust-lang-nursery/rustc-benchmarks/pull/20 I fixed the touch target for rust-encoding-0.3.0, which changed what is being measured. With the new scheme this would merit a "remove"/"add" pair, with the newly added benchmark being called something like rust-encoding-0.3.0-1.
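
One possible way to pick the numeric suffix, as a sketch only: `next_revision_name` is a hypothetical helper, and it assumes the list of names already used (including removed ones) is available.

```rust
/// Returns the first name of the form "{base}-{n}" (n = 1, 2, ...)
/// that is not already present in `existing`.
pub fn next_revision_name(base: &str, existing: &[&str]) -> String {
    let mut revision = 1;
    loop {
        let candidate = format!("{}-{}", base, revision);
        if !existing.contains(&candidate.as_str()) {
            return candidate;
        }
        revision += 1;
    }
}
```

With only rust-encoding-0.3.0 in use, this yields rust-encoding-0.3.0-1; a second measurement change would then yield rust-encoding-0.3.0-2.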

rust-lang / rustc-perf

Discussion about summary and proposed more wide-sweeping changes. #120