catapult-project / catapult

Deprecated Catapult GitHub mirror. Please use the http://crbug.com "Speed>Benchmarks" component for bugs, and https://chromium.googlesource.com/catapult to download and edit the source code.
https://chromium.googlesource.com/catapult
BSD 3-Clause "New" or "Revised" License

Make the raw data on Chromeperf available via BigTable or similar for deeper analysis #3222

Closed natorion closed 7 years ago

natorion commented 7 years ago

Chromeperf has great data stored. Currently, when I want to do some analysis I need to individually open all the necessary benchmarks and bots. Crosschecking needs to be done manually. Simple use cases taken from actual day-to-day work:

1.) How much did we improve memory consumption after a fix landed? This would be good to know for a blog post like https://v8project.blogspot.de/2016/10/fall-cleaning-optimizing-v8-memory.html.

With a simplistic query like the following (names made up, btw)

SELECT min(avg_v8_heap), max(avg_v8_heap) FROM all_data WHERE test_suite="v8" AND benchmark="mobile_browsing" AND sub_test="NY times" AND revision="123456";

I could answer that question easily.

2.) We are currently tweaking the bucketing for the Runtime Callstats. The assumption is that a certain percentage of the work in one bucket should show up in another in an equal (within error) amount.

Example (let's simply take the median over all the bots, for good measure):

SELECT median(IC_Total), median(GC_Total) FROM all_data WHERE test_suite="v8" AND benchmark="top25" AND revision="123456"; -- before change

SELECT median(IC_Total), median(GC_Total) FROM all_data WHERE test_suite="v8" AND benchmark="top25" AND revision="123457"; -- after change

Obviously this would also mean that one could use a myriad of charting/analysis tools to take a deeper, more interactive look.
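For illustration, the same kind of analysis could be done in plain Python once rows are exported in a flat form. The schema below (one dict per bot/run, with `test_suite`, `benchmark`, `revision` and metric columns) is hypothetical, mirroring the made-up column names in the SQL above:

```python
from statistics import median

# Hypothetical flat export of Chromeperf rows; the column names mirror
# the made-up SQL schema above. The numbers are invented.
rows = [
    {"test_suite": "v8", "benchmark": "top25", "revision": 123456,
     "IC_Total": 12.0, "GC_Total": 30.0},
    {"test_suite": "v8", "benchmark": "top25", "revision": 123456,
     "IC_Total": 14.0, "GC_Total": 28.0},
    {"test_suite": "v8", "benchmark": "top25", "revision": 123457,
     "IC_Total": 8.0, "GC_Total": 34.0},
    {"test_suite": "v8", "benchmark": "top25", "revision": 123457,
     "IC_Total": 9.0, "GC_Total": 35.0},
]

def medians(revision):
    """In-memory equivalent of:
    SELECT median(IC_Total), median(GC_Total) FROM all_data
    WHERE test_suite="v8" AND benchmark="top25" AND revision=<revision>;
    """
    matching = [r for r in rows
                if r["test_suite"] == "v8" and r["benchmark"] == "top25"
                and r["revision"] == revision]
    return (median(r["IC_Total"] for r in matching),
            median(r["GC_Total"] for r in matching))

before = medians(123456)  # before the bucketing change
after = medians(123457)   # after the bucketing change
print(before, after)
```

The point is only the shape of the computation; with the data in BigTable/BigQuery, the SQL versions above would do the same thing server-side.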

anniesullie commented 7 years ago

@simonhatch and @eakuefner have been looking into the dashboard backend lately. Some ideas on how this could work:

simonhatch commented 7 years ago

Would kinda like to hear more about use cases.

Expanding on option 3, what if we were to expose an API, and then users could use it to pull some data and feed it into dremel or their fancy internal visualization tools?

natorion commented 7 years ago

Another use case that just came up after talking with our memory sheriffs:

Easily compare memory consumption on different devices and focus on percentiles.
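That percentile-focused comparison could be sketched like this; the device names and memory samples are invented, only the shape of the computation matters:

```python
from statistics import quantiles

# Hypothetical per-device memory samples (MB) from a benchmark run;
# the devices and values are made up for illustration.
samples = {
    "Nexus 5": [210, 215, 220, 260, 300, 305, 310, 400],
    "Nexus 7": [180, 185, 190, 200, 240, 250, 255, 330],
}

for device, values in samples.items():
    # quantiles(..., n=100) yields the 1st..99th percentiles;
    # pick out p50/p90/p99 for a cross-device comparison.
    pct = quantiles(sorted(values), n=100)
    print(device, "p50=%.0f p90=%.0f p99=%.0f" % (pct[49], pct[89], pct[98]))
```

With the raw rows queryable, the same summary could be produced per device and revision instead of eyeballing individual charts.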

anniesullie commented 7 years ago

@martiniss you may want to do something similar for test runtimes once they're in the dashboard, if you have any proposals definitely throw them on here! Dashboard data model is documented here: https://github.com/catapult-project/catapult/blob/master/dashboard/dashboard/models/graph_data.py
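As a rough mental model of that file (a sketch, not the actual ndb entity classes; see graph_data.py for the real definitions), the dashboard stores data points keyed by a slash-separated test path:

```python
from dataclasses import dataclass

# Simplified sketch of the dashboard's data model. The real entities in
# graph_data.py are App Engine ndb models; this dataclass only illustrates
# the test-path/revision/value shape of a data point. The example path and
# value below are invented.
@dataclass
class Row:
    test_path: str   # e.g. "Master/bot/test_suite/metric/page"
    revision: int
    value: float

row = Row("ChromiumPerf/android-nexus5/v8.mobile_browsing/avg_v8_heap/nytimes",
          123456, 41.5)
print(row.test_path.split("/")[0])  # the master portion of the path
```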

anniesullie commented 7 years ago

We settled on providing a JSON API. @natorion can you list the use cases that v8 team needs?

natorion commented 7 years ago

We want to derive the impact of a V8 CL. Regarding performance, we would need the following functionality, in order of importance:

Must-have: Same data as in "group_alert?rev=12345"

Nice-to-have: Bisection progress for a revision: How many bots have already processed this revision? I think there is already an issue asking for the same thing but for showing it on Chromeperf.

Nice-to-have: SQL-queries: Use cases for that can be seen in the initial post. This has nothing to do with our impact analyzer though.

anniesullie commented 7 years ago

Update: We haven't implemented all the API requests made in this bug, but we do have a basic alerting API with authentication: https://github.com/catapult-project/catapult/blob/master/dashboard/dashboard/api/README.md
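The alerting API returns JSON; a minimal client-side sketch might look like the following. Note the payload shape and field names here are assumptions for illustration, not the documented API contract (see the linked README for that):

```python
import json

# Hypothetical JSON payload in the shape an alerts endpoint might return;
# the field names ("anomalies", "median_before_anomaly", etc.) are
# assumptions for illustration, not the documented API contract.
payload = json.loads("""
{
  "anomalies": [
    {"test": "v8/mobile_browsing/avg_v8_heap",
     "median_before_anomaly": 41.5, "median_after_anomaly": 39.2},
    {"test": "v8/top25/IC_Total",
     "median_before_anomaly": 12.1, "median_after_anomaly": 14.8}
  ]
}
""")

def regressions(data):
    """Return the tests whose median got worse (larger) after the anomaly."""
    return [a["test"] for a in data["anomalies"]
            if a["median_after_anomaly"] > a["median_before_anomaly"]]

print(regressions(payload))  # the v8/top25/IC_Total entry only
```

In a real client the payload would come from an authenticated HTTP request to the dashboard rather than a literal string.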

eakuefner commented 7 years ago

We didn't implement BigTable functionality, but we ended up addressing the needed functionality by providing a JSON API as described above.