bencherdev / bencher

🐰 Bencher - Continuous Benchmarking
https://bencher.dev

Problems with large numbers of metrics #474

Open joshka opened 1 month ago

joshka commented 1 month ago

https://bencher.dev/perf/ratatui-org

We have one metric that measures a worst-case scenario and is much larger than all the rest, so it dominates the graph:

image

Turning that off gives another group of metrics that are all in the same area:

image

Turning those off gives us a bit more of a reasonable set of metrics:

image

In Criterion, we use groups with IDs that contain the method name and a size parameter, e.g. https://github.com/ratatui-org/ratatui/blob/main/benches/barchart.rs#L11-L27. It would be nice if these groups, methods, and size parameters could be used to select which methods to show.
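To make the request concrete, here is a minimal sketch of that kind of selection, assuming benchmark names follow Criterion's usual `group/function/parameter` shape. The benchmark names and the `select` helper are made up for illustration and are not part of Bencher.

```python
def parse_benchmark_name(name):
    """Split 'group/function/parameter' into parts; missing parts become None."""
    parts = name.split("/", 2)
    parts += [None] * (3 - len(parts))
    group, function, parameter = parts
    return {"group": group, "function": function, "parameter": parameter}


def select(names, **wanted):
    """Keep only the names whose parsed parts match every requested value."""
    selected = []
    for name in names:
        parsed = parse_benchmark_name(name)
        if all(parsed[key] == value for key, value in wanted.items()):
            selected.append(name)
    return selected


# Hypothetical benchmark names, not taken from the real report.
names = [
    "barchart/render/64",
    "barchart/render/256",
    "barchart/render_horizontal/64",
    "paragraph/render/64",
]

small_barcharts = select(names, group="barchart", parameter="64")
```

With the sample names above, `small_barcharts` holds the two `barchart` entries with size `64`, which is the sort of subset I would want to toggle on and off in one click.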

Additionally, when there's a large number of metrics, all with the same branch and machine name, this repeated info is overly verbose. While this looks pretty, a simple vertical list of the metric names with checkboxes would be much more usable (discovering that the title box was clickable was not obvious).

P.S. It would be nice to have a copy image button on the chart, especially if there's some way to include a legend that makes sense. In a previous life, at a very large company, we had similar info displayed underneath the chart as labels. For some inspiration, take a look at how CloudWatch displays this sort of thing (random image found on image search):

cloudwatch

epompeii commented 1 month ago

We have one metric that measures a worst-case scenario and is much larger than all the rest, so it dominates the graph

Yes, this is definitely less than ideal. I'm open to suggestions for better ways to plot a Report. For context this was added to overcome the "cold start problem" when viewing results. I figured it was easier to trim down the results than try to hunt down all the correct dimensions: https://github.com/bencherdev/bencher/issues/133

When this solution was originally created though, the format of the Report was very much in flux. With the Report format now more or less stabilized, I think it is worth reevaluating this design decision. A related step that I think would be of interest here is creating a Report results table view: https://github.com/bencherdev/bencher/issues/421

In Criterion, we use groups with IDs that contain the method name and a size parameter, e.g. https://github.com/ratatui-org/ratatui/blob/main/benches/barchart.rs#L11-L27. It would be nice if these groups, methods, and size parameters could be used to select which methods to show.

Agreed! I think this would be covered by custom tags, once those are added: https://github.com/bencherdev/bencher/issues/240

In the meantime, have you taken a look at Pinned Plots? I need to add better docs around them, but Pinned Plots allow you to save queries and create a dashboard-like interface. You can create a Pinned Plot for each of these parameter permutations to easily keep track of them. They also appear in the Perf Plot under the "Pinned" tab, so you can quickly pull them up.

Additionally, when there's a large number of metrics, all with the same branch and machine name, this repeated info is overly verbose. While this looks pretty, a simple vertical list of the metric names with checkboxes would be much more usable (discovering that the title box was clickable was not obvious).

Very true. When a Report is selected these can very safely be deduplicated. I've created a tracking issue: https://github.com/bencherdev/bencher/issues/475

I also like the table view in the example below as an alternate (less pretty, more useful) view. Another thing that is difficult with the current design is easily seeing all the dimensions that are

P.S. It would be nice to have a copy image button on the chart, especially if there's some way to include a legend that makes sense. In a previous life, at a very large company, we had similar info displayed underneath the chart as labels. For some inspiration, take a look at how CloudWatch displays this sort of thing (random image found on image search):

Have you taken a look at the Perf Plot Image yet? If you create a Perf Plot and then hit the "Share" button, a modal should pop up. This generates an image version of the plot. For example:

Ratatui Example for Ratatui - Bencher

... and I now have a tracking issue for the y-axis running off the side of the image: https://github.com/bencherdev/bencher/issues/476 🙃

Is there some other type of image that you would want though?

joshka commented 1 month ago

Yes, this is definitely less than ideal. I'm open to suggestions for better ways to plot a Report.

In the proposed list view, show the range of values as a column which allows sorting. This way it would be easy to sort the outlier metric to the top and then deselect it from display.
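A rough sketch of what I mean, with made-up timings: compute the low and high value for each metric, then sort by the high end so the outlier lands in the first row. The function name and the data are hypothetical.

```python
def range_table(metrics):
    """Return (name, low, high) rows sorted so the largest values come first."""
    rows = [(name, min(values), max(values)) for name, values in metrics.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Made-up timings in nanoseconds, purely for illustration.
metrics = {
    "barchart/render/64": [410.0, 432.5, 401.2],
    "paragraph/render/64": [1250.0, 1198.4, 1301.9],
    "worst_case/render": [9.8e6, 1.02e7, 9.9e6],
}

table = range_table(metrics)
# Deselect the top row (the outlier) and keep the rest for plotting.
visible = [name for name, _low, _high in table[1:]]
```

Sorting on the high end of the range is one choice. Sorting on the width of the range (high minus low) would work just as well for spotting noisy metrics.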

Is there some other type of image that you would want though?

Mostly a static representation of the image data that is displayed. This should be fixed at a point in time, not parameterized. Think about how you'd communicate in an issue / Slack message / email about a PR that changes performance characteristics: "This PR causes a regression, here's what that looks like as an image", then "This update to the PR fixes the regression now, here's the updated image". You want that image to be created once when you snapshot it. The share link in our bencher page is pretty ugly.

On the "have you tried" questions, I haven't yet. I'm fairly new to looking at the bencher reports / UI. Those features seem to have a discoverability problem though, which might be worth considering. (How should these be presented in the UI in a way that makes them obvious?)

As another aside, you might want to move many of the chart display parameters to POST params instead of query params, and only generate the query parameters when specifically asked to provide a permalink, or generate these in a way that leads to a much shorter URL. I'd also suggest considering whether you can move away from GUID identifiers to shorter synthetic ones that are more contextual to the user, at least at the presentation layer (which includes the URL query string).

epompeii commented 1 month ago

In the proposed list view, show the range of values as a column which allows sorting. This way it would be easy to sort the outlier metric to the top and then deselect it from display.

Great suggestion!

Mostly a static representation of the image data that is displayed. This should be fixed at a point in time, not parameterized. Think about how you'd communicate in an issue / Slack message / email about a PR that changes performance characteristics: "This PR causes a regression, here's what that looks like as an image", then "This update to the PR fixes the regression now, here's the updated image". You want that image to be created once when you snapshot it.

The Perf Plot image is fixed at a point in time, as long as both a start and an end date are set. Is there something else you would like to see here?

It is the Pinned Plots that are parameterized to a sliding window.

The share link in our bencher page is pretty ugly.

Agreed! Current tracking issue: https://github.com/bencherdev/bencher/issues/404 I'm very much open to feedback on that design as well.

On the "have you tried" questions, I haven't yet. I'm fairly new to looking at the bencher reports / UI. Those features seem to have a discoverability problem though, which might be worth considering. (How should these be presented in the UI in a way that makes them obvious?)

I tried to make this front and center by making the Pinned Plots view the default page for a Project and having a "how to" box on that page.

How did you perceive this though? Is there anything confusing or misleading there?

As another aside, you might want to move many of the chart display parameters to POST params instead of query params, and only generate the query parameters when specifically asked to provide a permalink, or generate these in a way that leads to a much shorter URL.

The benefit of query parameters is that they allow for a true URL driven experience. Maybe since I'm the one hacking on things, I find the ability to inspect and URL hack more useful than the average user, but IMO this is a very useful feature.

Additionally, the Bencher Console Perf Plot query params are a superset of the Bencher API Perf query params. This makes it very easy to debug by going from the URL to checking what one gets back from the API endpoint.

With all that said though, I definitely think more sharable links as discussed above are needed.
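As a small illustration of the kind of URL inspection I mean, here is a sketch that uses only the Python standard library. The URL is a shortened version of a Console link for this project, and treating `start_time` and `end_time` as Unix epoch milliseconds is an assumption based on their magnitude. The `inspect_perf_url` helper is hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit

# Shortened example URL; a real one may list dozens of benchmark UUIDs.
url = (
    "https://bencher.dev/perf/ratatui-org"
    "?branches=95ce51f3-9a78-41e8-8700-562f11680798"
    "&benchmarks=adb521a6-df19-4ee9-af93-e783b69a4dc0"
    "%2C7bada371-e16a-475b-9424-af842fd2dd70"
    "&start_time=1720312624000&end_time=1722905503000"
)


def inspect_perf_url(url):
    """Pull the benchmark list and time window out of a Perf Plot URL."""
    query = parse_qs(urlsplit(url).query)  # also decodes %2C into ','

    def to_utc(milliseconds):
        # Assumption: the values are Unix epoch milliseconds.
        return datetime.fromtimestamp(int(milliseconds) / 1000, tz=timezone.utc)

    return {
        "benchmarks": query["benchmarks"][0].split(","),
        "start": to_utc(query["start_time"][0]),
        "end": to_utc(query["end_time"][0]),
    }


info = inspect_perf_url(url)
```

Here `info["benchmarks"]` is a plain list of UUID strings and the two timestamps become timezone-aware datetimes, which makes it easy to compare against whatever the API returns for the same parameters.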

I'd also suggest considering whether you can move away from GUID identifiers to shorter synthetic ones that are more contextual to the user, at least at the presentation layer (which includes the URL query string).

Great suggestion! My current thinking is to add prefixed IDs, similar to Stripe: https://github.com/bencherdev/bencher/issues/318 I'm open to design feedback here as well!
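To give a feel for the idea, here is a minimal sketch of one possible encoding: a type prefix followed by the UUID's 128-bit value in base62. This is only an assumption about how such IDs could look, not the design in the linked issue. The `bench` prefix and both helper names are hypothetical.

```python
import string
import uuid

ALPHABET = string.digits + string.ascii_letters  # 62 characters, no underscore


def to_prefixed_id(prefix, value):
    """Encode a UUID as '<prefix>_<base62>' (at most 22 base62 characters)."""
    number = value.int
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return f"{prefix}_{''.join(reversed(digits)) or '0'}"


def from_prefixed_id(text):
    """Decode '<prefix>_<base62>' back into (prefix, UUID)."""
    prefix, encoded = text.rsplit("_", 1)
    number = 0
    for character in encoded:
        number = number * 62 + ALPHABET.index(character)
    return prefix, uuid.UUID(int=number)


original = uuid.UUID("adb521a6-df19-4ee9-af93-e783b69a4dc0")
short = to_prefixed_id("bench", original)
decoded_prefix, decoded = from_prefixed_id(short)
```

The round trip is lossless, the prefix tells you at a glance what kind of object the ID refers to, and the encoded part is shorter than the 36-character hyphenated UUID.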

joshka commented 1 month ago

This is very much not a hackable URL :)

https://bencher.dev/perf/ratatui-org?key=true&reports_per_page=4&branches_per_page=8&testbeds_per_page=8&benchmarks_per_page=8&plots_per_page=8&reports_page=1&branches_page=1&testbeds_page=1&benchmarks_page=1&plots_page=1&report=4327b7db-e0fc-4b27-a04f-3baaba2c37c3&branches=95ce51f3-9a78-41e8-8700-562f11680798&testbeds=0615b230-cbf8-4ea6-8e2e-616c282b102a&benchmarks=adb521a6-df19-4ee9-af93-e783b69a4dc0%2C7bada371-e16a-475b-9424-af842fd2dd70%2C5695514c-6501-44a4-9a43-9de69078be9c%2C52a7f340-826a-46e7-a56a-ac84a6870405%2C7705aec6-4b65-4e53-855e-5b3fc7217afb%2Ce73ff304-fe7e-48c9-8bdf-b598790a043d%2C1ff230ce-8ddc-4206-bcb0-7a812cf4d658%2C8dab1d35-d913-4685-a8a4-5140e9663644%2C0ee346df-3fa6-4a0b-84d5-9d77f45472ab%2C41e1638b-7dcf-467b-a4df-43ebe5840a0e%2C9bda67ec-605a-440b-8849-10d3ef11a82c%2C01b1e197-ef5a-4ece-a59a-f72bb0b2c4c6%2C8ed00275-11c6-42c5-8e1d-13d56f0ec721%2C57786add-2507-43a8-8346-cf7bd7cca1e8%2Ca6a0bde0-b654-4fa4-a232-5c314ad42507%2C411b66ef-8278-4521-9dd5-8a846deba0d0%2Cee5c2e31-4073-453f-8068-6684658e0bcf%2Cd5bc2324-0b10-442d-963f-fcfee197a6ac%2Cfa81745b-42f8-43a7-ba9b-6db5462bb1ed%2C7f0372d1-40e8-4b90-a16e-5f5e4191587e%2C4f892eb4-b451-4fd7-8bcd-1eec33053e21%2Cc107db1e-e99e-42f6-a883-2058ac908b96%2C226790b4-f38b-44a7-a95c-db8ac8af0252%2Cc359d8d5-4278-41a6-9fd5-3ba374ba8639%2C4e29a902-9174-4f03-a8e6-0b8b9c0f95a1%2Cc4f7d392-7d3d-4af4-9f99-2b63b6354dc1%2C69f12378-9c7a-49ee-a790-78d39ebe83db%2Cb1ed9483-becf-4757-853d-2c2686885791%2C2a59de52-2fca-4bbb-8168-1caadb944665%2C9e1d06fe-0303-474f-8f22-436b250cddee%2Ca210aadc-4c32-4413-96d0-dbf19bb8b8a3%2C0f56e5e2-01d9-4758-9b70-9bc38c054909%2C022356ec-3969-4c88-9085-7e706fe12fc3%2C461a87a3-5a55-406d-b06d-c1c3b0f79e30%2Cb129e05d-be31-4965-93c9-8446be712d7c%2C676715ef-a451-4d61-ab51-0380c3e8704a%2C6aa7edb7-a2cf-4b9b-85cd-e98d8b7120b2%2C2cb37736-377a-4f7d-a648-2da43292f44e%2C95605e28-1089-4628-9adc-f23a1ddc2e0a%2Cdb84bbfa-a7b7-4ca6-88f5-30c8cac75107%2C9ce381da-c6bc-486d-991a-59dffbfd05c5%2Cc0562270-6a11-4138-9ada-fd7ec0333c0d%2C03d1971d-9315-48ec-85bd-252226d512b4%2C7b0c2cff-4e95-4697-9398-21480a870fe9%2C6effea6c-f17e-4fff-87ff-64a825160556%2C6fc9ea50-be34-4138-845f-f0f2d203ad80%2C42f4182f-878b-4f78-82b5-cba33a3b310f%2Cc03b2598-4891-4230-9a6e-dee0e589abb1%2C372f0793-6ae6-48e1-8875-892fc37019e4%2C15c90124-4041-4e8e-afca-eceebfbb9cf5%2C3d92c285-fc54-4cee-8726-486deab4c79d%2C409bfafc-e893-4477-8339-d24d592e1879%2Cc5314041-8b94-4dc5-b7ee-b6c7f2b27a0a%2C7af7a571-6ee2-4926-9fc5-139b0aec2344%2Cd058e408-76aa-4513-a71c-2f4dcfd2a3e8%2Ce7b416a9-d208-4eaa-b4b5-1f686a02d572%2Cc4b886d7-c4ba-4426-b491-396d4e168393%2Cdd6c6634-0c1c-422a-aa1b-e05bc159617c%2C8bea06fc-8e47-402d-9d89-d15624e741f4%2Cc7af21ef-076c-4a35-8a6e-333af9d31267%2C3d2b379b-47aa-4970-801f-441ff42336bd%2Ca9ed3483-2698-4d89-b013-b3bf784b5b08%2Cc180e17f-62c2-45de-9472-9cf8574ccd93&measures=b917dd68-60ef-41c6-8ce9-2164eba4f46b&start_time=1720312624000&end_time=1722905503000&clear=true

I'm looking at the chart as a user that doesn't have permissions on the page yet. Pinned is just a tab at the bottom with no instructions, and there's nothing that aligns with what you're calling "perf plot image". This may be the language that you're using internally, but it's not surfaced on the UI as far as I can see.

image

+1 to the Stripe object IDs.

image

epompeii commented 1 month ago

I'm looking at the chart as a user that doesn't have permissions on the page yet. Pinned is just a tab at the bottom with no instructions, and there's nothing that aligns with what you're calling "perf plot image". This may be the language that you're using internally, but it's not surfaced on the UI as far as I can see.

The Perf Plot Image is shown when you hit "Share". It should look like this example from the README: https://github.com/bencherdev/bencher#share-your-benchmarks

Ratatui - Bencher

epompeii commented 1 week ago

Update: As of the most recent release, the entire Report is no longer plotted all at once. Instead, only the first few (10) permutations are plotted. A table view of the Report is now available in the tab view, where you can click to view the plot for a specific benchmark result. If a specific comparison plot is desired, the Pinned Plots feature should be used. To help encourage this, public plots pages are also a part of this release. Once these plots are "pinned" they will be available here: https://bencher.dev/perf/ratatui-org/plots