tammoippen / plotille

Plot in the terminal using braille dots.
MIT License
398 stars 17 forks source link

Add support for drawing histogram with pre-aggregated (buckets + counts) data #51

Closed Kami closed 1 year ago

Kami commented 1 year ago

Background, Context

A lot of Python libraries such as OpenTelemetry Metrics API (https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md) support histograms.

The problem is that those libraries expose pre-aggregated data (counts for each bucket) and not raw data. This is done intentionally since storing raw data could be very expensive in terms of memory usage in case there are a lot of samples.

Here is an example of data available by OpenTelemetry metrics Histogram view:

 {
  'attributes': {},
   'bucket_counts': [1945, 0, 0, 0, 0, 0, 10555, 798, 0, 28351, 0],
   'count': 41649,
   'explicit_bounds': [10, 50, 100, 200, 300, 500, 800, 1000, 2000, 10000],
 }

As you can see, we get we get counts for each bucket. Buckets go from (n[i], n[i+1]) whereas first item is -infinity and the last item is +infinity. In this example, buckets would look something like this: [(-inf, 10), (11, 20), (21, 50), ..., (10000, +inf)].

Proposed Change / Implementation

In this change I added a new function for graphing data which is already split into counts and bins.

This pull request is not final (I just copied the existing code + made small change to show how it should work), I just wanted to get the discussion going (with some code and examples) so we can agree on the API which should be exposed.

With the sample data above, this new function and bins which look like this:

counts = bucket_counts
bins = [float("+inf")]
bins.extend(explicit_bounds)
bins.append(float("+inf"))

We get the following histogram:

        bucket       | ________________________________________________________________________________ Total Counts
[inf     ,       10) | ⣿⣿⣿⣿⣿⣿⣿⣿⣿⠆⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ 3269
[10      ,       50) | ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ 0
[50      ,      100) | ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ 0
[100     ,      200) | ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ 0
[200     ,      300) | ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ 0
[300     ,      500) | ⣿⣿⣿⣿⣿⠆⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ 1890
[500     ,      800) | ⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ 9706
[800     ,     1000) | ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ 0
[1000    ,     2000) | ⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⡗⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ 8709
[2000    ,    10000) | ⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⠀ 28213
[10000   ,      inf) | ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ 0
‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾

Feedback is welcome, especially on naming and things like that.

TODO

Kami commented 1 year ago

@tammoippen Thanks for the review.

Do you think it would also make sense to implement the same changes for histogram() function?

I quickly looked at the implementation and that would require more re-factoring to get it to work (Figure.histogram(), Histogram class, etc.).

tammoippen commented 1 year ago

@tammoippen Thanks for the review.

Do you think it would also make sense to implement the same changes for histogram() function?

I quickly looked at the implementation and that would require more re-factoring to get it to work (Figure.histogram(), Histogram class, etc.).

I think for this PR the change to hist is enough. Afterwards, we can see, if we introduce aggregated variants to Figure.

Kami commented 1 year ago

I believe I've addressed all the feedback:

Please let me know if I missing something.

codecov[bot] commented 1 year ago

Codecov Report

Base: 97.10% // Head: 97.11% // Increases project coverage by +0.01% :tada:

Coverage data is based on head (b506b51) compared to base (589fe49). Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #51 +/- ## ========================================== + Coverage 97.10% 97.11% +0.01% ========================================== Files 12 12 Lines 1070 1075 +5 Branches 216 216 ========================================== + Hits 1039 1044 +5 Misses 12 12 Partials 19 19 ``` | Flag | Coverage Δ | | |---|---|---| | unittests | `97.11% <100.00%> (+0.01%)` | :arrow_up: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tammo+Ippen#carryforward-flags-in-the-pull-request-comment) to find out more. | [Impacted Files](https://codecov.io/gh/tammoippen/plotille/pull/51?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tammo+Ippen) | Coverage Δ | | |---|---|---| | [plotille/\_\_init\_\_.py](https://codecov.io/gh/tammoippen/plotille/pull/51/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tammo+Ippen#diff-cGxvdGlsbGUvX19pbml0X18ucHk=) | `100.00% <100.00%> (ø)` | | | [plotille/\_graphs.py](https://codecov.io/gh/tammoippen/plotille/pull/51/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tammo+Ippen#diff-cGxvdGlsbGUvX2dyYXBocy5weQ==) | `87.67% <100.00%> (+0.90%)` | :arrow_up: | Help us with your feedback. Take ten seconds to tell us [how you rate us](https://about.codecov.io/nps?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tammo+Ippen). Have a feature suggestion? [Share it here.](https://app.codecov.io/gh/feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tammo+Ippen)

:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

tammoippen commented 1 year ago

Thank you for the PR. I will sort out the CI errors, they are not linked with your changes. 🎉

Kami commented 1 year ago

@tammoippen Thanks.

I also tried to get tests / CI to work under Python 3.11, but it turns out it's hard to to because the project still supports Python 2.7 and there are some backward incompatible changes in newer versions of pytest which are needed for newer Python versions.

Having said that, I see you dropped support for 2.7 (#52), so I will revisit that change and see if I can make it to work.

EDIT: Never mind, I just noticed you've already done that as part of #52 - thanks!