plotly / plotly.js

Open-source JavaScript charting library behind Plotly and Dash
https://plotly.com/javascript/
MIT License
17.03k stars 1.87k forks

wrong range in hover info for basic histogram #5848

Open bklingen opened 3 years ago

bklingen commented 3 years ago

The hover information on the bin range shown for this basic histogram is rather misleading:

[image: screenshot of the histogram with its hover label]

I would have expected a range of "50 - 100" shown for the second bin. (The same mislabeling occurs for all other bins.)

Codepen:

https://codepen.io/bklingenberg/pen/zYwdaOq?editors=0011

(This may be related to the discussions in #2113.)

alexcjohnson commented 3 years ago

This is all in service of greater clarity at the bin edges. To be precise, what's happening here is two things:

What we really DON'T want to do is have labels 50-100 and 100-150, because then it's ambiguous which bin a value of exactly 100 goes into. But you could perhaps argue that the bin shift should match the range shrinkage: i.e., because we shifted the bins by exactly 0.5 here, we should also shrink the ranges we report by exactly 0.5 on each side, to 50-99; or, if we want to keep 50-90, we should shift the bins by 5.
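To make the edge ambiguity concrete, here is a minimal sketch of half-open binning with and without a 0.5 shift. The helpers `binIndex` and `binEdges` are hypothetical names made up for illustration; this is not how plotly.js implements binning internally.

```javascript
// Hypothetical helpers for illustration only, not plotly.js internals.
// Bin i is half-open: it covers [start + i*size, start + (i+1)*size).
function binIndex(value, start, size) {
  return Math.floor((value - start) / size);
}

function binEdges(index, start, size) {
  const lo = start + index * size;
  return [lo, lo + size];
}

// Unshifted bins (start 0, size 50): 100 sits exactly on an edge and
// the half-open rule puts it in the upper bin.
const plain = binEdges(binIndex(100, 0, 50), 0, 50);

// Bins shifted by 0.5 (start -0.5): no integer can land on an edge.
const shifted = binEdges(binIndex(100, -0.5, 50), -0.5, 50);

console.log(plain);   // [ 100, 150 ]
console.log(shifted); // [ 99.5, 149.5 ]
```

With the shift, every integer value falls strictly inside a bin, which is why labels such as 50-99 are unambiguous for integer data.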

nicolaskruchten commented 3 years ago

I could also see us listing the bounds of the data in each group, or using [50-100) style notation or even >=50 & <100

bklingen commented 3 years ago

Thanks for commenting! I actually think labels 50-100 and 100-150 are not too ambiguous, and to me seem better than the current 50-90, 100-140, 150-190, etc. which indicate gaps. I interpret the hover information as giving me the range of the bin (and the count of observations falling in that bin). This is especially valuable when the x-axis tickmarks are not set to coincide with the boundaries of the bins, as is often the case. Then, I really would like to know the lower and upper limit of the bin I'm hovering over.

I don't think giving the range of the observations (e.g., 50-90) falling in the bin of 50-100 is as useful. It is the lower and upper bound of the bin that is the interesting information.

To me, the optimal solution is [50,100) for the default half-open intervals that plotly forms. (Or 50 - 99, where the precision can be set with hoverformat. I.e., hoverformat = ".2f" would yield 50.00-99.99.)
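As a rough sketch of the two label styles proposed here, the helpers below format a half-open bin either as `[50, 100)` or as an inclusive range at a chosen precision. The function names are hypothetical, and `toFixed` is only a stand-in for a `hoverformat` such as ".2f"; this is not an existing plotly.js option.

```javascript
// Hypothetical label formatters for a half-open bin [lo, hi).
// Assumption: `decimals` plays the role of a ".Nf" hoverformat.
function halfOpenLabel(lo, hi) {
  return `[${lo}, ${hi})`;
}

function inclusiveLabel(lo, hi, decimals) {
  // Largest value representable at this precision that is still below hi.
  const step = Math.pow(10, -decimals);
  return `${lo.toFixed(decimals)} - ${(hi - step).toFixed(decimals)}`;
}

console.log(halfOpenLabel(50, 100));     // [50, 100)
console.log(inclusiveLabel(50, 100, 0)); // 50 - 99
console.log(inclusiveLabel(50, 100, 2)); // 50.00 - 99.99
```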

nicolaskruchten commented 3 years ago

Right, so having thought about this more, as Alex says, the bin bounds aren't actually 50-100 here, they're 49.5-99.5 (due to the smart behaviour around integers). We can maybe debate later if this is a good idea, but at the very least we should frame our discussion around the actual bounds :) I think then that for this specific chart the hover should be [49.5, 99.5]

I should note that forcing the bins to 50-100 (verified by zoom!) with xbins: {start: 0, size: 50} still gives a hover label of 50-90 or 50-99 depending on the largest data value in that range and whether or not they're all integers, which seems like excessive coupling, and is also misleading.
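For anyone wanting to reproduce this, a minimal trace with forced bins might look like the sketch below. The sample data and the `'chart'` element id are assumptions (the data in the Codepen is not reproduced here); `xbins.start` and `xbins.size` are the attributes mentioned above.

```javascript
// Assumed sample data: integers that are all multiples of 10.
const x = [10, 20, 40, 50, 60, 70, 90, 100, 110, 140, 150, 180];

const trace = {
  type: 'histogram',
  x: x,
  // Force bin edges at 0, 50, 100, ... instead of the automatic bins.
  xbins: { start: 0, size: 50 }
};

const layout = { title: { text: 'Histogram with forced 50-wide bins' } };

// Only draw when running in a browser page that has loaded plotly.js
// and contains an element with the (hypothetical) id "chart".
if (typeof Plotly !== 'undefined' && typeof document !== 'undefined') {
  Plotly.newPlot('chart', [trace], layout);
}
```

Hovering the second bar then shows whatever range label plotly.js derives for that bin, which can be compared against the forced 50-100 edges.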

> But listing exactly those values in the hover label would be confusing: what are half-integer values doing in a label for integer data?

I don't think that's confusing, personally, I think it's clarifying :)

bklingen commented 2 years ago

Hi,

I'm just trying to bring this issue/bug up again, as having correct hover information on the length of the bins would be a great enhancement for histograms.