Kitware / candela

Visualization components for the web
https://candela.readthedocs.io
Apache License 2.0

Add a functional boxplot candela component #234

Open alex-r-bigelow opened 8 years ago

alex-r-bigelow commented 8 years ago

Ideally, we'd assume that these would be precomputed and already part of the dataset. But that would mean we need a mechanism in the candela API to associate secondary attributes with a primary one (e.g. set one data column to be the mean, and another to be its 95% confidence interval).

Or maybe it would be better to create an entirely separate functional boxplot component that only draws one line, as opposed to the line chart component that shows an arbitrary number of lines (but with no secondary attributes)?

waxlamp commented 8 years ago

I want to be careful about stuffing too many features into what should be a simple component. "Line chart" right now does more or less what is written on the tin. We should think carefully about where a line chart with error bars or line chart with confidence interval ought to live.

alex-r-bigelow commented 8 years ago

Agreed. I'd lean toward creating a separate functional boxplot component for a single line (not just because it's easier, but from a vis perspective, a line with that much detail should probably have its own distinct chart anyway).

But I'm wondering if this is symptomatic of a deeper problem. We've run across similar questions with lat/lon stuff, where two attributes have to go together, but we might want to accept multiple pairs of attributes.

I'll close this in favor of #175 to preserve the link in case this comes up again.

waxlamp commented 8 years ago

Did this feature request come from a user? If so, we should keep it open (possibly renaming it to ask for a new line chart type that can handle quartiles, etc.). I don't think what this issue is asking for is the same thing as #175.

waxlamp commented 8 years ago

I learned the term "functional boxplot" today. Excellent! :mortar_board: :mortar_board: :mortar_board:

alex-r-bigelow commented 8 years ago

The new component should accept one (?) attribute for the center curve (e.g. mean, median), n pairs of attributes as envelopes (e.g. lower / upper bounds of confidence intervals), and n attributes as outlier curves.

Or if we decide on a different standard for things like lat/lon coordinates (e.g. accept a single number tuple attribute instead of two number attributes), the n pairs should take a similar approach.
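To make the proposed shape concrete, here is a minimal sketch of what such an options object could look like, written in Python for brevity. Every name here (`FunctionalBoxplotOptions`, `x`, `center`, `envelopes`, `outliers`, `missing_columns`) is hypothetical and not part of the candela API; the sketch only assumes that the data arrives as a list of row dictionaries.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class FunctionalBoxplotOptions:
    """Hypothetical options for a single-line functional boxplot."""
    x: str                      # column used for the horizontal axis
    center: str                 # one column for the center curve (mean, median, ...)
    # n (lower, upper) column pairs, one pair per envelope
    envelopes: List[Tuple[str, str]] = field(default_factory=list)
    # n columns, each drawn as its own outlier curve
    outliers: List[str] = field(default_factory=list)

    def required_columns(self) -> List[str]:
        """List every column the component would need, in declaration order."""
        cols = [self.x, self.center]
        for lower, upper in self.envelopes:
            cols.extend([lower, upper])
        cols.extend(self.outliers)
        return cols


def missing_columns(options: FunctionalBoxplotOptions,
                    rows: List[Dict[str, Any]]) -> List[str]:
    """Return the required columns that appear in none of the rows."""
    present = set()
    for row in rows:
        present.update(row.keys())
    return [c for c in options.required_columns() if c not in present]
```

For example, `FunctionalBoxplotOptions(x="t", center="mean", envelopes=[("ci_lo", "ci_hi")])` describes one center curve with a single envelope. If the tuple-attribute convention were adopted instead, each envelope entry would shrink to a single column name.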

jeffbaumes commented 8 years ago

@alex-r-bigelow Using the same approach as geo locations depends on how these confidence pairs are normally encoded "in the wild" and whether that is similar to how locations are encoded. I could also see Candela or Resonant Lab being able to compute the confidence intervals from raw data. I assume they are often computed specifically for the visualization in other tools like R.
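As a rough illustration of what "compute the confidence intervals from raw data" could involve, the sketch below groups raw `(x, y)` samples by `x` and derives a mean plus a normal-approximation interval for each group. The function name `summarize` and the output column names `mean`, `ci_lo`, and `ci_hi` are assumptions made for this example, and the normal approximation is only one of several reasonable choices (a t-based interval would be wider for small groups).

```python
import math
import statistics
from collections import defaultdict


def summarize(samples, confidence=0.95):
    """Group (x, y) samples by x; return one row per x with mean and CI bounds."""
    groups = defaultdict(list)
    for x, y in samples:
        groups[x].append(y)

    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)

    rows = []
    for x in sorted(groups):
        ys = groups[x]
        mean = statistics.fmean(ys)
        if len(ys) > 1:
            # Half-width = z * standard error of the mean.
            half = z * statistics.stdev(ys) / math.sqrt(len(ys))
        else:
            # A single sample has no spread estimate; collapse the interval.
            half = 0.0
        rows.append({"x": x, "mean": mean,
                     "ci_lo": mean - half, "ci_hi": mean + half})
    return rows
```

The output is already in the "precomputed columns" shape discussed earlier in the thread, so the same component could consume it whether the intervals were derived here or in an external tool.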

alex-r-bigelow commented 8 years ago

True, we could calculate confidence intervals ourselves... but I tend to want to punt on any kind of derived data. I'd rather let the user do it in another tool where they'll have more control. I think we're running with the assumption in Resonant Lab that we can set expectations for the shape of the data, because we're planning on integrating with existing wrangling tools and / or writing our own.

I agree we should try to figure out the shape that these things take most frequently in the wild, and use that. But someone usually has data that doesn't fit the format that we (or most people) expect and support, so reshaping is almost always going to be necessary anyway. I don't think it's worth trying to anticipate too much.
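The reshaping between the two encodings discussed in this thread (two number attributes versus one tuple attribute) is small enough to sketch. The helpers below are hypothetical, not part of candela or Resonant Lab, and assume the data is a list of row dictionaries.

```python
def pack_pair(rows, lower, upper, name):
    """Replace two number columns with one (lower, upper) tuple column."""
    packed = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k not in (lower, upper)}
        new_row[name] = (row[lower], row[upper])
        packed.append(new_row)
    return packed


def unpack_pair(rows, name, lower, upper):
    """Split one tuple column back into two number columns."""
    unpacked = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != name}
        new_row[lower], new_row[upper] = row[name]
        unpacked.append(new_row)
    return unpacked
```

Because the conversion is mechanical in both directions, whichever encoding turns out to be more common in the wild could be the supported one, with the other handled in a wrangling step.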