Closed lizgehret closed 1 month ago
@lizgehret this is great! I will be interested to test this once it is ready! (just tried but could not get it to work with real data; happy to share an error log if this is unexpected but I assume I am just jumping the gun 😁 )
I think that a drop-down for `facet_by` would be useful. Often when plotting data, users will want to look at different groupings. To give some concrete examples: with environmental/soil data like the EMP data, users might want to look at distributions at different EMPO levels (i.e., different types and subtypes of ecosystems); in human data (e.g., HMP), maybe different body sites and subtypes, or patient categories; in the PD mouse dataset or similarly structured data, they might want to look at multiple categories in the metadata like "host", "donor", and "treatment". Having all of this in a single plot would be convenient, though alternatively there could be multiple plots displayed instead of a drop-down, and the user could input a list of categorical column names to `facet_by`.
I think that a drop-down for the numeric measure (`distribution`) would be useful. E.g., if plotting alpha diversity per group, a user might want to toggle between multiple metrics (also for beta diversity, e.g., distributions of pairwise distances). Alternatively, there could be multiple plots displayed in the viz, one per measure selected (`distribution` could accept a list), but I like the dropdown.
I suggest making percentile the default for whiskers, but it is always a matter of taste and both are common.
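For reference, here is a rough sketch of how the whisker options differ. It is only an illustration, not the code in this PR: the function names are hypothetical, the method names follow the ones used in this thread (`percentile`, `minmax`, `tukeys_iqr`), and the 9th/91st percentile cutoffs are an assumption.

```python
from typing import Sequence, Tuple


def percentile(sorted_vals: Sequence[float], p: float) -> float:
    """Linear-interpolation percentile (p in [0, 100]) of pre-sorted values."""
    rank = (len(sorted_vals) - 1) * p / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (rank - lo)


def whisker_bounds(values: Sequence[float],
                   method: str = "percentile") -> Tuple[float, float]:
    """Return (lower, upper) whisker ends for one group of values.

    Hypothetical helper; the cutoffs below are assumptions for illustration.
    """
    if not values:
        raise ValueError("need at least one value")
    vals = sorted(values)

    if method == "minmax":
        # Whiskers span the full range of the data.
        return vals[0], vals[-1]

    if method == "percentile":
        # Assumed cutoffs: 9th and 91st percentiles.
        return percentile(vals, 9), percentile(vals, 91)

    if method == "tukeys_iqr":
        # Whiskers end at the most extreme points within 1.5 * IQR of the box.
        q1, q3 = percentile(vals, 25), percentile(vals, 75)
        fence = 1.5 * (q3 - q1)
        lower = min(v for v in vals if v >= q1 - fence)
        upper = max(v for v in vals if v <= q3 + fence)
        return lower, upper

    raise ValueError(f"unknown whisker method: {method!r}")
```

One thing worth keeping in mind when eyeballing results: with the Tukey rule the whiskers only differ from min/max when at least one point falls outside the 1.5 * IQR fences, so on data without outliers the two methods draw the same plot.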
Thanks for the feedback @nbokulich! I will definitely let you know once this is ready for a test drive - I've still yet to fill in the vega spec 😅 Here are some design updates after a discussion with @ebolyen this morning:
Now that things are a bit more fleshed out, I'm going to start working on the actual spec. Should be in a working state sometime next week!
The transpose signal is most likely getting punted to v2 because vega doesn't like swapping axes of differing types: https://github.com/vega/vega/issues/1176
note to myself: also need to add the same `legend[data]` hack if `group_by` field is none
This is not quite finished but is now ready for some test driving! 🚘 cc @ebolyen wanna take a look and lmk if there's anything that could be improved/changed? I'll follow up on any requested changes when I'm back next week 🙂
outstanding to-do's:
- `tukeys_iqr` (results currently look identical to minmax)
- `box_alignment` param for either vertical or horizontal box alignment
- `whisker_range` method
Okay, this is finally ready for review @ebolyen!
I may have gone a little overboard with the test suite... but I wanted to make sure all of the visual elements were being tested, as well as the actual stats calculations. I think I've been staring at this for too long, so if anything doesn't look reasonable or there's a better way to organize things, let me know!
Copying over my proposed design from basecamp for visibility: