tidyverse / ggplot2

An implementation of the Grammar of Graphics in R
https://ggplot2.tidyverse.org

Modify fun.data in stat_summary for multiple imputed data #2219

Closed saudiwin closed 6 years ago

saudiwin commented 7 years ago

I am interested in using stat_summary with datasets that comprise separate runs of a multiple imputation algorithm. This seems like a natural fit for stat_summary's fun.data option, but stat_summary currently passes only a numeric vector to the fun.data function. To calculate the correct means and standard errors, the function also needs an ID variable identifying the imputations, because the standard errors have to distinguish within-imputation variance from between-imputation variance.

I am obviously willing to write the function, but I am looking for some advice on what I would have to do to get this to work with fun.data. Is it possible to pass in the index via a column attribute in the data frame without modifying any ggplot2 code? Or would it require some kind of patch?
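
For illustration, one way to get this kind of plot without touching ggplot2 might be to pool the imputations before plotting instead of inside fun.data. The sketch below is not from this thread: pool_mean and the simulated data frame imputed are hypothetical names. The pooling follows Rubin's rules (total variance = mean within-imputation variance + (1 + 1/m) × between-imputation variance), and the interval uses a normal approximation for simplicity.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2): pool per-imputation means
# with Rubin's rules. `y` is the outcome, `imp` the imputation ID per row.
pool_mean <- function(y, imp, conf = 0.95) {
  by_imp <- split(y, imp, drop = TRUE)
  q <- vapply(by_imp, mean, numeric(1))                             # estimate per imputation
  u <- vapply(by_imp, function(v) var(v) / length(v), numeric(1))   # squared SE per imputation
  m <- length(q)
  q_bar <- mean(q)                        # pooled estimate
  u_bar <- mean(u)                        # within-imputation variance
  b     <- var(q)                         # between-imputation variance
  total <- u_bar + (1 + 1 / m) * b        # total variance (Rubin's rules)
  # Assumption: normal approximation; a t reference distribution is more exact.
  z <- qnorm(1 - (1 - conf) / 2)
  data.frame(y = q_bar, ymin = q_bar - z * sqrt(total), ymax = q_bar + z * sqrt(total))
}

# Simulated stand-in for stacked imputed datasets: 5 imputations, 2 groups.
set.seed(1)
imputed <- expand.grid(obs = 1:20, group = c("a", "b"), imp = 1:5)
imputed$y <- rnorm(nrow(imputed), mean = ifelse(imputed$group == "a", 0, 1))

# Pool within each plotting group, keeping the imputation ID available.
pooled <- do.call(rbind, lapply(split(imputed, imputed$group), function(d) {
  data.frame(group = d$group[1], pool_mean(d$y, d$imp))
}))

# Plot the pooled summaries directly instead of going through stat_summary.
p <- ggplot(pooled, aes(group, y, ymin = ymin, ymax = ymax)) +
  geom_pointrange()
```

Because the pooling happens before the plot is built, the imputation ID never has to pass through fun.data; the trade-off is that the summary is no longer recomputed automatically when the plot's grouping or faceting changes.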

karawoo commented 7 years ago

I'm not sure I follow what you want to do – would you mind creating a minimal reprex (reproducible example) with sample data that shows what you'd like to accomplish? The goal of a reprex is to make it as easy as possible for me to recreate your problem so that I can fix it: please help me help you!

If you've never heard of a reprex before, start by reading "What is a reprex", and follow the advice further down the page.

hadley commented 6 years ago

I've closed this issue due to lack of requested reprex. If you still care about this bug, please open a new issue with a reprex.