Documentation/function to make an orchard plot like a forest plot

befriendabacterium commented 1 year ago

Is your feature request related to a problem? Please describe. It may sound like an odd one, but I'm wanting to create an orchard plot that substitutes for a forest plot i.e. an information-richer plot with per-study estimates (but individual effects within studies shown) and an overall estimate at the bottom.

Describe the solution you'd like I imagine this is possible in orchaRd already, but I haven't found specific documentation or a function to guide me on how to do this, which I'd like.

Describe alternatives you've considered My workaround at the moment is to create an intercept-only random effects model (model 1) and a random effects model with 'study' as moderator (model 2), and then plot the two alongside one another in a similar way to a forest plot. However, I'm not sure if this is appropriate. Metafor::forest takes a random effects model without moderators and produces a forest plot with per study estimates and overall estimates - is there a way to do this in metafor.

Additional context NA

Thanks!

befriendabacterium commented 1 year ago

Just sharing this link for extra context - seeking to do something like this but via orchard_plot. https://www.metafor-project.org/doku.php/tips:forest_plot_with_aggregated_values?s[]=forest&s[]=plot&s[]=aggregated

itchyshin commented 1 year ago

@befriendabacterium

I guess you can run a model with study_ID as a moderator - mods = ~ study_ID - then

orchard_plot(model,
             mod = "study_ID",
             group = "cluster",
             xlab = "Effecti size",
             cb = F, # this is important
             angle = 45)

Screenshot 2023-07-03 at 5 30 10 pm

I will chat with @daniel1noble to put more documentations etc

itchyshin commented 1 year ago

@daniel1noble - I also remembered that we probably need to think how to do with group when there is not clustering (I guess you can put obs_ID?) but this may not exisit when people use rma.uni not rma.mv

befriendabacterium commented 1 year ago

Morning Shinichi,

Thanks for helping me break the mental deadlock on this one, I've been mulling it for a while!

Yes, creating a model with study_ID as moderator was exactly the workaround I used, so glad to have some external validation for it! I paired this with another intercept-only model presented in an adjacent panel, to give something that attempts to be equivalent to a forest plot (see attached image, x labels cropped off as is not yet published).

However, two questions/thoughts emerge from this:

I'm wondering whether this produces something that is actually equivalent to your standard forest plot? Perhaps it is more of a statistical question, but are those per study estimates equivalent to the estimates from the intercept-only model produced by Wolfgang's aggregation method in the link above (link again for clarity: https://www.metafor-project.org/doku.php/tips:forest_plot_with_aggregated_values?s%5B%5D=forest&s%5B%5D=plot&s%5B%5D=aggregated)?
Ideally - which I think is what you're getting at in your second comment - rather than having to run two models, you'd be able to run orchard_plot() on a single model that only has an intercept (I think this is the same as no clustering?), producing a forest-like plot with per-study estimates and an overall/pooled effect (for now maybe just on an adjacent panel like mine, longer term on the same plot). This would seem to give me greater confidence that i'm showing the right things, and ease the write-up of the results.

Thank you both again for your hard work on this great package and engagement with my questions. If orchard_plot could be adapted to more easily produce a suped-up forest plot, I think it could take off even more than it has already and even supersede the forest plot hah! Happy to help in any way I can, even if you just want to have a quick virtual coffee about it!

befriendabacterium commented 1 year ago

Also, regarding your suggestion of running a study with model as moderator/point 1 in my comment above, would that model also include multiple within study estimates as a random effect? In other words, assuming someone is trying to produce a 'forest plot', should the model be:

english_MA1 <- rma.mv(yi = SMD, V = vSMD, mods =  studyID, random = list(~1 | StudyNo, ~1 | EffectID), data = english)

or should it be:

english_MA2 <- rma.mv(yi = SMD, V = vSMD, mods =  studyID,  data = english)

Thanks!

itchyshin commented 1 year ago

@befriendabacterium

Just a quick reply to the last one.

The first model will incorporate some of aggregation within studies etc into accout while the second one won't so the letter is closer but less accurate. I guess you also want to use VCV for V via vcalc

I will chat with @daniel1noble next week about your other questions and get back to you (your Point 2, I guess I kinda addressed Point 1 already).

befriendabacterium commented 1 year ago

Thanks Shinichi that makes a lot more sense worded like that - so 'english_MA1' is closer to Wolfgang's aggregated forest plot accounting for autocorrelation within studies - though I should use his method to try to replicate it more closely), whilst 'english_MA2' is closer to your 'standard' (less accurate) forest plot. So the former seems more appropriate.

I've tried to recreate Wolfgang's aggregated estimates using the 'study as moderator' method, but the results are slightly different. Apologies for the janky code but here is my best attempt, which is probably wrong:

dat <- metadat::dat.assink2016
### turn 'dat' into an 'escalc' object (and add study labels)
dat <- escalc(measure="SMD", yi=yi, vi=vi, data=dat, slab=paste("Study", study, " Estimate", esid))
dat$study<-as.factor(dat$study)

# METHOD 1: MOD+RANDOM ----------------------------------------------------

# set study to factor to avoid confusing the model
# dat$study<-as.factor(dat$study)

# run model using 'study as moderator' method
res1 <- metafor::rma.mv(yi, vi, mods=~study, random = ~ 1 | study/esid, data=dat, digits=3, rho=0.6)
#extract coefficients, converting to absolute estimates
res1_coeffs<-c(res1$b[1],res1$b[1]+res1$b[-1])

# METHOD 2: AGGREGATE+RANDOM ----------------------------------------------------

### assume that the effect sizes within studies are correlated with rho=0.6
V <- vcalc(vi, cluster=study, obs=esid, data=dat, rho=0.6)

### fit multilevel model using this approximate V matrix
res <- rma.mv(yi, V, random = ~ 1 | study/esid, data=dat, digits=4)
res

#aggregate effect sizes based on var-covar matrix
agg <- aggregate(dat, cluster=study, V=vcov(res, type="obs"), addk=TRUE)
#select columns of interest
agg <- agg[c(1,4,5,9)]
#extract coefficients
agg_coeffs<-agg$yi

#run model
res2 <- rma(yi, vi, mods=~study, method="EE", data=agg, digits=3)
res2

# COMPARE ESTIMATES FROM TWO METHODS --------------------------------------

res1_coeffs
agg_coeffs

plot(res1_coeffs,agg_coeffs)
abline(0,1)

Should the estimates from the two methods differ? Am I missing something? Sorry for the flurry of questions but hopefully this is at least partially helpful to you if you're thinking of adapting orchard_plot() to work where there are no clusters present in a model. Thanks again.

befriendabacterium commented 11 months ago

Hi folks,

Hope you're well. Let me know if you need a hand with this as know i've loaded you with a lot here and happy to help out if you would like especially as i have a vested interest in the solution!

Cheers Matt

itchyshin commented 11 months ago

@befriendabacterium - thanks for a reminder. @daniel1noble and I are in the middle of teaching so we have not had a chance to plan for changes - we will have more time in a month or so - probably more like Sept.

befriendabacterium commented 11 months ago

No worries - was just checking whether you were swamped as I suspected you might be, and therefore whether I could (try to) lend a hand. I'll try to have a crack at fleshing out some kind of solution on my fork of the repo this week, and make a pull request if I get anywhere near one!

Best of luck with the rest of the teaching both, appreciate all you do for the meta-analysis community!

Cheers Matt

itchyshin commented 11 months ago

@befriendabacterium - much appreciated. As you pointed out, doing orchard plots per study can have different options - fixed-effects, random-effects and other types of aggregations -but we can already do these things (and these are more like stats questions rather than graphical stuff) but we do not have good descriptions in our Vignette so @daniel1noble, and I plan to add these explanations in due course.

daniel1noble commented 11 months ago

Sorry guys, I replied too soon before as I just seen the initial post and the details that followed clarified a few things. I think this is all easily addressed with some updated documentation like @itchyshin says because aggregating on study should be fairly easy to do.

befriendabacterium commented 11 months ago

Hey folks sorry for the slowness on this. I've had time to digest your comments and think more about what a forest/caterpillar plots are set up to represent and I see your points much more clearly now. I see now that:

If a random effects model with only one effect size per study (perhaps gotten through aggregation), then this should be a forest plot or an orchard plot with no groups/intercept-only.
If a random effects model with multiple effect sizes per study, this should be a caterpillar plot, or an orchard plot with no groups/intercept only. Will be noisier than above perhaps necessitating colouring by study more.
If a mixed effects model with a categorical moderator, this should be an orchard plot and is where to plot truly shines.

i.e. the forest/caterpillar/orchard plots' role is essentially to show the direct inputs and outputs of the model, and it is therefore misleading to display group estimates where they did not inform the model (hence why I had to grab them from different models!)

So I think we're pretty much on the same page now and perhaps nothing needs doing (though I'm not sure exactly what is the needed extra documentation you're referring to). The only thing I would maybe suggest is for situation 2, you can have an option for something which which is a bit of a halfway house between an orchard and a caterpillar plot, where alongside the intercept group, you also display the 'fruits' grouped by study but without a per-study effect estimate (since there is none in this model). So fruits with no tree/that have fallen on the ground, if you like - but maybe i've taken the metaphor too far here lol. But I'll leave you to decide whether that's something that would be a helpful visualisation/that you want to include in the package.

Aware that you want to do more documentation of some sort but from my end this is resolved and I would like to do you both the favour of closing an issue rather than opening one. So I'm gonna close this but feel free to reopen if you want!

Thanks both for your insights!

daniel1noble / orchaRd

Documentation/function to make an orchard plot like a forest plot #35