lizzieinvancouver / grephon

figure path diagram thing #10

Closed lizzieinvancouver closed 1 week ago

lizzieinvancouver commented 1 year ago

@rdmanzanedo is starting on this... The goal is a figure that summarizes how studies have addressed this question while also showing the basic way we think they should address it. He will have an update next week, but he might very well want help figuring out how to get the info we need out of the table, and more ideas on how to make this work would be great.

rdmanzanedo commented 1 year ago

Dummy_figure

rdmanzanedo commented 1 year ago

ok, quickly put together a dummy figure. Idea is lines are proportional to number of studies (I think most of the lines would be direct references to single columns in the table but I may be wrong).

Histograms are fully dummy!

Keeping SOS/EOS separate is imperfect but common. In the text we can look at the proportion of studies that test the key connection (GSL-growth) compared with other definitions (SOS-growth, EOS-growth, exog.-growth, endog.-growth).

Colors in histograms are study types, with the idea of discussing systematic biases (e.g. tree-ring studies always treat exogenous-growth as if it were GSL-growth).

rdmanzanedo commented 1 year ago

if we want to highlight the way we think it should be done, we could highlight those arrows.

rdmanzanedo commented 1 year ago

If we agree on this design, collecting the data from the final table and building the actual figure would be quite fast afterwards. Opinions?

lizzieinvancouver commented 1 year ago

@rdmanzanedo This is awesome and I can't believe how fast you made it! I have only been spitballing about how to show the contrasting findings based on methods (e.g., annual cores never test GSL to growth) ... maybe have multiple arrows colored by method? We should NOT change anything yet, I am just sharing my thoughts.

@jannekehrl @kavs-P @AileneKane @alanaroseo @FrederikBaumgarten -- please send thoughts!

rdmanzanedo commented 1 year ago

@lizzieinvancouver Yes! That's a great idea, to see it by method. I would suggest a supplementary figure where we have one of these path diagrams per method, to see who is doing what (it would be too big for the main manuscript but useful for discussing).

alanaroseo commented 1 year ago

@rdmanzanedo Yasss, this is amazing! Agreed with Lizzie that another layer of contrast would be interesting to add. I don't understand why exogenous factors has 2 peripXprovenance bars in the plot, but otherwise it reads well.

rdmanzanedo commented 1 year ago

@alanaroseo yeah, sorry if I was unclear, the barplots are just placeholders that I made up. I was too lazy to write different bar labels so I just copy-pasted hahaha

lizzieinvancouver commented 12 months ago

Update from Monday's meeting:

We all liked the figure. Some points were that it might be interesting to include what the ways to measure GSL are (following Koerner paper), whether GS was measured via EOS, SOS, GSL, and maybe some other details. Consensus was to see if this makes the figure too busy, and stick with slimmed down if necessary. Or perhaps move the histograms to a side panel. Finally, given that different study types have different arrow lengths, one option is to have the figure broken down by study type but in the appendix.

rdmanzanedo commented 11 months ago

ok, let's start getting this data. We will essentially need to identify the relevant variable for each path diagram and then obtain the data. I played around with one of them (because I think that the grephontable.csv up there is not yet the latest version with everybody's changes, right?).

This figure requires 2 pieces of data in its simplest form: histograms and path values.

### 1) Histograms

These are the variables I consider relevant for them:

- Exogenous factors ===> ifyes_whichexternal
- Endogenous factors ===> ifyes_whichendogenous
- GSL ===> gsl_metric
- Growth ===> growth_metric

@lizzieinvancouver is that correct? @all what do we do with the very many variables (more info below)

rdmanzanedo commented 11 months ago

For example, for the exogenous, we would have this:

Exogenous1

Too many categories. I would suggest we simplify them into major categories (e.g. temperature, precipitation, day length, CO2, soil moisture, and some others) and then assign each original label a 'composition vector' over those categories. Given the size of the data, this is probably easiest done by hand, especially if we divide the work. It would look something like this:

| Original exogenous | Temperature | Precipitation | DayLength | CO2 | ... |
| -- | -- | -- | -- | -- | -- |
| spring temperature | 1 | 0 | 0 | 0 | ... |
| latitude | 0 | 0 | 0 | 0 | ... |
| VPD,SWP,Temp,Precip | 1 | 1 | 0 | 0 | ... |

@All does that sound reasonable? I'm sure there are more elegant ways of doing this automatically, but I think the variability of the labels and the small number of data points to correct make it easier to just do it by hand for these 4 variables (I think gsl and growth are way more controlled in their input).
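For comparison with the by-hand route, here is a minimal sketch of how a first pass at the composition vectors could be generated with keyword matching. The keyword patterns and the names `category_patterns` and `composition_vectors` are assumptions made up for illustration, not an agreed category list, and the output would still need checking by hand because labels vary so much.

```r
# Hypothetical keyword patterns per major category (assumptions, not an agreed list)
category_patterns <- c(
  Temperature   = "temp",
  Precipitation = "precip|rain",
  DayLength     = "day ?length|photoperiod",
  CO2           = "co2"
)

# Build a 0/1 matrix: one row per original label, one column per category
composition_vectors <- function(labels, patterns = category_patterns) {
  hits <- sapply(patterns, function(p) {
    as.integer(grepl(p, labels, ignore.case = TRUE))
  })
  # Force a matrix even when only one label is supplied
  matrix(hits, nrow = length(labels),
         dimnames = list(labels, names(patterns)))
}

# The three example labels from the table in this thread
example_labels <- c("spring temperature", "latitude", "VPD,SWP,Temp,Precip")
comp <- composition_vectors(example_labels)
print(comp)
```

Labels that match nothing (like latitude here) come out as all zeros, which makes them easy to pull out for manual assignment.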

rdmanzanedo commented 11 months ago

### 2) Paths

First, for each variable, we will add the number of studies that considered it. E.g. for exogenous factors, by this simple calculation:

    # exogenous, called 'external factors' in the table
    # % that looked at external factors
    ext.prop = data.frame(table(grephontable$authorslooked_externalfactors))
    ext.total.prop = ext.prop$Freq[ext.prop$Var1=='no'] / sum(ext.prop$Freq)
    # 21% DID NOT look at any external factor, 12/56 (44 DID, to some extent)

Then for the path widths, 7 paths are necessary:

- Exo-GSL - authorslooked_externalfactors == "all the yes with GSL in the title" (example below)
- Exo-Growth - authorslooked_externalfactors == "all the yes with Growth in the title"
- Endo-GSL - authorslooked_endogenousfactors == "all the yes with GSL in the title"
- Endo-Growth - authorslooked_endogenousfactors == "all the yes with Growth in the title"
- GSL-Growth - ourdefinition_evidence_gslxgrowth == 'yes' + 'no' + 'negative relationship'
- Exo-SOS/EOS - gs_metric_used == 'start metric only' + 'end metric only' AND authorslooked_externalfactors == "all the yes with GSL in the title"
- SOS/EOS-Growth - gs_metric_used == 'start metric only' + 'end metric only' AND authorsthink_evidence_gsxgrowth == 'yes'
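The "all the yes with GSL in the title" conditions could be written as pattern matches rather than long lists of exact labels. This is only a sketch: the `toy` data frame is invented for illustration, `count_path` is a hypothetical helper, and the patterns assume GSL is spelled either "gsl" or "length of growing season" in the table.

```r
# Toy stand-in for grephontable; these rows are invented for illustration only
toy <- data.frame(
  authorslooked_externalfactors = c("no", "yes - gsl", "yes - growth",
                                    "yes - gsl and growth",
                                    "yes - length of growing season"),
  stringsAsFactors = FALSE
)

# Count answers that start with "yes" and also mention the target variable
count_path <- function(x, target_pattern) {
  is_yes <- grepl("^yes", x, ignore.case = TRUE)
  sum(is_yes & grepl(target_pattern, x, ignore.case = TRUE))
}

gsl_pattern    <- "gsl|length of growing season"  # assumed spellings of GSL
growth_pattern <- "growth"

exo_to_gsl    <- count_path(toy$authorslooked_externalfactors, gsl_pattern)
exo_to_growth <- count_path(toy$authorslooked_externalfactors, growth_pattern)
print(c(exo_to_gsl = exo_to_gsl, exo_to_growth = exo_to_growth))
```

The same helper would apply to authorslooked_endogenousfactors for the Endo paths, and it can be combined with a logical filter on gs_metric_used for the SOS/EOS ones.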

rdmanzanedo commented 11 months ago

(I am less sure of the adequacy of the SOS/EOS ones, we should quickly discuss them.) I also realize now that all the GSL-involving ones (the first 5) should be restricted to gs_metric_used of whole GSL (multiple).

rdmanzanedo commented 11 months ago

quick example: with Exo-GSL

    # which ones
    ext.id = subset(grephontable, grephontable$authorslooked_externalfactors != 'no')
    ext.id.table = sort(table(ext.id$authorslooked_externalfactors))

    names(ext.id.table)
    par(mar = c(17,3,3,1))
    # 'colors' is assumed to be a vector of at least 3 colours defined earlier
    barplot(ext.id.table, las=2,
            col=c(colors[1], colors[1], colors[3], colors[3], colors[3], colors[2], colors[1]),
            border=0, ylim=c(0,15), ylab='Frequency',
            main='Exogenous: on which variable')
    abline(h=0)

Exo-GSL

    ### this is interesting, so the value that informs Exogenous to GSL is
    ### 'yes - gsl', 'yes - length of growing season', 'yes- growth and...'
    path.exogenous.to.gsl = sum(ext.prop$Freq[
      ext.prop$Var1 == 'yes - gsl' |
      ext.prop$Var1 == 'yes - growth, yes - length of growing season' |
      ext.prop$Var1 == 'yes - gsl and growth' |
      ext.prop$Var1 == 'yes - length of growing season'])

    path.exogenous.to.growth = sum(ext.prop$Freq[
      ext.prop$Var1 == 'yes - growth (NPP)' |
      ext.prop$Var1 == 'yes - growth, yes - length of growing season' |
      ext.prop$Var1 == 'yes - gsl and growth' |
      ext.prop$Var1 == 'yes - length of growing season' |
      ext.prop$Var1 == 'yes - growth'])

So, the width of the Exo-GSL path with all studies would be 13 and the Exo-Growth would be 27 (we will likely express it as a proportion of the total number).
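To make the proportion idea concrete, here is a small sketch using the two counts above and the total of 56 studies implied by the 12/56 figure earlier in the thread. The helper `to_lwd` and its minimum and maximum widths are arbitrary choices for illustration, not a decided scaling.

```r
n_total <- 56  # total studies, taken from the 12/56 figure quoted in this thread

# Path counts reported for all studies
path_counts <- c(exo_gsl = 13, exo_growth = 27)

# Express each path as a proportion of all studies
path_prop <- path_counts / n_total

# Map a proportion in [0, 1] to an arrow line width (range is an arbitrary choice)
to_lwd <- function(p, min_lwd = 1, max_lwd = 10) {
  min_lwd + p * (max_lwd - min_lwd)
}

path_lwd <- to_lwd(path_prop)
print(round(path_prop, 2))  # exo_gsl 0.23, exo_growth 0.48
print(round(path_lwd, 2))
```

The resulting values could be passed as the `lwd` argument of base R `arrows()` when drawing each path.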

once we have this, it is easy to repeat the values for different disciplines and for those with multiple species or across continents.
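Repeating a path value per discipline might look like the sketch below. The `toy` rows and study type names are invented, and the GSL pattern assumes the same label spellings as in the table values quoted above; `study_type` is the grouping column mentioned in this thread.

```r
# Toy stand-in for grephontable; rows and study types are invented for illustration
toy <- data.frame(
  study_type = c("tree ring", "tree ring", "remote sensing",
                 "remote sensing", "experiment"),
  authorslooked_externalfactors = c("yes - growth", "no", "yes - gsl",
                                    "yes - gsl and growth", "yes - growth"),
  stringsAsFactors = FALSE
)

# Flag studies that contribute to the Exo-GSL path (assumed label spellings)
is_exo_gsl <- grepl("gsl|length of growing season",
                    toy$authorslooked_externalfactors, ignore.case = TRUE)

# Count and proportion of Exo-GSL studies within each study type
exo_gsl_n    <- tapply(is_exo_gsl, toy$study_type, sum)
exo_gsl_prop <- tapply(is_exo_gsl, toy$study_type, mean)
print(exo_gsl_n)
print(exo_gsl_prop)
```

Swapping `toy$study_type` for a binned number-of-species or continent column would give the other breakdowns without changing the rest.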

rdmanzanedo commented 11 months ago

(obviously merging there would be good, but I think that is something we are going to have to do anyway, forgot to say)

Therefore, key points for this:

- Agree on the variables for the histograms
- Agree on the variables and conditions for the paths
- Decide on how much to merge categories (particularly for histograms) and whether to do it by hand or somebody has a way of doing this automatically in a reliable way.
- Required figures to build:
  + One path analysis for all of our reviews
  + One figure with parallel path diagrams per study_type
  + Bins of number of species or replication size? (is this of interest?)

lizzieinvancouver commented 11 months ago

Three messages here:

  1. Many variables for GSL -> growth
  2. Different fields are measuring this very differently
  3. Endogenous and exogenous are capturing really different variables.

FrederikBaumgarten commented 11 months ago

I did the graph I had in mind. perhaps not directly useful for the path diagram but I kind of like it. scheme_external_internal_factors.pdf

lizzieinvancouver commented 11 months ago

Here's my thoughts:

gsltogrowth_emw

lizzieinvancouver commented 11 months ago

And re-posting @FrederikBaumgarten so easier to see without downloading: scheme_external_internal_factors

lizzieinvancouver commented 11 months ago

And from @rdmanzanedo and @FrederikBaumgarten: gsltogrowthfilers

lizzieinvancouver commented 11 months ago

Chatting with @rdmanzanedo I think he's right that we want to make sure we have a good figure that shows some of the main results from the table. I like the idea of it being an exploded version of a simple conceptual figure currently.

lizzieinvancouver commented 11 months ago

Few more notes from Monday meeting:

lizzieinvancouver commented 1 week ago

I loved this figure, but we ended up instead with heat maps (#41) and the hypothesis figure (#29).