collaboratively summing up and framing the INLA results in the paper

HedvigS commented 2 years ago

I would like input from @SamPassmore and @rdinnager on framing and summing up the INLA results in the actual paper. I hope we can discuss this in writing here in GitHub issue threads. I'd like to tackle one thing at a time in order so that we don't get overwhelmed by threads and messages.

We have two sections to write:

1. the shorter prose section in the main body of the article where ideally we write something like

To go beyond these visual impressions of both genealogical and geographic signals in the data [seen in the PCA scatterplots] we modeled the spatiophylogenetic effects using a new approach from biology, approximate Bayesian inference for Latent Gaussian Models (Dinnage et al 2020). This allows us to model the spatial and phylogenetic effects jointly and estimate their respective effects. We find that overall, the phylogenetic effect is stronger than the spatial - suggesting that grammar is more of an inherited property of language than one that disseminates spatially. Our features pertain to different domains of grammar (clausal, nominal, pronominal and verbal) and it is conceivable that some of these domains differ from each other in terms of the spatiophylogenetic effects. However, we do not find a difference in our results between these domains.

2. the supplementary material section where we go more in-depth into the methods and also discuss robustness testing such as the simulations and the five different models.

I'd like to work on (1) first. For clarity to the reader and for theoretical reasons, it is probably best if we focus on either the dual or trial model in the main text, since these models allow us to estimate the spatial and phylogenetic signal jointly, which makes the most theoretical sense. If we compare the models on WAIC scores, the trial model most often scores the lowest.

  variable                n
  <fct>               <int>
1 phylogeny_only_waic     2
2 spatial_only_waic       1
3 AUTOTYP_area_waic      45
4 dual_model_waic         3
5 trial_model_waic       62
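
The counts above look like a tally of which model has the lowest WAIC for each feature. A minimal Python sketch of that kind of tally follows. The WAIC values and feature names are made up for illustration, and the project's own scripts (not shown in this thread) may do this differently.

```python
from collections import Counter

# Hypothetical WAIC scores per feature and model (illustrative numbers only,
# not results from the analysis). Lower WAIC indicates a better fit.
waic_by_feature = {
    "feature_A": {"phylogeny_only": 410.2, "spatial_only": 455.9,
                  "AUTOTYP_area": 430.1, "dual_model": 405.7,
                  "trial_model": 401.3},
    "feature_B": {"phylogeny_only": 388.0, "spatial_only": 392.4,
                  "AUTOTYP_area": 371.6, "dual_model": 380.2,
                  "trial_model": 374.9},
    "feature_C": {"phylogeny_only": 502.5, "spatial_only": 540.1,
                  "AUTOTYP_area": 511.8, "dual_model": 499.0,
                  "trial_model": 497.2},
}


def best_model(scores):
    """Return the name of the model with the lowest WAIC for one feature."""
    return min(scores, key=scores.get)


def tally_best_models(waic_table):
    """Count, per model, the number of features on which it scores lowest."""
    return Counter(best_model(scores) for scores in waic_table.values())


tally = tally_best_models(waic_by_feature)
for model, n in tally.most_common():
    print(f"{model:<16}{n:>3}")
```

Note that `min` silently picks the first model in case of a tie, and that a tally like this ignores how large the WAIC differences are, so a narrow win counts the same as a clear one.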

As I see it, we have these matters to decide on in order to progress to writing:

a) do we use a model fit score to compare dual and trial to decide which one we discuss in the main text, or do we just decide that one of them makes more theoretical sense and discuss that one

b) if we do use a model fit score, which one: CPO, PIT, DIC, mlik or WAIC (discussed in #32)

c) if we use the trial model for the main text, we need to adjust the visualisations to show all three effects (can be discussed after (a) in #44)

d) if we use the trial model for the main text, we need to provide more detail on the reasoning behind the AUTOTYP-areas (@SimonGreenhill)

I would appreciate it if @SamPassmore and @rdinnager could respond below with their input on items a-d above. @blasid and @SimonGreenhill are also welcome to give input if they'd like.

I appreciate all of you and the time you spend on this project, thank you for all you've done so far.

rdinnager commented 2 years ago

> I would like input from @SamPassmore and @rdinnager on framing and summing up the INLA results in the actual paper. I hope we can discuss this in writing here in GitHub issue threads. I'd like to tackle one thing at a time in order so that we don't get overwhelmed by threads and messages.
>
> We have two sections to write:
>
> 1. the shorter prose section in the main body of the article where ideally we write something like
>
> To go beyond these visual impressions of both genealogical and geographic signals in the data [seen in the PCA scatterplots] we modeled the spatiophylogenetic effects using a new approach from biology, approximate Bayesian inference for Latent Gaussian Models (Dinnage et al 2020). This allows us to model the spatial and phylogenetic effects jointly and estimate their respective effects. We find that overall, the phylogenetic effect is stronger than the spatial - suggesting that grammar is more of an inherited property of language than one that disseminates spatially. Our features pertain to different domains of grammar (clausal, nominal, pronominal and verbal) and it is conceivable that some of these domains differ from each other in terms of the spatiophylogenetic effects. However, we do not find a difference in our results between these domains.

This paragraph sounds good to me.

> 2. the supplementary material section where we go more in-depth into the methods and also discuss robustness testing such as the simulations and the five different models.

Let me know when and how I can help with this.

> I'd like to work on (1) first. For clarity to the reader and for theoretical reasons, it is probably best if we focus on either the dual or trial model in the main text, since these models allow us to estimate the spatial and phylogenetic signal jointly, which makes the most theoretical sense. If we compare the models on WAIC scores, the trial model most often scores the lowest.
>
>   variable                n
>   <fct>               <int>
> 1 phylogeny_only_waic     2
> 2 spatial_only_waic       1
> 3 AUTOTYP_area_waic      45
> 4 dual_model_waic         3
> 5 trial_model_waic       62
>
> As I see it, we have these matters to decide on in order to progress to writing:
>
> a) do we use a model fit score to compare dual and trial to decide which one we discuss in the main text, or do we just decide that one of them makes more theoretical sense and discuss that one

I'm leaning towards the dual model, but my theoretical knowledge here is lacking. In particular, I have little understanding of what the AUTOTYP areas are, or why we might want to measure their effect or account for them.

> b) if we do use a model fit score, which one: CPO, PIT, DIC, mlik or WAIC (discussed in #32)

As I talk about in https://github.com/grambank/grambank-analysed/issues/32#issuecomment-1170037705, I think mlik (more specifically, Bayes Factors) might be the best for this. If not this, I think CPO is actually the simplest to explain from a theoretical perspective, it has a very nice interpretation as a measure of probability. The problem with WAIC is this concept of the 'effective number of parameters', which is difficult to wrap your head around. PIT is not suitable to compare many models, I don't think, and it is conceptually not easy to understand (I haven't really worked it out yet personally).
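
To make the two preferred options concrete, here is a minimal Python sketch. It assumes the mlik values are log marginal likelihoods and that a CPO value is available for every observation. All numbers are made up, and the function names are hypothetical rather than part of INLA or of the project's scripts.

```python
import math


def log_bayes_factor(mlik_a, mlik_b):
    """Log Bayes factor of model A over model B, from log marginal likelihoods."""
    return mlik_a - mlik_b


def log_pseudo_marginal_likelihood(cpo_values):
    """Sum of log CPO values; higher means better leave-one-out prediction."""
    return sum(math.log(cpo) for cpo in cpo_values)


# Hypothetical log marginal likelihoods for one feature (made-up numbers).
mlik_dual, mlik_trial = -512.4, -509.1
log_bf = log_bayes_factor(mlik_trial, mlik_dual)
print(f"log Bayes factor (trial vs dual): {log_bf:.2f}")
print(f"Bayes factor (trial vs dual):     {math.exp(log_bf):.1f}")

# Hypothetical CPO values: each is the probability the model assigns to one
# observation when that observation is left out of the fit.
cpo_dual = [0.62, 0.48, 0.71, 0.55]
cpo_trial = [0.65, 0.50, 0.69, 0.58]
lpml_dual = log_pseudo_marginal_likelihood(cpo_dual)
lpml_trial = log_pseudo_marginal_likelihood(cpo_trial)
print(f"summed log CPO, dual:  {lpml_dual:.3f}")
print(f"summed log CPO, trial: {lpml_trial:.3f}")
```

A positive log Bayes factor favours the first model passed in, and a higher summed log CPO favours the model that predicts held-out observations better. Both are read per feature, so they would still need to be aggregated across features.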

> c) if we use the trial model for the main text, we need to adjust the visualisations to show all three effects (can be discussed after (a) in #44)
>
> d) if we use the trial model for the main text, we need to provide more detail on the reasoning behind the AUTOTYP-areas (@SimonGreenhill)

We could adjust the figures to include AUTOTYP (perhaps using @SamPassmore's idea of a ternary plot), or, depending on how AUTOTYP is viewed, we could treat it as a 'nuisance' parameter, just something to be 'accounted for', and then still only plot the phylogenetic and spatial effects. Then just make it clear that the plot shows phylogenetic and spatial effects, conditional on AUTOTYP.

HedvigS commented 2 years ago

Thanks @rdinnager ! Okay, let's see if anyone else thinks any different but probably let's just focus on the dual model then.

a) we thought to include AUTOTYP-areas on Simon's suggestion, as a way of conceptualising space that takes cultural contact into account better. As-the-crow-flies distance is measured the same way everywhere, but humans can't travel with the same ease everywhere. We could also address the same issue with some kind of cost-distances, but since linguists had already suggested these areas as a relevant way of grouping contact history, we used them. That's more or less the reasoning behind that; I hope I represented Simon correctly there.

b) mlik then, all good (then i'll def have to train my spell-check to not correct to milk :P ) 🥛

c) right, that's an interesting idea. Use the trial model but mainly focus on the spatial and phylo-effects.... hm. That still leaves us with (d) - the need to explain how to interpret the spatial effects when AUTOTYP-area is also included.

rdinnager commented 2 years ago

It is also possible to use the trial model and if we think that AUTOTYP + dists is the "spatial effect", we could just combine the AUTOTYP and spatial distance effect into one summary effect. So for the spatial effect use something like: $$\sigma^2_s = \frac{A_s + D_s}{A_s + D_s + P + O}$$ where $A_s$ and $D_s$ are the AUTOTYP and spatial distance effect respectively, $P$ is the phylogenetic effect, and $O$ is the binomial correction term. We can do this easily using the posterior samples. Then we still have two measures to compare, the total spatial effect, and the phylogenetic effect.
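
A minimal Python sketch of how this combined effect could be computed per posterior draw and then summarised. The posterior draws are simulated here with made-up numbers, the value used for $O$ is a placeholder, and the phylogenetic share $P / (A_s + D_s + P + O)$ is my assumption about the matching measure; none of this is taken from the project's scripts.

```python
import random
import statistics


def variance_shares(autotyp, distance, phylo, other):
    """Return (total spatial share, phylogenetic share) for one posterior draw."""
    total = autotyp + distance + phylo + other
    return (autotyp + distance) / total, phylo / total


def summarise(values):
    """Median and an approximate 95% interval of a list of draws."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return {
        "median": statistics.median(ordered),
        "lower": ordered[round(0.025 * last)],
        "upper": ordered[round(0.975 * last)],
    }


# Hypothetical posterior draws of the variance components (made-up numbers;
# in practice these would come from the fitted model's posterior samples).
rng = random.Random(1)
O_TERM = 1.0  # placeholder for the binomial correction term O
draws = [
    (rng.gammavariate(2.0, 0.2),   # A_s: AUTOTYP-area effect
     rng.gammavariate(2.0, 0.3),   # D_s: spatial distance effect
     rng.gammavariate(2.0, 1.0))   # P: phylogenetic effect
    for _ in range(1000)
]

shares = [variance_shares(a, d, p, O_TERM) for a, d, p in draws]
spatial_summary = summarise([s for s, _ in shares])
phylo_summary = summarise([p for _, p in shares])
print("total spatial:", spatial_summary)
print("phylogenetic: ", phylo_summary)
```

Computing the ratio within each draw, rather than from the posterior means of the components, keeps the uncertainty in all four terms in the resulting interval.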

Oh cool, this is the first time I'm trying out github's new math support. Looks good!

HedvigS commented 2 years ago

Nice! Yeah looks good.

I like that idea. I wonder what @SimonGreenhill thinks of it. Simon is away at the moment, and I would really value his input if we use the trial model, so until he is fully back let's just go with the dual model and see what writing we can get done based on that. It should (famous last words) not be hard to shift to the trial model.

rdinnager commented 2 years ago

Thanks for the additional information on AUTOTYP, it actually sounds really useful. Now that I have a better understanding, I think AUTOTYP + distances would be a good way to go, but I agree with you, let's wait to hear what @SimonGreenhill thinks first.

HedvigS commented 2 years ago

Sam, @QuentinAtkinson and I had a chat this morning and Q pointed out that the AUTOTYP-area results have a different interpretation from the phylo and spatial effects. In a way, what we're testing with the AUTOTYP-areas is how good some linguists are at finding cultural areas.

We decided to progress with talking about the dual model only in the paper itself, but we'll leave the other 4 models in the scripts here in case any reviewer wants to see those kinds of results as well.

@SamPassmore is going to try out some different decay values for the spatial precision matrices as well.
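
As a rough illustration of what a decay value does, here is a Python sketch that assumes an exponential kernel over great-circle distance. The thread does not say which kernel or which values are used, so the kernel, the coordinates and the decay values below are all assumptions. The sketch builds a covariance matrix; a precision matrix would be its inverse.

```python
import math

EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine (as-the-crow-flies) distance between two points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def exponential_kernel(distance_km, decay_km):
    """Correlation that falls off with distance; larger decay_km falls off slower."""
    return math.exp(-distance_km / decay_km)


def spatial_covariance(coords, decay_km):
    """Pairwise kernel values for a list of (latitude, longitude) points."""
    return [[exponential_kernel(great_circle_km(*a, *b), decay_km)
             for b in coords] for a in coords]


# Hypothetical language locations as (latitude, longitude).
coords = [(0.0, 0.0), (0.0, 5.0), (10.0, 40.0)]
for decay_km in (250.0, 1000.0, 4000.0):
    cov = spatial_covariance(coords, decay_km)
    print(decay_km, [round(v, 3) for v in cov[0]])
```

With a small decay value only very close languages are treated as correlated; with a large one the correlation reaches much further, which is why the choice can matter for the estimated spatial effect.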

SamPassmore commented 2 years ago

Hi Hedvig, I didn't quite figure out the appropriate decay parameters. But I am working on it in a new branch. I will get to it early next week if that is not too late for you.

HedvigS commented 2 years ago

@SamPassmore thanks! really appreciate it!

we'll get as much writing done as we can in the meantime. Hopefully this is just another robustness check and won't shake our basic conclusion from this exercise, which is that phylo matters more than space.

HedvigS commented 2 years ago

We're making good progress on the plot and text for the main text of the manuscript.

i'm now turning my gaze to S4 in which we outline more details of the INLA model. I've gone over it briefly now, whenever you have time this week @SamPassmore and @rdinnager I'd appreciate your input.

SamPassmore commented 2 years ago

I have gone over the text. I think it is mostly all there, although we can probably write things more pretty*.

I think to get things done quickly the best course of action is for 1) Hedvig to review my changes and comments. 2) I will go through tomorrow and make sure the writing is concise and clear. 3) We have @rdinnager make sure that we don't say anything wrong at the end.

Sound ok?

*I am reading David Sedaris' Me Talk Pretty One Day, and I couldn't resist making the same joke here. :D

HedvigS commented 2 years ago

That sounds very good to me @SamPassmore ! Thank you for the clear organisation and speedy actions, much appreciated.

HedvigS commented 2 years ago

We have several senior co-authors on the team who are good at talking pretty, let's first get a skeleton of correct facts and ideas in, then polish language and then give it to them for prettifying :)

grambank / grambank-analysed

collaboratively summing up and framing the INLA results in the paper #52