pthane / QP-Data


Revised Models #8

Open pthane opened 3 years ago

pthane commented 3 years ago

Hola @jvcasillas ,

I have been working through the feedback that Silvia gave me. She suggested that I adjust my models so that I can look at the frequency of the matrix verb AND of the subordinate verb. This was the original plan, and after a careful review of my data, I have been able to recode in order to accommodate this recommendation.

Silvia made a comment that she thinks that it's problematic if I include proficiency in the L2 models instead of frequency of use, and vice versa (no DELE in the HS models). That's a good point. Hence, if I am trying to look for DELE, frequency of use, token frequency of the matrix verb, token frequency of the subordinate verb, AND the two-way interactions, this seems like wayyyy too much based upon what we've discussed recently in the class. I would lose a bit of predictive accuracy.

I would like to know if you would recommend either (or both) of the following options to address this:

  1. Get rid of the L2ers for the purposes of the QP and then concentrate on HS. I'll still need proficiency in the model for HS.
  2. Run separate models for the matrix and the subordinate verb. I don't predict that they will interact all that much, as frequency effects in the matrix item imply a totally different process than frequency effects in the subordinate item. The problem then is that I need 8 GLMMs: 2 tasks (production, comprehension) x 2 groups (HS and L2) x 2 verbs (matrix and subordinate).

I like option #2 for the sake of explanatory accuracy as I think it would give me a more valid look at each of these factors and wouldn't imply that they are interrelated. I don't think that they are, or at the very least, I don't think that an interaction between them can easily be explained through theory.

Does this sound like a good approach? Many thanks.

Patrick

jvcasillas commented 3 years ago

I'm about 2/3 of the way through your manuscript. Once I finish I'll go through all your questions/comments again and respond here.

pthane commented 3 years ago

Hi Joseph,

Okay, that's fine. My only thing is that the discussion is going to change dynamically, so if it is beneficial not to read that part, I fully understand. It may not be a productive use of your time...

The other questions are not really relevant anymore (that is, the ones in the previous issue).

Thanks!

Patrick

jvcasillas commented 3 years ago

Ah ok. That's good to know. I can probably stop now. I'll send it tonight when I am home.

pthane commented 3 years ago

Hi Joseph,

Thanks so much. I figured I should clarify that because I don’t want you to put work into something that’s going to change. Since you have all read the manuscript so quickly, it turns out that there may be time for another round of revisions, pending your opinion on the question I posed above. Thanks so much in advance!

PT

jvcasillas commented 3 years ago

Hola @jvcasillas ,

I have been working through the feedback that Silvia gave me. She suggested that I adjust my models so that I can look at the frequency of the matrix verb AND of the subordinate verb. This was the original plan, and after a careful review of my data, I have been able to recode in order to accommodate this recommendation.

So she suggested you include another predictor. Ok.

Silvia made a comment that she thinks that it's problematic if I include proficiency in the L2 models instead of frequency of use, and vice versa (no DELE in the HS models). That's a good point. Hence, if I am trying to look for DELE, frequency of use, token frequency of the matrix verb, token frequency of the subordinate verb, AND the two-way interactions, this seems like wayyyy too much based upon what we've discussed recently in the class. I would lose a bit of predictive accuracy.

I'm not sure I follow (yet). Why is it a bad idea to include proficiency? Aside from there being a lot of predictors (you are very correct in being concerned about that), but what is the justification for not including it?

I would like to know if you would recommend either (or both) of the following options to address this:

  1. Get rid of the L2ers for the purposes of the QP and then concentrate on HS. I'll still need proficiency in the model for HS.

I have no problem with this. I think it is up to you. At the end of the day, what is your QP about? (I only semi-seriously ask this question.) My point is you only need to be concerned with the things that will help answer your research questions.

  2. Run separate models for the matrix and the subordinate verb. I don't predict that they will interact all that much, as frequency effects in the matrix item imply a totally different process than frequency effects in the subordinate item.

If you do plan on including frequency effects in the matrix item I would not suggest a separate model. What would be the justification? You say you don't predict an interaction. That's good. You don't have to include an interaction, but it does sound like you (and Silvia) think it is a relevant predictor. If this is true, then your other model will be omitting a relevant predictor... (recall that this is worse than including an irrelevant predictor).

The problem then is that I need 8 GLMMs: 2 tasks (production, comprehension) x 2 groups (HS and L2) x 2 verbs (matrix and subordinate).

Do you though? I can see justification for 2 models (groups) if and only if you really want to include predictors for one model that you don't have for the other (DELE, no?). Otherwise, you could fit a single model if you really wanted to.

I guess my advice for you is to not overthink it too much. This is your QP. You know the literature quite well and you did a great job motivating the questions I just read in your paper. If those aren't your questions... well, I can't help there, but I have total confidence in your ability to work that out. I don't want to push back against what your chairs are telling you. That's not what I'm saying, but think carefully about adding/taking out predictors. Especially if you have theoretically motivated reasons to include them.

I like option #2 for the sake of explanatory accuracy as I think it would give me a more valid look at each of these factors and wouldn't imply that they are interrelated. I don't think that they are, or at the very least, I don't think that an interaction between them can easily be explained through theory.

Does this sound like a good approach? Many thanks.

Patrick

pthane commented 3 years ago

Hola @jvcasillas ,

I'm not sure I follow (yet). Why is it a bad idea to include proficiency? Aside from there being a lot of predictors (you are very correct in being concerned about that), but what is the justification for not including it?

This was Silvia's comment: Your HS models do not include proficiency. Should it be there? It is my understanding that the terms appearing in interactions should also be included as main effects in the model. If that is the case, you need to add "frequency of use" as a fixed effect as well. I know you said you're trying several things with respect to including different variables in your models, however, you should try to use the same ones for each of the groups (HS and L2ers) or you will have to motivate the decisions VERY WELL in the lit review and your discussion will be quite dense.

I understand her point, but I can't include so many things in a single model and have it actually be meaningful…

I have no problem with this. I think it is up to you. At the end of the day, what is your QP about? (I only semi-seriously ask this question.) My point is you only need to be concerned with the things that will help answer your research questions.

I think Jen likes having the comparison. I can suggest removing them on the grounds of statistical analysis if it is an elegant solution for getting around the problem of having 987654321 predictors.

If you do plan on including frequency effects in the matrix item I would not suggest a separate model. What would be the justification? You say you don't predict an interaction. That's good. You don't have to include an interaction, but it does sound like you (and Silvia) think it is a relevant predictor. If this is true, then your other model will be omitting a relevant predictor... (recall that this is worse than including an irrelevant predictor).

True; we did that last week but regardless it is much easier for me to learn through doing (I promise I listen attentively! jaja)…

Do you though? I can see justification for 2 models (groups) if and only if you really want to include predictors for one model that you don't have for the other (DELE, no?). Otherwise, you could fit a single model if you really wanted to.

Fair enough, but if I run the models with fixed effects for matrix verb, sub verb, group, activation, DELE (5) and then interactions between group and the other 4 variables + group:sub:DELE, group:sub:activation, group:matrix:DELE, group:matrix:activation, I have 13 predictors. That's a LOT.
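The predictor arithmetic here can be sanity-checked with a quick script (a throwaway sketch; the term names are just placeholders for the variables listed above):

```python
# five fixed effects, as listed above
main_effects = ["matrix", "sub", "group", "activation", "DELE"]

# two-way interactions between group and each of the other four variables
two_way = [f"group:{v}" for v in main_effects if v != "group"]

# the four three-way interactions listed above
three_way = [
    "group:sub:DELE", "group:sub:activation",
    "group:matrix:DELE", "group:matrix:activation",
]

predictors = main_effects + two_way + three_way
print(len(predictors))  # 5 + 4 + 4 = 13
```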

pthane commented 3 years ago

More importantly, thanks for your comments on the QP. They put my mind at ease because you make really insightful suggestions, and I really agree with the comment about model vs. hypothesis. I think it's a great point. I must say I've been feeling pretty frustrated with the end of this process because it is hard to sort out advisor feedback and make sense of so much information, but as I said, hearing you thought I had come up with some theoretically-motivated questions gives me a sense of reassurance. I/we are really lucky to have you.

jvcasillas commented 3 years ago

I'll come back to this in the morning, but I just want to remind you that you don't have to include interactions. I remember now what Silvia was talking about (having the two-way and no main effect). She's right, that's no good, but having the main effect and no interaction is OK (like we emphasized in last week's class). So you don't need to fit a crazy 13-predictor model. I'll think about it some more, but I think you'll probably end up with two.

pthane commented 3 years ago

Hi Joseph,

Thanks. I agree with and understand the premise of not including the interactions for the hell of it, but what I am trying to show is that token frequency and frequency of use of Spanish are interrelated for HS and proficiency and token frequency are related for L2ers. To me, the entire paper hinges on that. I'm not sure how to capture this without the interactions.

jvcasillas commented 3 years ago

So you hypothesized a three-way interaction for one group and a two-way interaction for the other? I think I missed that in the paper...

pthane commented 3 years ago

Hi Joseph,

No, not quite. What I mean to say is that token frequency interacts with overall amounts of Spanish use for heritage speakers, and token frequency interacts with proficiency for L2 learners. Put simply, I think that heritage speakers who use Spanish more frequently are not going to be all that susceptible to token frequency. On the other hand, as proposed in the AH, but as of yet untested, I think that there will be a greater effect for token frequency for those heritage bilinguals who don't use Spanish frequently.

Perhaps I should say that in the paper just like that... 😂

Therefore, I am not sure how to address these questions without a slew of interactions of at least two levels. If I were to use a single model, I would then need to add group as a factor to these interactions (which would make them 3-way).

jvcasillas commented 3 years ago

Hi Joseph,

No, not quite. What I mean to say is that token frequency interacts with overall amounts of Spanish use for heritage speakers, and token frequency interacts with proficiency for L2 learners. Put simply, I think that heritage speakers who use Spanish more frequently are not going to be all that susceptible to token frequency. On the other hand, as proposed in the AH, but as of yet untested, I think that there will be a greater effect for token frequency for those heritage bilinguals who don't use Spanish frequently.

Perhaps I should say that in the paper just like that... 😂

Nailed it! That makes sense and is exactly what you need to say. It makes it much more clear (to me) what you are trying to test.

Therefore, I am not sure how to address these questions without a slew of interactions of at least two levels. If I were to use a single model, I would then need to add group as a factor to these interactions (which would make them 3-way).

I still think you are looking at 2 omnibus (main) models. The outcome variable is binary in both cases, right? Here is another possibility for the HS (you haven't seen this before):

response ~ task + activation + 
  (1 + matrix_verb + sub_verb | subj ) + 
  (1 + activation | item) + 
  (1 + activation | matrix_verb_ch ) + 
  (1 + activation | sub_verb_ch)

And then for the L2ers:

response ~ task + prof + 
  (1 + matrix_verb + sub_verb | subj ) + 
  (1 + prof | item) + 
  (1 + prof | matrix_verb_ch ) + 
  (1 + prof | sub_verb_ch)

Advantages: less complex in terms of fixed effects, less ad hoc, extremely useful for understanding individual differences in general and especially for activation/prof.
Disadvantages: computationally costly (this model might not converge), will require some more explaining for this to make sense, leaves out AOA (you decided against that, right?), more exploratory in nature.

Think about that. (it's what I would do). The other option seems to be:

response ~ task + activation + matrix_verb + sub_verb + matrix_verb:activation + sub_verb:activation + 
  (1 + matrix_verb + sub_verb | subj) + 
  (1 + activation | item)

(and then replace activation with prof for the L2ers.)

Advantages: Specifically tests (in the NHST sense) the activation/frequency relationship, more likely to converge, but you still might have to remove some of the random slopes (the things on the left-hand side of "|" in (1 + matrix_verb + sub_verb | subj)).
Disadvantages: more complex fixed effects.

jvcasillas commented 3 years ago

Also, for all of these models you are guaranteed to have multicollinearity. You have to center and/or standardize all the continuous predictors and then look at the correlation matrix from the model output.
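The centering/standardizing step and the correlation check can be sketched like this (a minimal pure-Python illustration, not the actual preprocessing script; the predictor values are made up):

```python
from statistics import mean, pstdev

def zscore(xs):
    """Center and standardize a list of values (population SD)."""
    m, sd = mean(xs), pstdev(xs)
    return [(x - m) / sd for x in xs]

def pearson(xs, ys):
    """Correlation between two predictors = mean product of their z-scores."""
    return mean(a * b for a, b in zip(zscore(xs), zscore(ys)))

# hypothetical continuous predictors: token frequency and frequency of use
token_freq = [12.0, 55.0, 31.0, 8.0, 44.0, 27.0]
use = [3.0, 9.0, 6.0, 2.0, 8.0, 5.0]

z_tok = zscore(token_freq)          # standardized predictor, mean 0, SD 1
r = pearson(token_freq, use)        # inspect for multicollinearity
```

In practice the correlation matrix would come straight from the fitted model output, but the idea is the same: after standardizing, pairwise correlations near 1 flag collinear predictors.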

pthane commented 3 years ago

Nailed it! That makes sense and is exactly what you need to say. It makes it much more clear (to me) what you are trying to test.

My undergrad advisor (who I love dearly) used to yell at me and shout, "so why don't you just SAY WHAT YOU MEAN?!" Funny how that works.

I still think you are looking at 2 omnibus (main) models. The outcome variable is binary in both cases, right? Here is another possibility for the HS (you haven't seen this before):

Okay, so I think we are violently agreeing. My understanding is that you were a proponent of a single model for each task that incorporated HS and L2ers. Now what I understand is that you are proposing a single model for each group that incorporates production and comprehension. Right? My issue is that if I really, truly want to accentuate the effects of each task, I might as well look at them separately (I think). To me, there is a fundamental difference between trying to capture how task affects the variance on one hand and looking at the variance within each task on the other (understanding that the factors that affect production and comprehension may be distinct). Perhaps this is something that is not correct, but this is how it settles in my brain.

response ~ task + activation + 
  (1 + matrix_verb + sub_verb | subj ) + 
  (1 + activation | item) + 
  (1 + activation | matrix_verb_ch ) + 
  (1 + activation | sub_verb_ch)

And then for the L2ers:

response ~ task + prof + 
  (1 + matrix_verb + sub_verb | subj ) + 
  (1 + prof | item) + 
  (1 + prof | matrix_verb_ch ) + 
  (1 + prof | sub_verb_ch)

Advantages: less complex in terms of fixed effects, less ad hoc, extremely useful for understanding individual differences in general and especially for activation/prof.
Disadvantages: computationally costly (this model might not converge), will require some more explaining for this to make sense, leaves out AOA (you decided against that, right?), more exploratory in nature.

Age of acquisition needs to go, so I don't care. We all wanted to kill that part of the paper (but then again, I'm knowingly leaving out a predictor that accounts for variance, but it's not part of my revised research questions). The problem with age is that it is theoretically difficult to interpret. A main effect doesn't tell us whether age is relevant because it "interrupts" the product of acquisition or if it simply "changes" the way a bilingual converts input to intake. We can't know that, so on theoretical grounds, it seems acceptable to me to leave it out. Put simply, it's not that I want to deliberately leave out something important, it's that we can't really make sense of what that is.

response ~ task + activation + matrix_verb + sub_verb + matrix_verb:activation + sub_verb:activation + 
  (1 + matrix_verb + sub_verb | subj) + 
  (1 + activation | item)

(and then replace activation with prof for the L2ers.)

Advantages: Specifically tests (in the NHST sense) the activation/frequency relationship, more likely to converge, but you still might have to remove some of the random slopes (the things on the left-hand side of "|" in (1 + matrix_verb + sub_verb | subj)).
Disadvantages: more complex fixed effects.

I definitely want to understand how activation interacts with token frequency to account for variance. If that is something that I think is very important, would you suggest this model then? I know you suggest the other one, but it seems that option B is more likely to explore the interactions and option A is more likely to look at fixed effects. I kind of want to see both, but I'm more interested in the interaction because this is the "missing link" between research on the activation hypothesis and token frequency. We assume they are related and see that both can account for variance, but to date we have only tested one OR the other. If we want to work towards a use-oriented model of heritage languages, we need to know how these factors work TOGETHER.

The problem with both models is that they don't seem to address Silvia's comment that the same variables should be in the models for both groups. It sounds like you are not necessarily in agreement with that claim (and I'm really not sure; intuitively, I'm not seeing the importance of this, but she is super experienced and knowledgeable). What would you say is standard practice? Do people run separate models for HS and L2 learners, or is that frowned upon? I haven't come across it per se, but I'm not an expert in the stats world (yet).

Also, for all of these models you are guaranteed to have multicollinearity. You have to center and/or standardize all the continuous predictors and then look at the correlation matrix from the model output.

I standardized all of the variables (generated z-scores) before submitting them to analysis. All of the token frequency, frequency of use, and proficiency data are standardized through a script I run each time I code new data, and then I run the analyses. Is this what you mean?

jvcasillas commented 3 years ago

Nailed it! That makes sense and is exactly what you need to say. It makes it much more clear (to me) what you are trying to test.

My undergrad advisor (who I love dearly) used to yell at me and shout, "so why don't you just SAY WHAT YOU MEAN?!" Funny how that works.

I still think you are looking at 2 omnibus (main) models. The outcome variable is binary in both cases, right? Here is another possibility for the HS (you haven't seen this before):

Okay, so I think we are violently agreeing. My understanding is that you were a proponent of a single model for each task that incorporated HS and L2ers. Now what I understand is that you are proposing a single model for each group that incorporates production and comprehension. Right? My issue is that if I really, truly want to accentuate the effects of each task, I might as well look at them separately (I think). To me, there is a fundamental difference between trying to capture how task affects the variance on one hand and looking at the variance within each task on the other (understanding that the factors that affect production and comprehension may be distinct). Perhaps this is something that is not correct, but this is how it settles in my brain.

I might have said that (don't remember), but I assume it was based on how I understood the tasks (at the time). If both variables are binary and the predictors are the same (and coded the same way) then it can make a lot of sense to include task. Why? Well, you say you truly want to accentuate the effects of each task... this is a principled way to do that. If there is a main effect of task then you have a justification for saying they behave differently on each task (otherwise you will merely be describing with nothing to back you up)... then you can refit splitting the task factor. If there is no difference, i.e., if the probability of responding with the subjunctive is the same across tasks, then you learn something as well and you get parameter estimates for everything else with even more data/power. Again, I'm indifferent. Just trying to give you the best info so you can make a decision.

response ~ task + activation + 
  (1 + matrix_verb + sub_verb | subj ) + 
  (1 + activation | item) + 
  (1 + activation | matrix_verb_ch ) + 
  (1 + activation | sub_verb_ch)

And then for the L2ers:

response ~ task + prof + 
  (1 + matrix_verb + sub_verb | subj ) + 
  (1 + prof | item) + 
  (1 + prof | matrix_verb_ch ) + 
  (1 + prof | sub_verb_ch)

Advantages: less complex in terms of fixed effects, less ad hoc, extremely useful for understanding individual differences in general and especially for activation/prof.
Disadvantages: computationally costly (this model might not converge), will require some more explaining for this to make sense, leaves out AOA (you decided against that, right?), more exploratory in nature.

Age of acquisition needs to go, so I don't care. We all wanted to kill that part of the paper (but then again, I'm knowingly leaving out a predictor that accounts for variance, but it's not part of my revised research questions).

Well, it was, but it's not now, right? You set it up quite nicely in the paper. Moreover it was a nice way of testing plausible rival hypotheses (like we talked about in class). I won't push for this anymore, but remember that we shouldn't get too tied to our hypotheses/models/theories. To truly support them with evidence we have to try to kill them (err falsify them)!

The problem with age is that it is theoretically difficult to interpret. A main effect doesn't tell us whether age is relevant because it "interrupts" the product of acquisition or if it simply "changes" the way a bilingual converts input to intake. We can't know that, so on theoretical grounds, it seems acceptable to me to leave it out. Put simply, it's not that I want to deliberately leave out something important, it's that we can't really make sense of what that is.

Could one not say the same thing for increased activation? I think the only way one could rule out age effects is by controlling for them, or at a minimum adjusting for them. The fact that they are theoretically difficult to interpret isn't good justification for avoiding them. It's a cop out! That said, age could easily be a collider variable (rather than a confound), but that's why we do all this. I guess the practical side of me also worries about you having to rewrite so much of what you have already done. I am interested in seeing your intro once you have removed the age conundrum.

response ~ task + activation + matrix_verb + sub_verb + matrix_verb:activation + sub_verb:activation + 
  (1 + matrix_verb + sub_verb | subj) + 
  (1 + activation | item)

(and then replace activation with prof for the L2ers.)
Advantages: Specifically tests (in the NHST sense) the activation/frequency relationship, more likely to converge, but you still might have to remove some of the random slopes (the things on the left-hand side of "|" in (1 + matrix_verb + sub_verb | subj)).
Disadvantages: more complex fixed effects.

I definitely want to understand how activation interacts with token frequency to account for variance. If that is something that I think is very important, would you suggest this model then?

Yes. I think this is what will be best for your qp.

I know you suggest the other one, but it seems that option B is more likely to explore the interactions and option A is more likely to look at fixed effects. I kind of want to see both, but I'm more interested in the interaction because this is the "missing link" between research on the activation hypothesis and token frequency. We assume they are related and see that both can account for variance, but to date we have only tested one OR the other. If we want to work towards a use-oriented model of heritage languages, we need to know how these factors work TOGETHER.

That sounds good. "SAY THAT!" lol

The problem with both models is that they don't seem to address Silvia's comment that the same variables should be in the models for both groups. It sounds like you are not necessarily in agreement with that claim (and I'm really not sure; intuitively, I'm not seeing the importance of this, but she is super experienced and knowledgeable). What would you say is standard practice? Do people run separate models for HS and L2 learners, or is that frowned upon? I haven't come across it per se, but I'm not an expert in the stats world (yet).

I agree with her. That is what would make the most sense, but my understanding (again I might be wrong) is that you don't have all the same variables for both groups, so you can't do much about that.

Also, for all of these models you are guaranteed to have multicollinearity. You have to center and/or standardize all the continuous predictors and then look at the correlation matrix from the model output.

I standardized all of the variables (generated z-scores) before submitting them to analysis. All of the token frequency, frequency of use, and proficiency data are standardized through a script I run each time I code new data, and then I run the analyses. Is this what you mean?

Yes. Precisely.

pthane commented 3 years ago

I might have said that (don't remember), but I assume it was based on how I understood the tasks (at the time). If both variables are binary and the predictors are the same (and coded the same way) then it can make a lot of sense to include task. Why? Well, you say you truly want to accentuate the effects of each task... this is a principled way to do that. If there is a main effect of task then you have a justification for saying they behave differently on each task (otherwise you will merely be describing with nothing to back you up)... then you can refit splitting the task factor. If there is no difference, i.e., if the probability of responding with the subjunctive is the same across tasks, then you learn something as well and you get parameter estimates for everything else with even more data/power. Again, I'm indifferent. Just trying to give you the best info so you can make a decision.

Well, I don't think you said that; I just think that that is what I understood. Good point. The one thing I don't understand is "refitting splitting the task factor."

Well, it was, but it's not now, right? You set it up quite nicely in the paper. Moreover it was a nice way of testing plausible rival hypotheses (like we talked about in class). I won't push for this anymore, but remember that we shouldn't get too tied to our hypotheses/models/theories. To truly support them with evidence we have to try to kill them (err falsify them)!

Thanks! My gist is that Silvia did not feel it was well-justified in the paper for a few reasons, partially because she is one of the people who works on the Activation Hypothesis and, as she states, incomplete acquisition (IA) is not a hypothesis but rather an attempted explanation without clear ability to interpret age effects. I really liked the way that Jen coached me to restructure the intro, but I also really understand where Silvia is coming from (and tend to agree). This was her comment:

I was having a hard time with the inclusion of English AoA for a couple of reasons: the first one is that it is a bit complicated to analyze its effects unless you adopted David's approach reported in his Languages article. This consists in making sure that sequential bilinguals are not more proficient than simultaneous ones. Such a thing can only be controlled if you make a split in the data (I know this is not ideal with continuous variables, but it is sometimes justified when dividing up the groups prevents potential misanalysis, i.e. if sequential HSs were inherently more proficient than simultaneous HSs, it's possible that what you saw as an AoA effect could inadvertently have been a proficiency effect instead). The other issue was the connection between AoA and IA, which I think muddled your hypothesis/discussion a little bit (especially given the presence of L2ers). Without longitudinal data (or at least w/o data where you control that proficiencies among sequential/simultaneous are comparable) you can't really make any claims about whether it is a case of IA or sth else.

I happen to agree with Silvia in this sense, and I feel that exploring the interaction between token frequency and frequency of use is fundamentally a bit of a different purpose than looking at age vs. use/activation.

Yes. I think this is what will be best for your qp.

Got it.

That sounds good. "SAY THAT!" lol

Jajaja aprendes bien. Feel free to yell that at me whenever you want; it helps.

I agree with her. That is what would make the most sense, but my understanding (again I might be wrong) is that you don't have all the same variables for both groups, so you can't do much about that.

I do. I have use and DELE data for everyone. My predictions were that overall amount of language use would modulate the degree to which HS are sensitive to token frequency effects, while proficiency would modulate the degree to which L2 learners are sensitive to frequency effects. In other words, that would be spelled out through two different interactions. What I took the liberty in doing (and what maybe I shouldn't have done) is only looking at DELE:token for L2ers and at use/activation: token for HS. This is what I think Silvia was critiquing.

jvcasillas commented 3 years ago

I might have said that (don't remember), but I assume it was based on how I understood the tasks (at the time). If both variables are binary and the predictors are the same (and coded the same way) then it can make a lot of sense to include task. Why? Well, you say you truly want to accentuate the effects of each task... this is a principled way to do that. If there is a main effect of task then you have a justification for saying they behave differently on each task (otherwise you will merely be describing with nothing to back you up)... then you can refit splitting the task factor. If there is no difference, i.e., if the probability of responding with the subjunctive is the same across tasks, then you learn something as well and you get parameter estimates for everything else with even more data/power. Again, I'm indifferent. Just trying to give you the best info so you can make a decision.

Well, I don't think you said that; I just think that that is what I understood. Good point. The one thing I don't understand is "refitting splitting the task factor."

It means you refit the model separating by task (as you proposed). Essentially it is just a check to see if it is necessary/justified.
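Mechanically, that just means subsetting the data by task and fitting the same model to each subset (a minimal illustration with placeholder data; fit_model stands in for the actual GLMM fit):

```python
# placeholder trial data: (task, binary response) pairs
trials = [
    ("production", 1), ("production", 0),
    ("comprehension", 1), ("comprehension", 1),
]

def fit_model(subset):
    # stand-in for the real GLMM fit (e.g. glmer in R);
    # here it just returns the mean response rate for the subset
    return sum(resp for _, resp in subset) / len(subset)

# split the data by task
by_task = {}
for task, resp in trials:
    by_task.setdefault(task, []).append((task, resp))

# refit the same model on each task's subset
fits = {task: fit_model(subset) for task, subset in by_task.items()}
```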

Well, it was, but it's not now, right? You set it up quite nicely in the paper. Moreover it was a nice way of testing plausible rival hypotheses (like we talked about in class). I won't push for this anymore, but remember that we shouldn't get too tied to our hypotheses/models/theories. To truly support them with evidence we have to try to kill them (err falsify them)!

Thanks! My gist is that Silvia did not feel it was well-justified in the paper for a few reasons, partially because she is one of the people who works on the Activation Hypothesis and, as she states, incomplete acquisition (IA) is not a hypothesis but rather an attempted explanation without clear ability to interpret age effects. I really liked the way that Jen coached me to restructure the intro, but I also really understand where Silvia is coming from (and tend to agree). This was her comment:

I was having a hard time with the inclusion of English AoA for a couple of reasons: the first one is that it is a bit complicated to analyze its effects unless you adopted David's approach reported in his Languages article. This consists in making sure that sequential bilinguals are not more proficient than simultaneous ones. Such a thing can only be controlled if you make a split in the data (I know this is not ideal with continuous variables, but it is sometimes justified when dividing up the groups prevents potential misanalysis (i.e. if sequential HSs were inherently more proficient than simultaneous HSs, it's possible that what you saw as an AoA effect could inadvertently have been a proficiency effect instead)). The other issue was the connection between AoA and IA, which I think muddled your hypothesis/discussion a little bit (especially given the presence of L2ers). Without longitudinal data (or at least w/o data where you control that proficiencies among sequential/simultaneous are comparable) you can't really make any claims about whether it is a case of IA or sth else.

Ah yes. I remember this comment now. I don't really agree but it's not relevant.

I happen to agree with Silvia in this sense, and I feel that exploring the interaction between token frequency and frequency of use is fundamentally a bit of a different purpose than looking at age vs. use/activation.

Yes. I think this is what will be best for your qp.

Got it.

That sounds good. "SAY THAT!" lol

Jajaja aprendes bien. Feel free to yell that at me whenever you want; it helps.

I agree with her. That is what would make the most sense, but my understanding (again I might be wrong) is that you don't have all the same variables for both groups, so you can't do much about that.

I do. I have use and DELE data for everyone. My predictions were that overall amount of language use would modulate the degree to which HS are sensitive to token frequency effects, while proficiency would modulate the degree to which L2 learners are sensitive to frequency effects. In other words, that would be spelled out through two different interactions. What I took the liberty of doing (and what maybe I shouldn't have done) is only looking at DELE:token for L2ers and at use/activation:token for HS. This is what I think Silvia was critiquing.

Ok. This is what I didn't really understand. I thought you didn't include the same variables because you didn't have them. I think you need to test the same model for both groups.

pthane commented 3 years ago

I might have said that (don't remember), but I assume it was based on how I understood the tasks (at the time). If both variables are binary and the predictors are the same (and coded the same way) then it can make a lot of sense to include task. Why? Well, you say you truly want to accentuate the effects of each task... this is a principled way to do that. If there is a main effect of task then you have a justification for saying they behave differently on each task (otherwise you will merely be describing with nothing to back you up)... then you can refit splitting the task factor. If there is no difference, i.e., if the probability of responding with the subjunctive is the same across tasks, then you learn something as well and you get parameter estimates for everything else with even more data/power. Again, I'm indifferent. Just trying to give you the best info so you can make a decision.

Well, I don't think you said that, I just think that that is what I understood. Good point. The one thing I don't understand is "refit splitting the task factor."

It means you refit the model separating by task (as you proposed). Essentially it is just a check to see if it is necessary/justified.

Well, it was, but it's not now, right? You set it up quite nicely in the paper. Moreover it was a nice way of testing plausible rival hypotheses (like we talked about in class). I won't push for this anymore, but remember that we shouldn't get too tied to our hypotheses/models/theories. To truly support them with evidence we have to try to kill them (err falsify them)!

Thanks! My gist is that Silvia did not feel it was well-justified in the paper for a few reasons, partially because she is one of the people who works on the Activation Hypothesis and, as she states, incomplete acquisition (IA) is not a hypothesis but rather an attempted explanation without clear ability to interpret age effects. I really liked the way that Jen coached me to restructure the intro, but I also really understand where Silvia is coming from (and tend to agree). This was her comment:

I was having a hard time with the inclusion of English AoA for a couple of reasons: the first one is that it is a bit complicated to analyze its effects unless you adopted David's approach reported in his Languages article. This consists in making sure that sequential bilinguals are not more proficient than simultaneous ones. Such a thing can only be controlled if you make a split in the data (I know this is not ideal with continuous variables, but it is sometimes justified when dividing up the groups prevents potential misanalysis (i.e. if sequential HSs were inherently more proficient than simultaneous HSs, it's possible that what you saw as an AoA effect could inadvertently have been a proficiency effect instead)). The other issue was the connection between AoA and IA, which I think muddled your hypothesis/discussion a little bit (especially given the presence of L2ers). Without longitudinal data (or at least w/o data where you control that proficiencies among sequential/simultaneous are comparable) you can't really make any claims about whether it is a case of IA or sth else.

Ah yes. I remember this comment now. I don't really agree but it's not relevant.

So you say that on theoretical grounds, it's best to have age there. Hmmm interesting. I do agree with her on the methodological and theoretical grounds that making conclusions about age of acquisition is difficult in the absence of longitudinal data, because we don't know whether HS had a structure at time X and then restructured it, or if it simply isn't in their repertoire. Age at the time of bilingualism is a number, and I have no idea whether the person had this structure or not. If we look at monolingual literature, children get the volitional subjunctive by age 2-3, so I see no reason to categorically conclude that most of my bilinguals should have incomplete acquisition. I also agree with Silvia regarding the fact that we have no idea about the proficiency levels in other studies that reported incomplete acquisition, like those of Montrul.

HOWEVER, I completely agree with you that the continuous nature of proficiency in my analyses is not incompatible with age of acquisition. I just was trying to figure out how to deal with this.

So, let me ask you this: maybe it's time for me to go back to the drawing board and look at a correlation matrix, as you suggested. I'm trying very hard to measure a whole bunch of things, and I have the data to do that. I have continuously been coached by people in morphosyntax that less is more, and so my thought process was that addressing things like age vs. activation or matrix vs. subordinate verb should be accomplished in separate models because they refer to two very distinct research questions. I'm now learning that this is not necessarily a good idea (Type I error). Would you propose I look at ALL of this then? Or should I go and look for a correlation matrix and try to reformulate my ideas of what I want to look at?

Ok. This is what I didn't really understand. I thought you didn't include the same variables because you didn't have them. I think you need to test the same model for both groups.

OK, so then there's no way around a sh*t ton of predictors, right? Should I think about using Bonferroni corrections? How, then, should I integrate that into the model you proposed? I am still unclear if the (1 | + XXX) is a random effect or not.

jvcasillas commented 3 years ago

I might have said that (don't remember), but I assume it was based on how I understood the tasks (at the time). If both variables are binary and the predictors are the same (and coded the same way) then it can make a lot of sense to include task. Why? Well, you say you truly want to accentuate the effects of each task... this is a principled way to do that. If there is a main effect of task then you have a justification for saying they behave differently on each task (otherwise you will merely be describing with nothing to back you up)... then you can refit splitting the task factor. If there is no difference, i.e., if the probability of responding with the subjunctive is the same across tasks, then you learn something as well and you get parameter estimates for everything else with even more data/power. Again, I'm indifferent. Just trying to give you the best info so you can make a decision.

Well, I don't think you said that, I just think that that is what I understood. Good point. The one thing I don't understand is "refit splitting the task factor."

It means you refit the model separating by task (as you proposed). Essentially it is just a check to see if it is necessary/justified.

Well, it was, but it's not now, right? You set it up quite nicely in the paper. Moreover it was a nice way of testing plausible rival hypotheses (like we talked about in class). I won't push for this anymore, but remember that we shouldn't get too tied to our hypotheses/models/theories. To truly support them with evidence we have to try to kill them (err falsify them)!

Thanks! My gist is that Silvia did not feel it was well-justified in the paper for a few reasons, partially because she is one of the people who works on the Activation Hypothesis and, as she states, incomplete acquisition (IA) is not a hypothesis but rather an attempted explanation without clear ability to interpret age effects. I really liked the way that Jen coached me to restructure the intro, but I also really understand where Silvia is coming from (and tend to agree). This was her comment:

I was having a hard time with the inclusion of English AoA for a couple of reasons: the first one is that it is a bit complicated to analyze its effects unless you adopted David's approach reported in his Languages article. This consists in making sure that sequential bilinguals are not more proficient than simultaneous ones. Such a thing can only be controlled if you make a split in the data (I know this is not ideal with continuous variables, but it is sometimes justified when dividing up the groups prevents potential misanalysis (i.e. if sequential HSs were inherently more proficient than simultaneous HSs, it's possible that what you saw as an AoA effect could inadvertently have been a proficiency effect instead)). The other issue was the connection between AoA and IA, which I think muddled your hypothesis/discussion a little bit (especially given the presence of L2ers). Without longitudinal data (or at least w/o data where you control that proficiencies among sequential/simultaneous are comparable) you can't really make any claims about whether it is a case of IA or sth else.

Ah yes. I remember this comment now. I don't really agree but it's not relevant.

So you say that on theoretical grounds, it's best to have age there. Hmmm interesting. I do agree with her on the methodological and theoretical grounds that making conclusions about age of acquisition is difficult in the absence of longitudinal data, because we don't know whether HS had a structure at time X and then restructured it, or if it simply isn't in their repertoire. Age at the time of bilingualism is a number, and I have no idea whether the person had this structure or not. If we look at monolingual literature, children get the volitional subjunctive by age 2-3, so I see no reason to categorically conclude that most of my bilinguals should have incomplete acquisition. I also agree with Silvia regarding the fact that we have no idea about the proficiency levels in other studies that reported incomplete acquisition, like those of Montrul.

No. To be clear I'm saying that the issue is methodological. If age is part of a competing rival hypothesis... and the main one that your (old) literature review outlines... then you have to at a minimum control for it. Recall what happens when you include a standardized predictor in the model... it allows you to say something like "holding age constant, the effect of X on Y is Z". That is the only point I want to make.

HOWEVER, I completely agree with you that the continuous nature of proficiency in my analyses is not incompatible with age of acquisition. I just was trying to figure out how to deal with this.

So, let me ask you this: maybe it's time for me to go back to the drawing board and look at a correlation matrix, as you suggested. I'm trying very hard to measure a whole bunch of things, and I have the data to do that. I have continuously been coached by people in morphosyntax that less is more, and so my thought process was that addressing things like age vs. activation or matrix vs. subordinate verb should be accomplished in separate models because they refer to two very distinct research questions. I'm now learning that this is not necessarily a good idea (Type I error). Would you propose I look at ALL of this then? Or should I go and look for a correlation matrix and try to reformulate my ideas of what I want to look at?

I am always on the side of parsimony (like the people in morphosyntax). Honestly, it sounds like the issue is that what you want to do at this point is still exploratory. There is nothing wrong with that. Your suggestion makes sense (to look at the correlation matrix). The simpler, more parsimonious designs (the less is more designs) are the ones that are well planned and well controlled.

Ok. This is what I didn't really understand. I thought you didn't include the same variables because you didn't have them. I think you need to test the same model for both groups.

OK, so then there's no way around a sh*t ton of predictors, right? Should I think about using Bonferroni corrections? How, then, should I integrate that into the model you proposed? I am still unclear if the (1 | + XXX) is a random effect or not.

You only need Bonferroni corrections for pairwise comparisons, if you do them. Luckily you don't have many categorical factors (that is when things get really bad), so I don't think you will need to do that. Anything in the () in the model formula is a random effect (the intercept and slopes go on the left of the | and the grouping factor on the right).

pthane commented 3 years ago

No. To be clear I'm saying that the issue is methodological. If age is part of a competing rival hypothesis... and the main one that your (old) literature review outlines... then you have to at a minimum control for it. Recall what happens when you include a standardized predictor in the model... it allows you to say something like "holding age constant, the effect of X on Y is Z". That is the only point I want to make.

Fair. But when you say "holding age constant," how does that make age a predictor variable? How would I code it as a controlled predictor?

I am always on the side of parsimony (like the people in morphosyntax). Honestly, it sounds like the issue is that what you want to do at this point is still exploratory. There is nothing wrong with that. Your suggestion makes sense (to look at the correlation matrix). The simpler, more parsimonious designs (the less is more designs) are the ones that are well planned and well controlled.

I thought I knew what I was doing, but then someone suggested looking at age, and then another suggested not looking at age, and then someone suggested looking at the subordinate verb, etc. I am getting very good advice from all three of my advisors but I am also starting to get "played out" and I'm not sure exactly how to move forward. I like the correlation matrix idea, but it just seems like there's a constant change of focus on my part at this point.

You only need Bonferroni corrections for pairwise comparisons, if you do them. Luckily you don't have many categorical factors (that is when things get really bad), so I don't think you will need to do that. Anything in the () in the model formula is a random effect (the intercept and slopes go on the left of the | and the grouping factor on the right).

OK. So in the revised model you proposed, what was the reasoning for having those factors in random effects? I don't consider token frequency to be random…

jvcasillas commented 3 years ago

No. To be clear I'm saying that the issue is methodological. If age is part of a competing rival hypothesis... and the main one that your (old) literature review outlines... then you have to at a minimum control for it. Recall what happens when you include a standardized predictor in the model... it allows you to say something like "holding age constant, the effect of X on Y is Z". That is the only point I want to make.

Fair. But when you say "holding age constant," how does that make age a predictor variable? How would I code it as a controlled predictor?

Patrick! https://www.ds4ling.jvcasillas.com/slides/05_lm/03_mrc/#38

I am always on the side of parsimony (like the people in morphosyntax). Honestly, it sounds like the issue is that what you want to do at this point is still exploratory. There is nothing wrong with that. Your suggestion makes sense (to look at the correlation matrix). The simpler, more parsimonious designs (the less is more designs) are the ones that are well planned and well controlled.

I thought I knew what I was doing, but then someone suggested looking at age, and then another suggested not looking at age, and then someone suggested looking at the subordinate verb, etc. I am getting very good advice from all three of my advisors but I am also starting to get "played out" and I'm not sure exactly how to move forward. I like the correlation matrix idea, but it just seems like there's a constant change of focus on my part at this point.

You are getting good advice from two very competent people. I think you'll figure it out. Looking at the correlation matrix should help.

You only need Bonferroni corrections for pairwise comparisons, if you do them. Luckily you don't have many categorical factors (that is when things get really bad), so I don't think you will need to do that. Anything in the () in the model formula is a random effect (the intercept and slopes go on the left of the | and the grouping factor on the right).

OK. So in the revised model you proposed, what was the reasoning for having those factors in random effects? I don't consider token frequency to be random…

That's not what random means here (yeah stats terminology is fantastic).

pthane commented 3 years ago

Patrick! https://www.ds4ling.jvcasillas.com/slides/05_lm/03_mrc/#38

I thought I understood that when you taught it to us, but I DEFINITELY don't think I understand it now. I saw the "doing it in R" slide and I don't see where in the model this is "controlled" for (going up to slide 28). I promise I'm listening (as you know, I ask a lot of questions), but it's one thing to review and see that this is what you were talking about and another to know how to do this in R (the comment "multiply SE of b-weight by 2 and add/subtract to/from b-weight" isn't clear to me, if that's what you're referring to).

You are getting good advice from two very competent people. I think you'll figure it out. Looking at the correlation matrix should help.

I looked at the correlation matrix. It occurred to me that age is not something I can look at with the L2ers, because it's not relevant to them. This is what Silvia said, but now I'm putting 2 and 2 together and seeing that the general gist is that I need to have consistent models for the two groups. Hence, if I'm trying to get my models to have the same predictors for both groups, age wouldn't make sense.

That's not what random means here (yeah stats terminology is fantastic).

OK, but why wouldn't I just create a more standard model then (just for my own knowledge)? In other words, why not something like:

response ~ Token_Main + Token_Sub + DELE + Use + Token_Main:DELE + Token_Main:Use + Token_Sub:DELE + Token_Sub:Use

jvcasillas commented 3 years ago

Patrick! https://www.ds4ling.jvcasillas.com/slides/05_lm/03_mrc/#38

I thought I understood that when you taught it to us, but I DEFINITELY don't think I understand it now. I saw the "doing it in R" slide and I don't see where in the model this is "controlled" for (going up to slide 28). I promise I'm listening (as you know, I ask a lot of questions), but it's one thing to review and see that this is what you were talking about and another to know how to do this in R (the comment "multiply SE of b-weight by 2 and add/subtract to/from b-weight" isn't clear to me, if that's what you're referring to).

By doing multiple regression you are adjusting for (statistically controlling) the variables included. That's what all the semi partial R2 talk was about. Think about the definition of the intercept... the value of y when x = 0. Now think of it with one of your variables... say age_std. The value of y when age_std is 0 (remember age_std = 0 = the mean of the sample). Now interpret a parameter estimate like we did in clase.... holding A, B and C constant, a 1 unit increase in age_std is associated with a change in Y of PARAMETER ESTIMATE HERE. Or, holding B, C and age_std constant, a 1 unit increase in B is associated with a change in Y of... etc. etc.

That is the whole point of multiple regression.
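A minimal sketch of the "holding constant" idea in R, with simulated data (the column names `age_std`, `use_std`, and `response` are placeholder assumptions, not the actual QP variables):

```r
# Simulated data just to illustrate the interpretation
set.seed(123)
n <- 200
d <- data.frame(age = rnorm(n, 8, 3), use = rnorm(n, 60, 15))
d$response <- rbinom(n, 1, 0.5)

# Standardize so that 0 = the sample mean
d$age_std <- as.numeric(scale(d$age))
d$use_std <- as.numeric(scale(d$use))

# Each coefficient is read "holding the other predictors constant":
# a 1 SD increase in use_std is associated with a change of b in the
# log-odds of response, holding age_std constant (and vice versa)
mod <- glm(response ~ age_std + use_std, data = d, family = binomial)
summary(mod)
```

Nothing special is "done" to age beyond including it: its presence in the model is what lets every other estimate be interpreted with age adjusted for.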

You are getting good advice from two very competent people. I think you'll figure it out. Looking at the correlation matrix should help.

I looked at the correlation matrix. It occurred to me that age is not something I can look at with the L2ers, because it's not relevant to them. This is what Silvia said, but now I'm putting 2 and 2 together and seeing that the general gist is that I need to have consistent models for the two groups. Hence, if I'm trying to get my models to have the same predictors for both groups, age wouldn't make sense.

I don't see how age isn't relevant to them (in fact I think it's more relevant to them), but I don't think that matters at this point.

That's not what random means here (yeah stats terminology is fantastic).

OK, but why wouldn't I just create a more standard model then (just for my own knowledge)? In other words, why not something like:

response ~ Token_Main + Token_Sub + DELE + Use + Token_Main:DELE + Token_Main:Use + Token_Sub:DELE + Token_Sub:Use

I don't follow here. Are you saying why not exclude the random effects? If you don't know what a GLMM is (which makes sense) it's kind of hard to discuss this. The simple answer is you have to include the random effects because you have repeated measures.
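For reference, a minimal `glmer()` sketch of what that looks like, using the fixed effects from the formula above (the data frame `dat` and the grouping names `participant`/`item` are placeholders):

```r
library(lme4)

# Same fixed effects as the formula above, plus random intercepts for
# participant and item, because each one contributes many observations
mod <- glmer(
  response ~ Token_Main + Token_Sub + DELE + Use +
    Token_Main:DELE + Token_Main:Use +
    Token_Sub:DELE + Token_Sub:Use +
    (1 | participant) + (1 | item),
  data = dat, family = binomial
)
```

Dropping the `(1 | ...)` terms would treat every trial as independent, which the repeated-measures design violates.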

pthane commented 3 years ago

By doing multiple regression you are adjusting for (statistically controlling) the variables included. That's what all the semi partial R2 talk was about. Think about the definition of the intercept... the value of y when x = 0. Now think of it with one of your variables... say age_std. The value of y when age_std is 0 (remember age_std = 0 = the mean of the sample). Now interpret a parameter estimate like we did in clase.... holding A, B and C constant, a 1 unit increase in age_std is associated with a change in Y of PARAMETER ESTIMATE HERE. Or, holding B, C and age_std constant, a 1 unit increase in B is associated with a change in Y of... etc. etc.

That is the whole point of multiple regression.

I actually reached out to a few peers and I think maybe our heads haven't completely wrapped around this yet. I understand what you're saying in principle (I think), but I think I severely misinterpreted what you said before. I understood that you were advocating for me to "do" something with the age variable other than including it as a predictor in order to hold it constant. What I understood in class and what you are saying now is that multiple regression gives each predictor its "moment in the sun" by holding the other variables at 0. If I have misunderstood this, then I would say that I'm still confused about this concept.

I don't follow here. Are you saying why not exclude the random effects? If you don't know what a GLMM is (which makes sense) it's kind of hard to discuss this. The simple answer is you have to include the random effects because you have repeated measures.

I certainly couldn't give you a complex definition of GLMM, but I know that it is the appropriate model for evaluating multiple predictors along continua and that it adjusts the intercepts by incorporating random effects. I have random effects for participant and item in my current model. What I meant to say is that I don't know why "subj" and the matrix and subordinate verbs would be in the random effects in the following model that you proposed.


```
  (1 + matrix_verb + sub_verb | subj) +
  (1 + activation | item) +
```
jvcasillas commented 3 years ago

By doing multiple regression you are adjusting for (statistically controlling) the variables included. That's what all the semi partial R2 talk was about. Think about the definition of the intercept... the value of y when x = 0. Now think of it with one of your variables... say age_std. The value of y when age_std is 0 (remember age_std = 0 = the mean of the sample). Now interpret a parameter estimate like we did in clase.... holding A, B and C constant, a 1 unit increase in age_std is associated with a change in Y of PARAMETER ESTIMATE HERE. Or, holding B, C and age_std constant, a 1 unit increase in B is associated with a change in Y of... etc. etc.

That is the whole point of multiple regression.

I actually reached out to a few peers and I think maybe our heads haven't completely wrapped around this yet. I understand what you're saying in principle (I think), but how do you hold age of acquisition at 0 post-hoc?

How do you measure the MPG of a car with 0 weight? They are estimates generated by the model.

I guess I'm not having an issue with the fact that the point of multiple regression is to see what happens when we hold something constant, but my understanding from the previous post was that I actually needed to do something to the age variable to "take it out of play." If I understand what you're saying (and I think I very much misunderstood before), the only way to do this with age is by having it in the model. Otherwise we wouldn't be controlling that variable. Right?

Exactly.

I don't follow here. Are you saying why not exclude the random effects? If you don't know what a GLMM is (which makes sense) it's kind of hard to discuss this. The simple answer is you have to include the random effects because you have repeated measures.

I certainly couldn't give you a complex definition of GLMM, but I know that it is the appropriate model for evaluating multiple predictors along continua and that it adjusts the intercepts by incorporating random effects. I have random effects for participant and item in my current model. What I meant to say is that I don't know why "subj" and the matrix and subordinate verbs would be in the random effects in the following model that you proposed.

```
  (1 + matrix_verb + sub_verb | subj) +
  (1 + activation | item) +
```

In order to construct a maximal model any predictor that is within subject or within item can be given a random slope. This means you allow the effect to vary for that grouping variable and then the model uses partial pooling to get a better estimate of the fixed effects. In essence the model learns more about the data when you account for the data structure through random effects. It's like fitting smaller models inside your big model to help learn about the data. It's pretty complicated and we will only start to touch on it in class. You can think of it like "let the effect of matrix verb frequency vary for each participant and use this information to inform the fixed effect of matrix_verb for the population estimate."
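Put together, a maximal version might look like this (a sketch under assumed names — the fixed-effect structure and data frame `dat` are placeholders):

```r
library(lme4)

# Maximal random-effects structure: within-subject predictors get
# by-subject random slopes; within-item predictors get by-item slopes
mod_max <- glmer(
  response ~ matrix_verb + sub_verb + activation +
    (1 + matrix_verb + sub_verb | subj) +
    (1 + activation | item),
  data = dat, family = binomial
)
```

Partial pooling means each subject's slope estimate is shrunk toward the group mean, which stabilizes the fixed-effect estimates.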

pthane commented 3 years ago

How do you measure the MPG of a car with 0 weight? They are estimates generated by the model.

Right. They don't exist in nature.

I guess I'm not having an issue with the fact that the point of multiple regression is to see what happens when we hold something constant, but my understanding from the previous post was that I actually needed to do something to the age variable to "take it out of play." If I understand what you're saying (and I think I very much misunderstood before), the only way to do this with age is by having it in the model. Otherwise we wouldn't be controlling that variable. Right?

Exactly.

OK. As has become very evident throughout this process, my issue with stats so far has been linking a concept to practice. It appears to me that I grasped this concept in class, but for some reason I had absolutely no idea what you were talking about earlier when it had to do with my own research. I think we can both agree that this happens to me a lot. Do you have any tips or strategies for "linking" the knowledge of how stuff works to my own practice?

I certainly couldn't give you a complex definition of GLMM, but I know that it is the appropriate model for evaluating multiple predictors along continua and that it adjusts the intercepts by incorporating random effects. I have random effects for participant and item in my current model. What I meant to say is that I don't know why "subj" and the matrix and subordinate verbs would be in the random effects in the following model that you proposed.

```
  (1 + matrix_verb + sub_verb | subj) +
  (1 + activation | item) +
```

In order to construct a maximal model any predictor that is within subject or within item can be given a random slope. This means you allow the effect to vary for that grouping variable and then the model uses partial pooling to get a better estimate of the fixed effects. In essence the model learns more about the data when you account for the data structure through random effects. It's like fitting smaller models inside your big model to help learn about the data. It's pretty complicated and we will only start to touch on it in class. You can think of it like "let the effect of matrix verb frequency vary for each participant and use this information to inform the fixed effect of matrix_verb for the population estimate."

That's suuuuuper cool. Takeaway message: be happy that I have this both within the random intercepts and as predictors. While I did not know this, and am happy to learn it, I think that the reason that I was confused is that I thought that "subj" meant subjunctive (not subject), and so I was wondering why you were putting the dependent variable into a random effect 🙄

jvcasillas commented 3 years ago

How do you measure the MPG of a car with 0 weight? They are estimates generated by the model.

Right. They don't exist in nature.

Exactly. That is why we take advantage of how the model works by centering/standardizing. Why waste a hypothesis (is the value of y when x is 0 > 0?) if it is nonsensical? If all predictors are standardized the hypothesis on the intercept becomes "is the value of y > 0 when all predictors are average in every way?", which is often much more useful.
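In R terms, standardizing is just centering and scaling (sketch; the DELE scores below are made up for illustration):

```r
dele <- c(35, 42, 48, 50, 31)                 # hypothetical DELE scores
dele_std <- (dele - mean(dele)) / sd(dele)     # same as as.numeric(scale(dele))

# dele_std now has mean 0 and SD 1, so in a model the intercept becomes
# the prediction for a speaker with an average DELE score, and each
# coefficient is the change in y per 1 SD change in the predictor
mean(dele_std)
sd(dele_std)
```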

I guess my issue isn't with the idea that the point of multiple regression is to see what happens when we hold something constant; my understanding from the previous post was that I actually needed to do something to the age variable to "take it out of play." If I understand what you're saying (and I think I very much misunderstood before), the only way to do this with age is by having it in the model. Otherwise we wouldn't be controlling for that variable. Right?

Exactly.

OK. As has become very evident throughout this process, my issue with stats so far has been linking a concept to practice. It appears to me that I grasped this concept in class, but for some reason I had absolutely no idea what you were talking about earlier when it had to do with my own research. I think we can both agree that this happens to me a lot. Do you have any tips or strategies for "linking" the knowledge of how stuff works to my own practice?

It's totally normal. You get it through experience. I said at the beginning of the course that in an ideal world this class would be the first of many you take on stats. It just takes time to build a solid foundation. Accept that you won't know it all, but aim to know enough. Keep fitting models and slowly try to figure out more of what you don't get.

I certainly couldn't give you a complex definition of GLMM, but I know that it is the appropriate model for evaluating multiple predictors along continua and that it adjusts the intercepts by incorporating random effects. I have random effects for participant and item in my current model. What I meant to say is that I don't know why "subj" and the matrix and subordinate verbs would be in the random effects in the following model that you proposed.

```
  (1 + matrix_verb + sub_verb | subj) +
  (1 + activation | item) +
```

In order to construct a maximal model, any predictor that is within-subject or within-item can be given a random slope. This means you allow the effect to vary for that grouping variable, and the model then uses partial pooling to get a better estimate of the fixed effects. In essence, the model learns more about the data when you account for the data structure through random effects. It's like fitting smaller models inside your big model to help learn about the data. It's pretty complicated and we will only start to touch on it in class. You can think of it like "let the effect of matrix verb frequency vary for each participant and use this information to inform the fixed effect of matrix_verb for the population estimate."

That's suuuuuper cool. Takeaway message: be happy that I have this both within the random intercepts and as predictors. While I did not know this, and am happy to learn it, I think that the reason that I was confused is that I thought that "subj" meant subjunctive (not subject), and so I was wondering why you were putting the dependent variable into a random effect 🙄

Multilevel models (another name for GLMMs) really are awesome. Very powerful and very misunderstood.

pthane commented 3 years ago

It's totally normal. You get it through experience. I said at the beginning of the course that in an ideal world this class would be the first of many you take on stats. It just takes time to build a solid foundation. Accept that you won't know it all, but aim to know enough. Keep fitting models and slowly try to figure out more of what you don't get.

Honestly I wish there were a self-study option to get credit for a second stats course. I'd totally do it, and I'm quite sure Kyle would too (entre otros).

Multilevel models (another name for GLMMs) really are awesome. Very powerful and very misunderstood.

I added those to the random effects. Thanks for the tips. I have three very brief questions (at least I think they're brief) now that things seem to have sorted themselves out:

  1. Silvia wants me to report odds ratios in the manuscript. How do I do this?
  2. Silvia suggested that I verify that the distribution of participants by proficiency level across groups is similar. I assume she was referring to a t-test, but she didn't specify. Is this what you would recommend?
  3. If I report separate models after running the initial model with task, what do I say in the manuscript? Should I just say that I ran all three models and report them separately? Or should I say "after running model X, two models were run, one for each task?"

Thanks a million, Joseph!

jvcasillas commented 3 years ago

It's totally normal. You get it through experience. I said at the beginning of the course that in an ideal world this class would be the first of many you take on stats. It just takes time to build a solid foundation. Accept that you won't know it all, but aim to know enough. Keep fitting models and slowly try to figure out more of what you don't get.

Honestly I wish there were a self-study option to get credit for a second stats course. I'd totally do it, and I'm quite sure Kyle would too (entre otros).

Multilevel models (another name for GLMMs) really are awesome. Very powerful and very misunderstood.

I added those to the random effects. Thanks for the tips. I have three very brief questions (at least I think they're brief) now that things seem to have sorted themselves out:

  1. Silvia wants me to report odds ratios in the manuscript. How do I do this?

That's kind of silly, but ok. The log-odds are the logarithm of the odds, so you just exponentiate to back-transform: if you have a log-odds of -2, then exp(-2) gives you an odds ratio of about 0.135.
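The back-transformation is a single call to `exp()`. A quick sketch (the -2 coefficient is just an example value, not one from the manuscript):

```python
import math

def log_odds_to_odds_ratio(beta):
    """Exponentiate a logistic-regression coefficient (log-odds)
    to recover the odds ratio for reporting."""
    return math.exp(beta)

print(round(log_odds_to_odds_ratio(-2), 3))  # 0.135
print(round(log_odds_to_odds_ratio(0), 3))   # 1.0 (no effect)
```

An odds ratio below 1 means the predictor decreases the odds of the outcome; above 1, it increases them.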

  2. Silvia suggested that I verify that the distribution of participants by proficiency level across groups is similar. I assume she was referring to a t-test, but she didn't specify. Is this what you would recommend?

No. That wouldn't tell you that. I would just plot them, but if she is asking for a test showing that the groups are not different from each other, you need an equivalence test such as a TOST (two one-sided tests).
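A TOST simply runs two one-sided t-tests against a lower and an upper equivalence bound. A minimal sketch with scipy (the DELE-style scores and the ±5-point equivalence bound are invented for illustration; in practice the bound has to be justified before looking at the data):

```python
from scipy import stats

# Hypothetical proficiency scores for two groups.
hs_scores = [38, 41, 35, 44, 40, 39, 42, 37]
l2_scores = [36, 43, 39, 41, 38, 40, 44, 35]

bound = 5  # equivalence region: group difference within +/- 5 points

# Test 1: is the difference significantly greater than -bound?
p_lower = stats.ttest_ind(
    hs_scores, [x - bound for x in l2_scores], alternative="greater"
).pvalue

# Test 2: is the difference significantly smaller than +bound?
p_upper = stats.ttest_ind(
    hs_scores, [x + bound for x in l2_scores], alternative="less"
).pvalue

# Equivalence is claimed only if BOTH one-sided tests are significant,
# so the TOST p-value is the larger of the two.
p_tost = max(p_lower, p_upper)
print(f"TOST p = {p_tost:.4f}")  # equivalence if p < .05
```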

  3. If I report separate models after running the initial model with task, what do I say in the manuscript? Should I just say that I ran all three models and report them separately? Or should I say "after running model X, two models were run, one for each task?"

Fit the omnibus model and report the main effect of task. If it is significant, you would say something like "the participants' responses varied as a function of task (XXXXX). The data from each task were then refit and are reported separately."

Thanks a million, Joseph!