jensroes / nonadjacent-sequence-learning

Learning nonadjacent sequences
Apache License 2.0

Current version of paper on overleaf #13

Open jensroes opened 1 month ago

jensroes commented 1 month ago

@Mark-Torrance I've added a version of the draft on overleaf and invited you.

I think the methods and results are alright, although slightly long, and I don't have materials on OSF yet.

I'm working on the introduction at the moment on the basis of what I have from Linda. Linda's version of the discussion is commented out on overleaf too if you're interested.

I suggest targeting PBR as a short paper btw.

jensroes commented 1 month ago

@Mark-Torrance feel free to have a go at the introduction :) I have reduced Linda's introduction substantially, mainly by removing examples, and I've added in some bits from the Leverhulme grant application (actually from older versions of it).

Mark-Torrance commented 1 month ago

I don't understand this:

"First, if adjacent and nonadjacent dependencies are learnt using the same cognitive mechanisms, we expect participants to anticipate target locations (dots) more successfully when adjacent and nonadjacent dependencies are presented in separate blocks than when presented concurrently."

Why?

jensroes commented 1 month ago

I knew you would pick up on this. This was kind of a placeholder. I'm not quite sure what the predictions are from whether adjacent and nonadjacent dependencies use the same underlying mechanism.

So my guess is something like this:

  • If different mechanisms are used, it shouldn't be a problem to learn both dependency types concurrently, possibly with a delay for nonadjacent dependencies, but learning one dependency type shouldn't affect learning of the other.

  • If the same mechanism is used to learn both dependencies, we should observe the same results in mixed presentation as in blocked presentation of dependency types (contrary to what I wrote).

Mark-Torrance commented 1 month ago

So the two-mechanism theory and the one-mechanism theory make the same predictions for mixed presentation and for blocked presentation?

But before we go any further with theory, can I check my understanding:

Sequences in the adjacent condition were ABR and in the non-adjacent were ARB, where R is a random location? We never have RAB in the adjacent condition?

jensroes commented 1 month ago

Sequences in the adjacent condition were ABR and in the non-adjacent were ARB, where R is a random location? We never have RAB in the adjacent condition?

Yes. This is a bit weird cause I remember that I argued for ARB vs. RAB at some point, but I think there was an argument to keep the location of the dependee constant. I think in practice it doesn't matter because the cues are presented continuously.

So the two-mechanism theory and the one-mechanism theory make the same predictions for mixed presentation and for blocked presentation?

I'm not quite sure to be honest. I can see how under both the same- and different-mechanism views you'd expect the same pattern in mixed and blocked presentation, but also how only the same-mechanism view would predict changes in the concurrent presentation.

Mark-Torrance commented 1 month ago

Can I check baseline?

For non-adjacent ARBCRD it would be anticipation of the R's but not C

For adjacent ABRCDR again, it would be anticipation of the R's but not C

jensroes commented 1 month ago

Baseline is all elements that are not part of a dependency, so neither the first part nor the second part, but all locations that have a random position (any position other than those used for dependencies).

Mark-Torrance commented 1 month ago

Ok, so these include Cs in the examples above? In which case we have an issue, I think.

For ARBC and ABRC, where C is a specific but unknown location (the first item in the next sequence):

P(R|A) = P(R|B) = 1/15

So same for both conditions. But for the C (i.e. start of next sequence):

P(C|R) = 1/15 (for the adjacent condition)

P(C|B) = 1/4 (for the non-adjacent condition, because after B the next element is the first item of one of the four sequences)

So if we include the starts of new sequences in the baseline, this will make anticipation in the baseline condition more probable for non-adjacent than adjacent.

Not sure how much this matters. But it's weird that we seem to get above-chance learning in the baseline condition, and this might explain it.

jensroes commented 1 month ago

Ok, so these include Cs in the examples above? In which case we have an issue, I think.

I don't understand what Cs are. All we have is random locations and locations that are part of a dependency.

  • where C is a specific but unknown location (the first item in the next sequence)

This would mean that Cs are not included, as they are part of a dependency.

jensroes commented 1 month ago

I'm reading from Linda's discussion that the reduced learning effect in the mixed presentation could be due to interference between two different cognitive mechanisms for adjacent and nonadjacent dependencies. This is an option for the prediction but unlikely for our data cause we don't observe evidence for nonadjacent dependency learning at all.

Mark-Torrance commented 1 month ago

C is the first item of the next sequence. Ah, so these are included in "part of a dependency". Good. No issue then.

Mark-Torrance commented 1 month ago

Stating the obvious: There's some work to do here to create a coherent argument from our findings. This is what I'm thinking:

On these assumptions we...

  1. Present in the introduction yet-to-be-decided theories of statistical learning that are dissociated by the finding that mixing patterns of association decreases learning. So we need an account that predicts that mixing will make no difference and an account that predicts that mixing will make a difference. Gary might be able to help with this.
  2. We present just current experiments 2 and 3 first, with stand-alone methods, results, and a two-sentence discussion.
  3. For Experiments 2 and 3 we foreground the main effect (overall effect) on learning, presenting findings as a cell-mean profile (dots and PI bars). We also show the timecourse plots because, well, they're cool and it's a new method.
  4. We present just the pooled analysis (because it does the same job as the by-experiment models anyway, but also allows cross-experiment comparison), and foreground the mixed (blocked vs. mixed) * adjacency (adjacent vs. non-adjacent vs. baseline) interaction, parameterising the adjacency factor with treatment coding (comparing both with baseline; see the sketch after this list).
  5. In a very short final section we report current Experiment 1 as a check that under more implicit conditions we still get learning. No pooled analysis.
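For concreteness, a minimal sketch of that treatment coding in R (the data frame and column names are hypothetical):

```r
# Adjacency with baseline as the reference level, so each of the two
# coefficients compares one dependency type against baseline.
# (Treatment coding is R's default; set explicitly here for clarity.)
d$adjacency <- factor(d$adjacency,
                      levels = c("baseline", "adjacent", "nonadjacent"))
contrasts(d$adjacency) <- contr.treatment(3)
```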

Assuming that we can do (1), then I think that if we present in this way we might have a coherent paper. Though I'm still imagining struggles with reviewers. Whether or not it's worth putting this effort in is a different matter. I think for Karl's sake it might be worth pursuing. But if that wasn't the case then I think we have better uses for our time.

jensroes commented 1 month ago

I've changed the intro a little and wrote a discussion (I agree this needs work but every paper I've ever reviewed did too).

The interesting finding here relates to the effects of mixing non-adjacent with adjacent associations. The fact that we are not getting any evidence for learning in the non-adjacent condition, but this still affects learning in the adjacent condition, is important. So the focal effect is the decrease in learning in the adjacent condition in Experiment 3 (mixed) compared to Experiment 2 (not mixed).

Agreed

We'll ignore the fact that this is across experiments, and might just be a sampling difference.

Also with Linda we had this argument with the editor of Developmental Psychology. They seemed to accept that in some situations it is difficult or impossible to do everything in one experiment. I've provided a pooled analysis across experiments that showed the interactions we need to suggest that this isn't just about sampling differences.

So, cool that we know about timecourse (I suppose) but no evidence of different patterns of learning over time. This is also clear from the plots. Which is probably a good thing because we don't have a prediction relating to timecourse.

Agreed. This is why I have de-emphasised the timecourse component and just mentioned the new paradigm as an aside in the last paragraph of the introduction.

So we need an account that predicts that mixing will make no difference and an account that predicts that mixing will make a difference. Gary might be able to help with this.

Good idea.

We present just current experiments 2 and 3 first, with stand-alone methods, results, and a two-sentence discussion.

I think this isn't a good idea because 1) too much changing around now and 2) 3 experiments looks stronger than 2.

For Experiments 2 and 3 we foreground the main effect (overall effect) on learning, presenting findings as a cell-mean profile (dots and PI bars).

I have done this in the past but I think we don't need that because the main effect is clear from the timecourse plot.

We present just the pooled analysis (because it does the same job as the by-experiment models anyway, but also allows cross-experiment comparison),

I thought about this, but reviewers are very likely to ask for the non-pooled analysis, I think, and we also need the model comparisons within experiment.

I think for Karl's sake it might be worth pursuing. But if that wasn't the case then I think we have better uses for our time.

I agree but I've also now spent so much time on this (analysis, training the RA and Karl, supervising Karl, writing stuff, etc.) that I'd like to go the final mile but without the usual effort :) After today I feel the paper got slightly more coherent because I think I see how the instructions manipulation is linked to the mixed / blocked design manipulation. PBR might be a long shot but we might as well submit to a strong journal before we submit it elsewhere.

jensroes commented 1 month ago

Present in the introduction yet-to-be-decided theories of statistical learning that are dissociated by the finding that mixing patterns of association decreases learning. So we need an account that predicts that mixing will make no difference and an account that predicts that mixing will make a difference. Gary might be able to help with this.

Shall we ask Gary first and give him a PDF of the current draft to see what he thinks?

Mark-Torrance commented 1 month ago

We present just current experiments 2 and 3 first, with stand-alone methods, results, and a two-sentence discussion.

I think this isn't a good idea because 1) too much changing around now and 2) 3 experiments looks stronger than 2.

I'm not suggesting dropping Experiment 1. Just representing it as a methodological check, and so presenting it as a stand-alone experiment last. Currently the results are hard to follow partly because they muddle together explicit / implicit (not important) with the adjacency condition. It took several readings by me to work out what was going on, and I understand the design.

For Experiments 2 and 3 we foreground the main effect (overall effect) on learning, presenting findings as a cell-mean profile (dots and PI bars).

I have done this in the past but I think we don't need that because the main effect is clear from the timecourse plot.

I agree that the timecourse plots tell readers what they need to know but (a) they give more information than readers need and (b) the effect really doesn't look that impressive. This might be the only thing that reviewers properly look at, and it needs to point them strongly in the right direction. With a main effect plot you can make the effect look bigger (because you don't need y to go down to zero), and you also avoid awkward questions about what the timecourse is about. Also, also, currently the non-adjacent condition disappears behind the others.
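For what it's worth, a minimal ggplot2 sketch of such a cell-mean figure (assuming a hypothetical data frame `cells` with per-condition means and interval bounds):

```r
library(ggplot2)

# Cell-mean profile: dots with interval bars; the y-axis is not forced
# down to zero, so the effect is shown at the scale it lives on.
ggplot(cells, aes(x = condition, y = mean)) +
  geom_point(size = 2) +
  geom_errorbar(aes(ymin = lo, ymax = hi), width = 0.1) +
  coord_cartesian(ylim = c(8, 22))  # illustrative limits only
```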

We present just the pooled analysis (because it does the same job as the by-experiment models anyway, but also allows cross-experiment comparison),

I thought about this, but reviewers are very likely to ask for the non-pooled analysis, I think, and we also need the model comparisons within experiment.

Agreed. But you could demote the by-experiment analysis to an appendix. Again, the simpler the better.

I take your point about investment of your time. So this isn't just about Karl. But I do think investing a bit of time now in getting things as clear and as simple as possible saves effort in the long run. If it's rejected, at least you get comments from reviewers who have understood the paper, so you have something to build on.

I'm happy to go with whatever you want to do here. I just think that there are some relatively easy changes that we could make that would substantially increase chances of success.

Mark-Torrance commented 1 month ago

Whatever the outcome of the above discussion, I'm looking at what you've written now (quickly). Then I agree, send to Gary.

jensroes commented 1 month ago

because they muddle together explicit / implicit (not important) with the adjacency condition.

From Linda's write-up and from previous discussions I had with her, it seems that this is actually important. I struggled a bit to see the connection to dependency learning, but although no theory seems to spell it out, there is evidence to support that adjacent learning is implicit and nonadjacent dependency learning is more likely to be observed in explicit contexts. In the discussion I tried to make an argument for why our data wouldn't support this view.

Personally I agree that the focus on mixed and blocked dependencies might be better. Let's see how Gary feels about this.

It took several readings by me to work out what was going on, and I understand the design.

Do you mean the results section? If so that's probably more about my writing than it is about the results. I thought the results are straightforward (even if not what one would predict).

With a main effect plot you can make the effect look bigger (because you don't need y to go down to zero), and you also avoid awkward questions about what the timecourse is about.

I generally agree that it's probably enough to show the cell means (either after controlling for timecourse or at the end of the trial), but if we don't show the timecourse there would immediately be questions about the timecourse. In that case we should probably move the timecourse plot to the appendix.

Agreed. But you could demote the by-experiment analysis to an appendix. Again, the simpler the better.

I agree about the simpler the better, but I think the predictions really treat the series of experiments as a between-subjects design. The by-experiment interactions are important (and don't take up a lot of space).

Mark-Torrance commented 1 month ago

It took several readings by me to work out what was going on, and I understand the design.

Do you mean the results section? If so that's probably more about my writing than it is about the results. I thought the results are straightforward (even if not what one would predict).

Partly your writing. I think you could make more concessions to reader ignorance and lack of motivation, but largely I think it's because readers have to keep in mind what is, across the three experiments, a complex design. I think if you are going to present results from all three experiments at once you make explaining the results more difficult.

I'm wondering if, without doing any more reanalysis, we could restructure this so that each experiment is described separately, then finish with the pooled analysis. I don't think this will take any extra words - just loads more headings - and it would substantially reduce the memory load on the reader. If we do this we can also switch the order so that Exp 1 becomes Exp 3.

If you think this is worth doing, I can make the changes. I suggest cloning the existing document, so I can refer back to it (if we do this).

jensroes commented 1 month ago

If you think this is worth doing, you can make those changes in the existing document (no need to clone it, and I have the overleaf license :)).

jensroes commented 1 month ago

Oh but this strategy would de-emphasise the role of implicit / explicit learning a lot. As I said above, I think this is an important factor from Linda's point of view, but I can see advantages in removing it; I wasn't convinced of its importance in the beginning and probably still am not.

Mark-Torrance commented 1 month ago

Oh but this strategy would de-emphasise the role of implicit / explicit learning a lot. As I said above, I think this is an important factor from Linda's point of view, but I can see advantages in removing it; I wasn't convinced of its importance in the beginning and probably still am not.

Under what I'm suggesting we can still report experiments in the existing order. And actually that would make sense. We can do "in experiment 1 we didn't find learning in the non-adjacent condition. This may be because our instructions were implicit. So we were more explicit in Exp 2." That way we don't have to discuss implicit / explicit in the introduction at all.

If you think that making the changes that I've suggested is worth it, then I'll do them. If you want to submit with the present structure, that's fine as well. You decide and I'll act :)

jensroes commented 1 month ago

Okay, let's give this a go. I'll work on the figure(s).

Mark-Torrance commented 1 month ago

Good. I am going to start a new version. This means I can easily copy and paste from the old one.

Mark-Torrance commented 1 month ago

Can I just check:

[image]

these are means, which should roughly correspond to cumulative hits at occurrence 20, right? Not at occurrence 40.

jensroes commented 1 month ago

Plots are for cumulative hits during the anticipation period across all four sequences?

Yes

which should roughly correspond to cumulative hits at occurrence 20, right? Not at occurrence 40.

I used the end of the learning sequence for this, but the pattern is roughly the same across all occurrences, only lower at the beginning than at the end (of course). Which doesn't explain why it is above chance:

1/16 * 40 * 4

is 10. It needs to be 16, right?

I think the problem might be that this assumes that every dot has an equal a priori chance of being fixated, but that's not true I think. The resting position of the eyes would be on dots in the centre, so I guess anything in the centre is more likely to be fixated, and the head constraint (which I think we used) might mean that participants are less likely to fixate the corners of the screen cause it hurts :)

We agreed at some point that it is difficult to determine what chance means in this paradigm, which is the reason why we included a baseline measure as an empirical estimate of chance. Good news is, it's consistent across experiments.

Mark-Torrance commented 1 month ago

Plots are for cumulative hits during the anticipation period across all four sequences?

Yes

which should roughly correspond to cumulative hits at occurrence 20, right? Not at occurrence 40.

I used the end of the learning sequence for this, but the pattern is roughly the same across all occurrences, only lower at the beginning than at the end (of course). Which doesn't explain why it is above chance

Good. This needs to be clear in the captions. It might reduce confusion to plot per-sequence (i.e. divide by 4), or at least indicate in the caption that this is aggregated across sequences.

It's 1/15 because the next location can't be the same as the current location.

I agree that there will be a bias towards dots in the center. But the random locations are, well, random. So if a participant always fixated the same dot throughout you would still get (an average of) 10.66 hits on random targets.

I think I've got it, though. The probability of hitting random before learning is 1/15. But learning AB in ABR is going to restrict the prior to exclude A (probably). So 1/14. If you learn four sequences, then the prior may exclude some or all dependent items. So this could reduce chance to 1/(16-8) = 1/8.
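To put numbers on this, a minimal sketch using the figures from the thread (16 locations, 40 occurrences, 4 sequences; `pool` is the number of locations a participant still considers possible):

```r
# Expected cumulative hits on random targets given the candidate pool size.
expected_hits <- function(pool, occurrences = 40, sequences = 4) {
  occurrences * sequences / pool
}
expected_hits(15)  # no learning: ~10.7
expected_hits(14)  # one dependency item excluded: ~11.4
expected_hits(8)   # all eight dependency items excluded: 20
```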

Now obviously this is dependent on them actually learning dependencies. Which they do (a bit) in the adjacent condition, and they don't in the non-adjacent condition.

Do you see where this is heading...? :)

I think we need to calculate baseline separately for adjacent and non-adjacent. Learning in non-adjacent does seem to be above random (10.6) but we are comparing with a baseline that's dependent on learning.

Sorry, I know you want to get this paper sorted quickly. These are important issues for the Leverhulme project, though. So they're worth thinking through.

jensroes commented 1 month ago

Sorry, I know you want to get this paper sorted quickly. These are important issues for the Leverhulme project, though. So they're worth thinking through.

It's not a bad thing that there is nothing like "quick" in our workflow and I agree that we might as well think about this now.

I think we need to calculate baseline separately for adjacent and non-adjacent. Learning in non-adjacent does seem to be above random (10.6) but we are comparing with a baseline that's dependent on learning.

Okay, I see your point, and I think I did that at some point where I just modelled all locations as dependee, dependent, random. Question is, do we think this is worth doing now, or is this something we leave for reviewers to pick up on :)

jensroes commented 1 month ago

Actually this wouldn't even be a big change in the code, it would just take a bit of time cause the models aren't quick.

Mark-Torrance commented 1 month ago

I think I did that at some point where I just modelled all locations as dependee, dependent, random.

But that wouldn't do it. What you need is dependent (adjacent), dependent (non-adjacent), baseline (adjacent), baseline (non-adjacent). Obviously this is only possible for Exp 1 and Exp 2. So dependent vs. random becomes a fixed effect, and we are interested in its interaction with adjacent vs. non-adjacent.
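A minimal brms-style sketch of that design (variable names are hypothetical, and the family is a placeholder until we settle on one):

```r
library(brms)

# role: dependent vs. baseline (random); adjacency: adjacent vs. non-adjacent.
# The role:adjacency interaction tests whether learning relative to baseline
# differs between the two dependency types.
m <- brm(hits ~ role * adjacency + (role * adjacency | ppt),
         family = negbinomial(), data = d)
```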

jensroes commented 1 month ago

Sorry yes I got that.

Obviously this is only possible for Exp 1 and Exp 2.

Then I misunderstood you. The random one in the mixed presentation can either have the role of a dependency intruder (nonadjacent) or follow an adjacent dependency. Wouldn't it be important to distinguish these two options?

Mark-Torrance commented 1 month ago

Obviously this is only possible for Exp 1 and Exp 2.

Then I misunderstood you. The random one in the mixed presentation can either have the role of a dependency intruder (nonadjacent) or follow an adjacent dependency. Wouldn't it be important to distinguish these two options?

I think the rationale for looking at R in adjacent and non-adjacent is independent of the location of R (where R is a random location). So in the account that I was developing, you could have a sequence that went ABRCDRRRRRRRRR and the probability of hits on any of those Rs would be dependent on learning of AB and CD (because it's possible that learning a dependency removes its items from the pool of candidates for non-dependents). This effect isn't specific to Rs that are just in ABR or ARB sequence positions.

Ah, but now I'm questioning my account. Can you remember how we (I guess it was originally me) chose random locations? For sequence ABR, is it possible that R is actually location A (or any of the other locations involved in dependencies)?

If so, then although we might have participants whose prior has reduced to just 8 locations, this isn't going to help them (i.e. isn't going to increase hits), because random is actually chosen from 15 locations.

jensroes commented 1 month ago

Can you remember how we (I guess it was originally me) chose random locations? For sequence ABR, is it possible that R is actually location A (or any of the other locations involved in dependencies)?

That must have happened before my time. I assumed the whole time that random locations are dots that are not part of a sequence.

Mark-Torrance commented 1 month ago

Ok, I'll check. This will be down to whoever constructed the trial data. Which might have been me. Is the location data (the dot number) in your data? If so, the easiest way to check is just by looking at that. Can you paste a data file here?

jensroes commented 1 month ago

Do you mean this? trials.csv

This was used in exp 1 and 2

Mark-Torrance commented 1 month ago

Could you let me have trials.csv for Exp 3?

I've checked Exp 1 and 2. Randoms were chosen from locations not included as dependents in any sequence. So as learning of dependencies increases, randoms become increasingly predictable.

This explains learning for random locations. Participants are slowly learning the set from which these are randomly drawn.

It does mean that we need to model baseline separately for adjacent and non-adjacent. We should be observing lower learning in baseline in the non-adjacent condition. Relative to this new baseline we might find some evidence of learning in the non-adjacent condition.

We can't separate out X for adjacent and non-adjacent in Exp 3. I know that we can identify X separately for adjacent and non-adjacent, but the mechanism here is that learning of any dependency also contributes to learning the set from which all X's are drawn.

jensroes commented 1 month ago

trials_exp3.csv

jensroes commented 1 month ago

I'm trying to work out if this makes sense and/or if I made a mistake cause I think the results below seem weird. This is just for exp 1.

I've made the following changes to the model:

So whether the results make sense or not, I have convinced myself that the model is doing what it should.

[figure: exp_post]

This is a bit weird I think cause now it looks like there is a learning effect for nonadjacencies and a negative effect for adjacencies. I can see how this makes sense: for adjacencies it means seeing a cue dot and a random one gives participants more help to rule out incorrect dots than just seeing the cue dot. This would also explain the effect for nonadjacencies, I think.

Mark-Torrance commented 1 month ago

It makes about as little sense as it's possible to make :)

What are the observed cumulative frequencies?

jensroes commented 1 month ago

I can see how this makes sense: for adjacencies it means seeing a cue dot and a random one gives participants more help to rule out incorrect dots than just seeing the cue dot. This would also explain the effect for nonadjacencies,

I don't think the observed data help a lot. If I visualise the data by participant it looks like there is more diversity in the target looks for adjacent dependencies compared to non-adjacent dependencies.

[figure: by_ppt]

And a boxplot for the last occurrences: [figure: boxplot]

Mark-Torrance commented 1 month ago

I can see how this makes sense: for adjacencies it means seeing a cue dot and a random one gives participants more help to rule out incorrect dots than just seeing the cue dot. This would also explain the effect for nonadjacencies,

Can you unpack that (ideally with an example)? Are you explaining baseline or dependency learning?

Mark-Torrance commented 1 month ago

So the observed cumulative means follow the same patterns as the modelled means, but values look more appropriate. It should lie somewhere between 40 * 4 * 1/16 = 10 and 40 * 4 * 1/8 = 20. Though closer to 10 than 20 perhaps.

jensroes commented 1 month ago

So the observed cumulative means follow the same patterns as the modelled means, but values look more appropriate.

I can easily model the means for the last occurrence instead which would be more similar to the observed ones.

Can you unpack that (ideally with an example)? Are you explaining baseline or dependency learning?

Both I think. I was thinking about this and a lot of my thoughts feel like I'm trying to find post hoc explanations. I think the better question here would be if this pattern can be found in Exp 2 as well. If not, we don't need to try to explain it really. I'll check this.

Say we have sequences ABR and ARC where R is random, A is the cue dot, B is the adjacent target, and C is the non-adjacent target. If A marks the beginning of a sequence (which is something participants would need to have identified), the third position always has a higher chance of being predicted correctly, cause AB and AC just lit up, and possibly the same for other more recent dots. Baseline is earlier in nonadjacent dependencies than in adjacent dependencies, which would make it easier to predict the correct dot. This might also explain the advantage for nonadjacent dependencies compared to their baseline. I think what I can't explain is the low value for adjacent dependencies.

jensroes commented 1 month ago

I think the better question here would be if this pattern can be found in Exp 2 as well.

We do find the same pattern in Exp 2 with a slightly smaller difference for adjacent dependencies.

jensroes commented 1 month ago

Ah, the difference in the results comes from the zero inflation, which shows a better fit as well, so I guess this is dealing with participants who didn't show learning. I'm really not proud of this way of selecting a model but we're back where we started from, at a zero-inflated negative binomial model.

These are the results from experiment 1 from a zero inflated negative binomial model:

[figure: zinb_results]
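For the record, a minimal brms sketch of that model (predictor and data names are hypothetical):

```r
library(brms)

# Zero-inflated negative binomial: the zero-inflation component absorbs
# participants who show no learning at all.
fit_zinb <- brm(hits ~ condition * occurrence + (condition | ppt),
                family = zero_inflated_negbinomial(), data = exp1)
```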

jensroes commented 1 month ago

I think this makes sense. The results above replicated in both blocked-presentation experiments, and the learning effect disappears for both dependency types in mixed presentation. I'll update my version of the overleaf paper for now.

jensroes commented 1 month ago

This would be the new results with experiments reordered according to your suggestion.

cellmeans.pdf

Edit: I made a mistake with the middle panel but I'll fix it. Sorry, I keep posting premature results.

Mark-Torrance commented 1 month ago

(on phone and poor internet) these are observed or modelled? Either way, I think we have a much clearer story now?


jensroes commented 1 month ago

Modelled of course :) I do think the story is clearer now. I'll keep working on my version of the manuscript and create a results section for what was exp 2 and 3 (explicit instructions) and another for exp 1 (implicit instructions), which is what you suggested we should do.

Mark-Torrance commented 1 month ago

My suggestion was also that we report each experiment separately, which I still think is clearer. I've started editing along those lines. If that's what we're doing, then separate plots for each expt, then pooled analysis for the two explicit-instruction exps.

jensroes commented 1 month ago

I started including your changes in my version of the overleaf paper and I think it would be good if we could keep working on one document.

I'm presenting Experiments 1 and 2 (former 2 and 3) in one section and Experiment 3 (former 1) in a separate section. This made sense to me because I thought you said it would be better to reduce the relevance of the instructions factor (agreed). I think reporting Experiments 1 and 2 (and the pooled analysis) in one section and Experiment 3 in a separate section allows us to report the results more concisely and reduces repetition and clutter. (This is, for some reason, what I remember from what you said.)

Also I think having all results in one plot makes it easier to compare across studies (as a reviewer I'd want to compare results across studies, which is difficult when there are 3 plots on different pages).