Closed (sathvikn closed this 9 months ago)
I've implemented it so that, in the config file, we can put "/" on either side of the text that makes up the critical region. I chose this more manual approach over defining which nodes belong to the critical region because it is more extensible, and because we may want to reuse nodes without having to define multiple critical regions.
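For reference, here is a minimal sketch of how "/"-delimited text could be turned into a clean sentence plus word indices. The function name `extract_critical_region`, the whitespace tokenization, and the inclusive zero-based indices are all assumptions for illustration, not what the repo actually does.

```python
def extract_critical_region(marked: str, marker: str = "/"):
    """Strip the markers and return (sentence, start, end).

    start and end are zero-based, inclusive indices into the
    whitespace-separated words of the cleaned sentence (an assumption;
    a model tokenizer would need its own alignment step).
    """
    if marked.count(marker) != 2:
        raise ValueError(f"expected exactly two {marker!r} markers, got: {marked!r}")

    before, region, after = marked.split(marker)
    before_words = before.split()
    region_words = region.split()
    if not region_words:
        raise ValueError("critical region is empty")

    words = before_words + region_words + after.split()
    start = len(before_words)
    end = start + len(region_words) - 1
    return " ".join(words), start, end


sentence, start, end = extract_critical_region("I know who you talked /to yesterday/")
print(sentence, start, end)
```

Raising on anything other than exactly two markers keeps malformed config entries from silently producing a wrong region.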
That sounds fine for now. Also, I'm thinking the easiest way might just be to have two fields in the JSON that specify the critical region for the filler and the gap? It's not the most urgent, but we can keep this open until we figure it out.
Lan et al. compute their effect size by taking the difference in surprisal over a critical region between grammatical and ungrammatical sentences, once with a filler and once without, and then checking whether the difference between Delta-filler and Delta+filler is greater than zero. Using only the last word might be sufficient for the replication here, but specifying the start and end indices of the critical region could be useful for other syntactic structures we end up testing later.
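A rough sketch of that computation, given per-word surprisals and region indices. The names `region_surprisal` and `filler_effect` are hypothetical, and the sign convention (each delta is the no-gap surprisal minus the gap surprisal, and the effect is the +filler delta minus the -filler delta) is my assumption and should be checked against the paper before we rely on it.

```python
from typing import Sequence


def region_surprisal(surprisals: Sequence[float], start: int, end: int) -> float:
    """Sum per-word surprisal over the inclusive range [start, end]."""
    if not 0 <= start <= end < len(surprisals):
        raise IndexError(f"region ({start}, {end}) is out of range")
    return sum(surprisals[start:end + 1])


def filler_effect(
    plus_filler_no_gap: float,
    plus_filler_gap: float,
    minus_filler_no_gap: float,
    minus_filler_gap: float,
) -> float:
    """Difference between the two deltas (sign convention is assumed).

    Each argument is the critical-region surprisal for one condition.
    """
    delta_plus_filler = plus_filler_no_gap - plus_filler_gap
    delta_minus_filler = minus_filler_no_gap - minus_filler_gap
    return delta_plus_filler - delta_minus_filler


# Toy numbers only, not real model output.
effect = filler_effect(8.0, 3.0, 2.0, 6.0)
print(effect > 0)
```

Setting `start == end` to the last word's index recovers the last-word-only version, so the same code covers both options.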
Acceptance Criteria:
- [ ] Two columns in the generated CSVs indicating the start and end of the region over which we compute surprisal.