Open alvinwmtan opened 2 months ago
IDless import complete; ready for review
Checklist for code review v2024
To start:
Common issues to check:
Trials
Trial Types
Stimuli
Subjects
General
@alvinwmtan Is the raw data for kremin on the peekbank OSF? (I'm not able to download it and I'm not seeing it there.)
sorry, just uploaded. should be there now
Thanks! Might I also have "demo_comp.Rda"?
ah yes sorry, forgot i had to copy it over from sander-montant_2022
sorry to keep asking for files (@alvinwmtan), but I can't find target-distractor-pairs.csv, trial_info_fr.csv, trial_info_sp.csv, and also the images if we have them
my bad; added them. there are wmv files that could be screenshotted to grab the images but i haven't done so—feel free to!
note also that trial_info_fr.csv and trial_info_sp.csv were constructed by me rather than from the original raw_data; it is possible but just somewhat annoying to programmatically pull together all the different pieces of info needed.
Images could be pulled via screenshot.
The README mentions that CDI data for the montreal subset exists but has not been imported yet (same for DVAP, another vocab measure).
@alvinwmtan I'm getting a validation error because of an aoi region that is all NA's -- it looks like this is coming from the fact that one of the datasets has aoi coordinates and the other doesn't (hand coded) and NAs get added to that one in a bind_rows? Does that sound right? / Do you have ideas for fixing?
this is correct (montreal has AOIs and princeton doesn't). somehow i didn't get a validation error when i ran it though? not sure why this is popping up
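For anyone following along, here is a minimal sketch of how an all-NA AOI row can appear. It assumes invented data: the `lab` column and the coordinate values are hypothetical and not taken from the actual import script. `dplyr::bind_rows` matches columns by name and fills any column missing from one input with NA, so a hand-coded subset with no AOI columns ends up with NA coordinates.

```r
library(dplyr)

# Hypothetical stand-ins for the two subsets; all values are invented.
montreal <- data.frame(
  lab = "montreal",
  l_x_min = 0, l_x_max = 500,
  r_x_min = 780, r_x_max = 1280,
  stringsAsFactors = FALSE
)
# Hand-coded subset: no AOI coordinate columns at all.
princeton <- data.frame(lab = "princeton", stringsAsFactors = FALSE)

# bind_rows() fills columns that are absent from one input with NA.
aoi_regions <- bind_rows(montreal, princeton)

# Flag rows whose coordinate columns are all NA.
coord_cols <- c("l_x_min", "l_x_max", "r_x_min", "r_x_max")
all_na <- rowSums(is.na(aoi_regions[coord_cols])) == length(coord_cols)
aoi_regions$lab[all_na]
```

A check like this only shows where the NAs come from; whether an all-NA region is acceptable depends on the validator.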
so the validation issue seems to be less about the NAs and more about there being multiple regions in the aoi_region_set but only 1 referenced in the trial_types, which I think I've traced back to
if(!is.na(data$l_x_max[[1]])){
trial_types$aoi_region_set_id <- 0
in the digest_data function at https://github.com/peekbank/peekbank-data-import/blob/33e750103cb1249b35daa01c6a7e9de1da5a6749/helper_functions/idless_draft.R#L284C1-L284C39.
Commenting out that line does seem to "fix" things, but I don't know what its purpose is -- @adriansteffan what is this line supposed to be doing?
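A rough sketch of the kind of mismatch described above, using invented tables. I am not sure what the peekbank validator checks internally, so this is only an assumption about the shape of the problem. The table name `aoi_region_sets`, the `trial_type_id` column, and all ids below are hypothetical.

```r
# Hypothetical tables; ids are invented for illustration.
aoi_region_sets <- data.frame(aoi_region_set_id = c(0, 1))
trial_types <- data.frame(
  trial_type_id = 0:3,
  aoi_region_set_id = c(0, 0, 0, 0)  # every trial type forced to 0
)

# Ids defined in aoi_region_sets but never referenced by trial_types.
unused_ids <- setdiff(aoi_region_sets$aoi_region_set_id,
                      trial_types$aoi_region_set_id)

# Ids referenced by trial_types with no matching region set.
missing_ids <- setdiff(trial_types$aoi_region_set_id,
                       aoi_region_sets$aoi_region_set_id)
```

With every trial type set to 0, region set 1 ends up in `unused_ids`, which matches the "multiple regions but only 1 in trial_types" symptom.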
The line is supposed to remind me to read my code more carefully before committing. I removed it, thanks for the catch!
Kremin et al. (2021) has two data subsets: Eng–Fra bilingual and Eng–Spa bilingual. The former is included in sander-montant_2022, but the latter is not; we will import this whole dataset separately.