episphere / connect

Connect API for DCEG's Cohort Study

Mod1 QC issue: Required Questions Somehow Skipped #1110

Open KELSEYDOWLING7 opened 1 year ago

KELSEYDOWLING7 commented 1 year ago

These 7 participants skipped required questions in Module 1. I confirmed they're marked as having completed Mod1, with completion timestamps ranging from 10/5/22 to 4/1/23 and time to complete the survey ranging from 12 to 58 minutes.

Connect IDs:

- 3501714293 - AGECOR skipped
- 6682783044 - AGECOR skipped
- 3501714293 - SEX skipped
- 5166459272 - SEX skipped
- 6006097239 - SEX skipped
- 6682783044 - SEX skipped
- 9406308017 - SEX skipped
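A check of this kind can be sketched as follows. This is only an illustration, not the project's actual QC rule: the record layout (`connect_id`, `mod1_completed`, `answers`), the function name `find_skipped_required`, and the sample IDs are all assumptions made for the example.

```python
REQUIRED = ("AGECOR", "SEX")

def find_skipped_required(records, required=REQUIRED):
    """Return (connect_id, variable) pairs where a completed module lacks a required answer."""
    skipped = []
    for rec in records:
        if not rec.get("mod1_completed"):
            continue  # only completed surveys are expected to have required answers
        answers = rec.get("answers", {})
        for var in required:
            if answers.get(var) is None:
                skipped.append((rec["connect_id"], var))
    return skipped

# Hypothetical records; the IDs and answer values are placeholders, not real data.
records = [
    {"connect_id": "1111111111", "mod1_completed": True,
     "answers": {"AGECOR": "yes", "SEX": None}},
    {"connect_id": "2222222222", "mod1_completed": True, "answers": {}},
    {"connect_id": "3333333333", "mod1_completed": False, "answers": {}},
]

for connect_id, var in find_skipped_required(records):
    print(f"{connect_id} - {var} skipped")
```

Participants who have not completed the module are ignored, since a missing answer there is expected rather than a QC finding.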

cusackjm commented 1 year ago

@KELSEYDOWLING7 can you see how they responded to questions that use SEX responses in the skip logic? E.g., SEX2, MHGROUP8, MHGROUP9, HAIRFEM, HAIRMALE.

KELSEYDOWLING7 commented 1 year ago

@cusackjm Here are the responses:

- 3501714293: Took version 2. @anthonypetersen This participant has 189 entries in Module 1 version 2, a few of which have answers to various questions. Could they have kept clicking start and exiting out or something? Can we delete the entirely empty entries?
  - SEX2: all entries NA
  - MHGROUP8: 535003378 (in 169th entry)
  - MHGROUP9: 535003378 (in 51st entry)
  - HAIRFEM: all entries NA
  - HAIRMALE: 927925620 (in 149th entry)

- 5166459272: Took version 2
  - SEX2: NA
  - MHGROUP8: NA
  - MHGROUP9: NA
  - HAIRFEM: NA
  - HAIRMALE: NA

- 6006097239: Took version 2
  - SEX2: NA
  - MHGROUP8: NA
  - MHGROUP9: NA
  - HAIRFEM: NA
  - HAIRMALE: NA

- 6682783044: Took version 2
  - SEX2: NA
  - MHGROUP8: 535003378
  - MHGROUP9: 797189152
  - HAIRFEM: NA
  - HAIRMALE: NA

- 9406308017: Took version 2
  - SEX2: NA
  - MHGROUP8: NA
  - MHGROUP9: NA
  - HAIRFEM: NA
  - HAIRMALE: NA

woodruffr commented 1 year ago

I'm a little confused as to why this is assigned to us. On March 14th we emailed NCI with testing results. These questions are required when this is tested in the renderer. This is not an issue for the QC Rules. Our question was whether the questions were being answered but not having the responses stored. We can discuss on our regular Wednesday call if we want to add a check at the beginning of any module that references AGECOR or SEX and if the data is missing, then ask the questions.
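The proposed check at the start of a module could look roughly like the sketch below. It assumes the previously saved answers are available as a plain mapping from variable name to value; the function name `questions_to_reask` is hypothetical and not part of the renderer.

```python
def questions_to_reask(saved_answers, referenced=("AGECOR", "SEX")):
    """Return the referenced variables that have no saved value and should be asked again."""
    return [var for var in referenced if saved_answers.get(var) in (None, "")]

# Hypothetical usage: SEX was never saved, so it is asked before the module starts.
missing = questions_to_reask({"AGECOR": "yes"})
print(missing)
```

If the returned list is empty, the module would proceed as it does today.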

danielruss commented 1 year ago

@KELSEYDOWLING7 I tried the modules and could not skip AGECOR (D_479353866). Out of curiosity, did they answer D_150344905 and D_783167257? Is it ok to look in the database? @anthonypetersen The Quest Tree is saved in Firestore. We could see the route they travelled in the questionnaire, if that is not against the IRB rules.

KELSEYDOWLING7 commented 1 year ago

@danielruss 5166459272 has a response for those two questions, but 6006097239 & 6682783044 & 9406308017 do not

3501714293 has an answer for D_783167257 in one of 189 entries but no answer for D_150344905

danielruss commented 1 year ago

OK, this is weird.
If 5166459272 answered both, (s)he MUST have answered NO to AGECOR. If 3501714293 answered D_783167257, this agrees with answering YES to AGECOR.

The other two could have answered yes and skipped D_783167257. Again, if I could peek at the tree, I could tell you if they saw the question.
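The reasoning above can be written down as a small lookup. This sketch encodes only the cases stated in this comment, not the questionnaire's actual skip-logic definition; the function name `infer_agecor` and the returned labels are assumptions, and the one case the comment does not cover is reported as undetermined.

```python
def infer_agecor(has_d150344905, has_d783167257):
    """Guess the AGECOR answer from which follow-up questions have saved responses."""
    if has_d150344905 and has_d783167257:
        return "NO"  # both follow-ups answered
    if has_d783167257:
        return "consistent with YES"  # only D_783167257 answered
    if not has_d150344905:
        return "possibly YES (D_783167257 skipped)"  # neither answered
    return "undetermined"  # only D_150344905 answered: not covered by the comment

print(infer_agecor(True, True))
print(infer_agecor(False, True))
print(infer_agecor(False, False))
```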

anthonypetersen commented 1 year ago

@danielruss correct, we store it there. Couldn't tell you whether the IRB allows looking at it or not.

cusackjm commented 1 year ago

I'll ask Amelia to see if she knows if the IRB allows that

cusackjm commented 1 year ago

@danielruss @anthonypetersen you are allowed to look at the raw data in Firestore to look at how the participant responded to those questions. Just no sharing/screenshots of PII

danielruss commented 1 year ago

@anthonypetersen Why does one person have so many documents associated with their module 1?
Module 1 v_2 for ID 3501714293 appears to have 1 document for every QUESTION! This is SUPER inefficient. This may explain why Connect is backing up BILLIONS of documents. This is over 50% of the total GCP cost of Connect.

anthonypetersen commented 1 year ago

@danielruss not that it makes it better, but there are only a handful of participants this has happened with. we've already shown that the reason we are backing up so many documents is because we back up our entire participants table every hour

danielruss commented 1 year ago

@KELSEYDOWLING7 @anthonypetersen I took a look at the path 5166459272 took through the module... OMG that person must HATE us. They kept getting sent to the start of the module. I don't think I could reproduce this. The other option is that the PWA did not write the tree appropriately.

danielruss commented 1 year ago

[Attachment: Presentation1]

danielruss commented 1 year ago

6682783044 also took a weird route (INTROM1 -> INTROBAC -> MIDDLE_OF_RACE -> CANCER); everything was OK after CANCER.
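Checking a saved route for these problems can be sketched as below. It assumes the tree can be flattened into an ordered list of question IDs and that AGECOR and SEX appear in it under those names; both helper names are hypothetical.

```python
def unvisited_required(route, required=("AGECOR", "SEX")):
    """Return the required questions that never appear in the visited route."""
    seen = set(route)
    return [q for q in required if q not in seen]

def count_restarts(route):
    """Count how many times the route returns to its first question."""
    if not route:
        return 0
    return route[1:].count(route[0])

# The route reported above for 6682783044.
route = ["INTROM1", "INTROBAC", "MIDDLE_OF_RACE", "CANCER"]
print(unvisited_required(route))
print(count_restarts(route))
```

A nonzero restart count would flag a participant who kept being sent back to the start of the module.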

KELSEYDOWLING7 commented 1 year ago

@danielruss Oh wow... it looks like 5166459272 started and finished on Saturday 12/31/22 and 6682783044 on 10/5/22. But I don't see any module 1 change pushed to prod on either of those days that would've caused a crash for them. Weird

KELSEYDOWLING7 commented 1 year ago

Another participant was somehow able to skip both AGECOR and SEX: Connect ID 6789382906

KELSEYDOWLING7 commented 1 year ago

3 more participants were able to skip AGECOR: 4117995761, 6782817889, 9898079683

KELSEYDOWLING7 commented 11 months ago

@FrogGirl1123 @anthonypetersen @cusackjm Following up on this issue. There continue to be new participants added to the QC list where we don't have a saved value for the required questions SEX and/or AGECOR.

I did a deep dive on these participants (Overview in Box: https://nih.app.box.com/file/1363355677650), and it looks like the skipped age or skipped sex, while not showing up on BQ, is somehow saving on their end. I'm assuming this is because they haven't cleared their cache? Either way, those who don't have a saved answer for SEX in Mod1 are still seeing the MENSHIS and PREG sections of Mod2 (for those who are clearly female by their first name in the user profile). And those who don't have a saved answer for AGECOR do still see the appropriate sections for LIFE variables in age3.

I'm not quite sure what to make of this, or if it could become a problem if it stops saving on their end AND we don't have a SEX or AGECOR response.

KELSEYDOWLING7 commented 11 months ago

@anthonypetersen @danielruss Any thoughts on this one? There's concern that because these two variables are pulled in many current and future surveys, this could cause a lot of issues down the line.

danielruss commented 11 months ago

@KELSEYDOWLING7 I don't have access to the Box document

KELSEYDOWLING7 commented 11 months ago

@danielruss Oh sorry, I don't have permission to add anyone to the Box folder, but here's a copy of the file: Participants with Required Questions SEX and_or AGECOR NA or Not Saved.docx

danielruss commented 11 months ago

@KELSEYDOWLING7 I looked in the data and see results for AGECOR for 4117995761, 6782817889, 9898079683. Please call via Teams.

danielruss commented 11 months ago

@anthonypetersen I looked in the data for participant 6682783044 and AGECOR is not in Firestore. However, Kelsey points out that the skip patterns are consistent with having made a selection. I think we need to look into whether or not the Firestore API call is returning an error.
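One way to surface a silently failing write is to log any exception and read the value back afterwards. The sketch below is generic and does not use the Firestore client or the PWA's actual save code: `write_fn` and `read_fn` are hypothetical callables standing in for whatever performs the API call, and the in-memory `store` exists only for the demonstration.

```python
import logging

logger = logging.getLogger("response_writer")

def save_with_check(write_fn, read_fn, connect_id, variable, value):
    """Write one answer, then read it back; return True only if the stored value matches."""
    try:
        write_fn(connect_id, variable, value)
    except Exception:
        # Record the failure instead of letting the survey continue silently.
        logger.exception("write failed for %s / %s", connect_id, variable)
        return False
    stored = read_fn(connect_id, variable)
    if stored != value:
        logger.error("read-back mismatch for %s / %s", connect_id, variable)
        return False
    return True

# In-memory stand-ins for the real storage calls (assumptions for this example).
store = {}

def fake_write(connect_id, variable, value):
    store[(connect_id, variable)] = value

def fake_read(connect_id, variable):
    return store.get((connect_id, variable))

def failing_write(connect_id, variable, value):
    raise RuntimeError("simulated API error")

print(save_with_check(fake_write, fake_read, "1111111111", "AGECOR", "yes"))
print(save_with_check(failing_write, fake_read, "1111111111", "SEX", "female"))
```

A participant whose skip pattern reflects a selection that never reached the database would show up here as a logged failure rather than as a missing value found months later in QC.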

KELSEYDOWLING7 commented 10 months ago

@danielruss @anthonypetersen Good morning, is there any update on this one?

anthonypetersen commented 10 months ago

I don't have any updates at the moment

KELSEYDOWLING7 commented 10 months ago

@anthonypetersen @danielruss Ok, do we have a rough ETA? We'd like to get this completed with the January prod push

anthonypetersen commented 10 months ago

With everything we already have planned, I don't see this getting completed this month.

KELSEYDOWLING7 commented 10 months ago

@anthonypetersen Ok, thanks for the update. I'll let Nicole know

KELSEYDOWLING7 commented 10 months ago

@cusackjm @FrogGirl1123

Daniel was able to look into the module tree for three of these participants to see the order of questions, and there are a lot of very odd and possibly concerning scenarios. Not quite sure how to proceed and if some of them should retake the survey. This also seems to be bigger than just a data saving issue.

CID 6682783044 saw: INTROMOD1, INTROBAC, RACEETH. Somehow AGECOR and MARITAL were never saved in the tree, which I believe means that they never saw the questions. They also somehow did not see the SEX question (according to the tree). They did their survey the same day, on Wednesday, October 5, 2022, around 7:40pm. Idk if we had any prod pushes going out then that would've affected their survey

The order 5166459272 saw makes no sense. @danielruss can speak more to this, but this is the participant we suggest should retake the survey. This participant did theirs on 12/31/22 for about an hour, so I'm curious if we had an end of the year push at that time, thinking no one would be taking it

9406308017 seems to have gone back and forth from INTROBAC all the way to the end, then back to INTROBAC, with only 2 questions answered. Yet they took 40 minutes to complete the survey. Their timestamps were on 1/19/23, so the dates range a bit

KELSEYDOWLING7 commented 6 months ago

@danielruss @anthonypetersen Would we have time to look into fixing this for the May release? We had 6 people with a missing sex at birth and a submitted Module 1 survey this month alone, which is the highest jump we've ever had in such a short amount of time. And for reference, there are only 17 that are missing sex at birth in total. @m-j-horner @cusackjm @FrogGirl1123