episphere / connect

Connect API for DCEG's Cohort Study

Data Destruction request not setting 451953807 'Any Refusal or Withdrawal' #1117

Closed JoeArmani closed 1 week ago

JoeArmani commented 1 month ago

We came across an issue today based on a Gitter request where the Any Refusal or Withdrawal variable (CID 451953807) wasn't set on a participant where the following variables were set to yes:

• withdrewConsent (747006172) = yes
• revokeHIPAA (773707518) = yes
• dataDestroyed (831041022) = yes

• Participant variable 451953807 was null for this case.
• Revocation was signed recently.

I believe 'Any Refusal or Withdrawal' is supposed to be set to 'yes' when any type of refusal or withdrawal request is made, including data destruction requests.

Impact: This affects the getParticipants API's 'refusalsWithdrawals' option, because sites are forced to specify an exact type of refusal or withdrawal to fetch data destruction participants. When 'Any Refusal or Withdrawal' is not set, those participants do not appear in general refusal/withdrawal queries.

Note: I haven't investigated whether this stems from actions in SMDB or the PWA.

brotzmanmj commented 1 month ago

Hi @JoeArmani, you're correct. The 'Any Refusal or Withdrawal' variable (CID 451953807) is a derived variable and should derive upon action in the SMDB when NORC enters refusals, withdrawals, and data destruction requests.

JoeArmani commented 1 month ago

Hi @brotzmanmj thank you for confirming

brotzmanmj commented 1 month ago

@JoeArmani reading the gitter thread, it sounded like the issue KPCO raised was resolved by a change of syntax in their call. Is the above still an issue?

JoeArmani commented 1 month ago

@brotzmanmj I believe it is still an issue because that 'any refusal or withdrawal' variable wasn't set for this participant when 831041022 was set to 'yes'. We did use a syntax workaround to resolve the Gitter issue for that site, so I think this is a bug but possibly not a pressing issue.

API query syntax: getParticipants?type=refusalswithdrawals should return all refusals and withdrawals, including those where 831041022 is 'yes'. When 451953807 is not set, those participants are invisible to this query (I think sites are expecting all refusal/withdrawal results here).

getParticipants?type=refusalswithdrawals&option=dataDestroyed This more specific query successfully fetches only the 831041022 = 'yes' participants in the meantime.
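
The two query forms above can also be built programmatically. This is a minimal sketch only: the base URL is a placeholder (the real host is not given in this thread) and `buildRefusalsWithdrawalsUrl` is a hypothetical helper name. Only the `type` and `option` parameters come from the discussion.

```javascript
// Hypothetical helper: builds a getParticipants URL for refusal/withdrawal queries.
// The base URL passed in is a placeholder; only the query parameters
// (type=refusalswithdrawals, option=dataDestroyed) come from this thread.
function buildRefusalsWithdrawalsUrl(baseUrl, option) {
  // Strip trailing slashes so the path is joined consistently.
  const url = new URL(`${baseUrl.replace(/\/+$/, "")}/getParticipants`);
  url.searchParams.set("type", "refusalswithdrawals");
  if (option) {
    // Narrow the query to one specific refusal/withdrawal type.
    url.searchParams.set("option", option);
  }
  return url.toString();
}

const base = "https://example.invalid/api"; // placeholder, not the real endpoint
const generalQuery = buildRefusalsWithdrawalsUrl(base);
const dataDestroyedQuery = buildRefusalsWithdrawalsUrl(base, "dataDestroyed");

console.log(generalQuery);
console.log(dataDestroyedQuery);
```

The general form depends on 451953807 being set, so while the derivation bug exists it can miss data destruction participants. The `option=dataDestroyed` form does not depend on that variable.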

Additional note: I'm not sure of the impact of the 'any refusals or withdrawals' variable to the analytics team.

brotzmanmj commented 1 month ago

OK thanks, so we do need to fix whatever is happening with the derived variable not always deriving. The specific query you mentioned will only return those with data destruction requested. The sites will still miss the rest of the refusals and withdrawals if they just use that.

I will add this issue with the derived variable to the list for the October release since this impacts the sites' work in prod. We should craft a reply on Gitter and perhaps also in the weekly update to inform the sites that the query might underreport cases until we are able to get this fixed at the end of October.

As for the impact on analytics, @KELSEYDOWLING7 can you check whether your code for either the operations weekly report or the refusal/withdrawal report uses the 'Any Refusal or Withdrawal' variable (CID 451953807)? See above for details. Thanks.

KELSEYDOWLING7 commented 1 month ago

@brotzmanmj Yes, Table 6.2 on the Operations report depends on 451953807 being set to yes. The refusal/withdrawal files do not need it; instead, they pull anyone who has refused all future activities, revoked HIPAA, withdrawn consent, or requested data destruction.

mnataraj92 commented 1 month ago

There are a couple of additional variables that should be added to the 'participation status' (d_912301837) and 'any refusal or withdrawal' (d_451953807) derivations:

• Refused 2024 Connect Experience Survey (HdRef_2024ConExpSrv_v1r0; CID d_685002411_d_101763809)
• Refused All future Connect Experience surveys (HdRef_AllFutConExpSrv_v1r0; CID d_685002411_d_525277409)
• Refused QOL survey 3-mo (HdRef_3moQOLsurv_v1r0; CID d_685002411_d_936015433)
• Refused all future QOL surveys (HdRef_AllQOLsurv_v1r0; CID d_685002411_d_688142378)

brotzmanmj commented 1 month ago

@sonyekere this will be in the October release?

sonyekere commented 1 month ago

@brotzmanmj Yes!

JoeArmani commented 1 month ago

I'm adding this query so we can track it, because I think it's associated with the issue. There are currently 50 cases where:

• d_451953807 = null
• d_831041022 = 353358909

SELECT Connect_ID, d_451953807, d_831041022, d_269050420 FROM `<participants>` 
WHERE d_831041022 = 353358909
  AND (d_451953807 != 353358909 OR d_451953807 IS NULL)

LIMIT 1000

Then, there are 50+ cases where the results are as expected:

• d_451953807 = 353358909
• d_831041022 = 353358909

SELECT Connect_ID, d_451953807, d_831041022, d_269050420 FROM `<participants>` 
WHERE d_831041022 = 353358909
   AND d_451953807 = 353358909

LIMIT 1000

brotzmanmj commented 1 month ago

@mnataraj92 and @KELSEYDOWLING7 can you add a check to the weekly automated QC checks to make sure the derived variable d_451953807 (any refusal or withdrawal) is derived = yes for all conditions where it should be yes? Please also check the Participation Status variable (CID 912301837) if that's not already being done in the QC. Thanks.

KELSEYDOWLING7 commented 1 month ago

For the participant status, we have these rules:

• If HdWd_Destroydata_v1r0 = yes then participation status = destroyed data
• If HdWd_DataHasBnDestroyed_v1r0 = yes then participation status = data destroyed
• If HdWd_Deceased_v1r0 = yes then participation status = deceased

As well as the reverse:

• If participation status = destroyed data then HdWd_Destroydata_v1r0 = yes
• If participation status = data destroyed then HdWd_DataHasBnDestroyed_v1r0 = yes
• If participation status = deceased then HdWd_Deceased_v1r0 = yes

@mnataraj92 @jacobmpeters I've added 3 single crossvalid1 rules to the 'Rules to Add' section. The rules are:

• if withdrewConsent (d_747006172) = yes, Any Refusal or Withdrawal (d_451953807) = yes
• if revokeHIPAA (d_773707518) = yes, Any Refusal or Withdrawal (d_451953807) = yes
• if dataDestroyed (d_831041022) = yes, Any Refusal or Withdrawal (d_451953807) = yes
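
To illustrate what the three cross-validation rules check, here is a minimal sketch in JavaScript. It is not the actual QC report code. The flat record shape (keys such as `d_747006172`, as in the BigQuery queries earlier in this thread) and the function name `findRuleViolations` are assumptions for illustration.

```javascript
const YES = 353358909; // concept ID for "yes", as used in the queries in this thread
const ANY_REFUSAL_OR_WITHDRAWAL = "d_451953807";

// The three triggering variables named in the rules above.
const TRIGGERS = {
  withdrewConsent: "d_747006172",
  revokeHIPAA: "d_773707518",
  dataDestroyed: "d_831041022",
};

// Returns the names of the triggers that are "yes" while
// Any Refusal or Withdrawal is not "yes" (one entry per violated rule).
function findRuleViolations(record) {
  if (record[ANY_REFUSAL_OR_WITHDRAWAL] === YES) return [];
  return Object.entries(TRIGGERS)
    .filter(([, field]) => record[field] === YES)
    .map(([name]) => name);
}

// Hypothetical record resembling the reported case: data destroyed, derived flag null.
const sample = { Connect_ID: 1, d_831041022: YES, d_451953807: null };
console.log(findRuleViolations(sample));
```

A record with no violations yields an empty array, so the same function can be mapped over a query result to list only the failing participants.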

jhflorey commented 1 month ago

@JoeArmani @mnataraj92 my changes are ready in dev. We can plan to test it.

mnataraj92 commented 1 month ago

Hi @jhflorey were you able to incorporate these variables as well from my earlier comment:

There are a couple of additional variables that should be added to the 'participation status' (d_912301837) and 'any refusal or withdrawal' (d_451953807) derivations:

• Refused 2024 Connect Experience Survey (HdRef_2024ConExpSrv_v1r0; CID d_685002411_d_101763809)
• Refused All future Connect Experience surveys (HdRef_AllFutConExpSrv_v1r0; CID d_685002411_d_525277409)
• Refused QOL survey 3-mo (HdRef_3moQOLsurv_v1r0; CID d_685002411_d_936015433)
• Refused all future QOL surveys (HdRef_AllQOLsurv_v1r0; CID d_685002411_d_688142378)

If any of the above variables are yes, then d_451953807 (any refusal or withdrawal) = yes (353358909). Also, if any of the above variables are yes, then d_912301837 (participation status) = refused some activities (622008261).

Have these been incorporated as well?

I'm going to work on a test plan that encompasses various scenarios/combinations to ensure that these derived variables are working as expected, and Michelle is going to take a look once I have that. When that is ready I'll let you know and we can test in dev.

jhflorey commented 3 weeks ago

@mnataraj92 Is 622008261 the new concept ID? Is the logic: if data['726389747'] === 353358909 or data['747006172'] === 353358909 or data['773707518'] === 353358909 or data['831041022'] === 353358909 or data['906417725'] === 353358909 or data['987563196'] === 353358909, then d_912301837 = 622008261?

brotzmanmj commented 3 weeks ago

@jhflorey you identified why the data destruction was not being captured in the derived variable d_451953807 (any refusal or withdrawal) and that is in dev? Can you tell us what was broken and how it was fixed so we will better understand how to test it?

mnataraj92 commented 3 weeks ago

Hi Jessica, this isn't a new concept ID and this response has worked before so I'm not sure why the logic is incomplete.

Participant status (d_912301837) should = "refused some activities" if any of the refusals (or a combination) = yes and all of the withdrawal variables (withdraw consent, refusing all future activities, revoking HIPAA, destroy data requested, data destroyed, and deceased) = no.

Do you have the complete code to set 912301837= 622008261?
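
The rule described above (status is "refused some activities" when at least one refusal is yes and no withdrawal-type variable is yes) can be sketched as follows. This is not the SMDB code being asked about. The field lists are deliberately incomplete and contain only concept IDs that appear in this thread, the flat `d_…` key style mirrors the BigQuery column names, and the function names are hypothetical.

```javascript
const YES = 353358909; // "yes"
const REFUSED_SOME_ACTIVITIES = 622008261; // participation status value named in this thread
const PARTICIPATION_STATUS = "d_912301837";

// Illustrative, incomplete lists: only concept IDs mentioned in this thread.
const REFUSAL_FIELDS = [
  "d_685002411_d_101763809", // Refused 2024 Connect Experience Survey
  "d_685002411_d_525277409", // Refused all future Connect Experience surveys
  "d_685002411_d_936015433", // Refused QOL survey 3-mo
  "d_685002411_d_688142378", // Refused all future QOL surveys
];
const WITHDRAWAL_FIELDS = [
  "d_747006172", // withdrewConsent
  "d_773707518", // revokeHIPAA
  "d_831041022", // dataDestroyed
];

// True when at least one refusal is "yes" and no withdrawal-type variable is "yes".
function shouldBeRefusedSomeActivities(record) {
  const isYes = (field) => record[field] === YES;
  return REFUSAL_FIELDS.some(isYes) && !WITHDRAWAL_FIELDS.some(isYes);
}

// Returns a copy with the status set when the rule applies; otherwise the record unchanged.
function applyParticipationStatus(record) {
  return shouldBeRefusedSomeActivities(record)
    ? { ...record, [PARTICIPATION_STATUS]: REFUSED_SOME_ACTIVITIES }
    : record;
}

const refusedOnly = applyParticipationStatus({ d_685002411_d_936015433: YES });
const alsoWithdrew = applyParticipationStatus({
  d_685002411_d_936015433: YES,
  d_747006172: YES,
});
console.log(refusedOnly[PARTICIPATION_STATUS], alsoWithdrew[PARTICIPATION_STATUS]);
```

The second record is left without this status because a withdrawal-type variable is yes; which status it should receive instead is outside the scope of this sketch.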

jhflorey commented 3 weeks ago

I found the code to set 912301837 = 622008261 from SMDB, so it will work as before. We can plan to test it now.

jhflorey commented 3 weeks ago

@brotzmanmj It was because 451953807 was not in the list of variables kept when running data destruction. I have added it to the retained list. The code change is here: https://github.com/episphere/connectFaas/pull/665/files#diff-0d8ec4ec095b077565be32aa751d82e00f80837c8c53aeb2d26e37ffe2504481R421
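
For readers unfamiliar with the mechanism, here is a minimal sketch of how a retained-field list can drop a variable during data destruction. It is not the connectFaas implementation (see the linked PR for that). The function name `toStubRecord`, the contents of both lists, and the sample participant are hypothetical.

```javascript
const YES = 353358909; // "yes"

// Hypothetical retained lists; the real list lives in the connectFaas code linked above.
const RETAINED_BEFORE_FIX = ["Connect_ID", "831041022"];
const RETAINED_AFTER_FIX = [...RETAINED_BEFORE_FIX, "451953807"];

// Builds a stub record by keeping only the retained fields and dropping everything else.
function toStubRecord(participant, retainedFields) {
  return Object.fromEntries(
    Object.entries(participant).filter(([key]) => retainedFields.includes(key))
  );
}

// Hypothetical participant with a data destruction request on file.
const participant = {
  Connect_ID: 1,
  "831041022": YES, // dataDestroyed
  "451953807": YES, // Any Refusal or Withdrawal
  surveyAnswer: "to be destroyed",
};

const stubBefore = toStubRecord(participant, RETAINED_BEFORE_FIX);
const stubAfter = toStubRecord(participant, RETAINED_AFTER_FIX);
console.log("451953807" in stubBefore, "451953807" in stubAfter);
```

With the shorter list the derived flag is removed along with the destroyed data, which matches the null values seen in the queries earlier in this thread.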

brotzmanmj commented 3 weeks ago

Thanks @jhflorey for the explanation.

@mnataraj92 please check the Data Destruction SOP and add 451953807 (Any refusal or withdrawal) to the stub record list.

rohanjay10 commented 3 weeks ago

This is ready for stage testing

jhflorey commented 1 week ago

Here is the script for restoring the ‘any refusal or withdrawal’ variable to ‘Yes’ for those individuals whose values were mistakenly deleted during the data destruction process:

https://gist.github.com/jhflorey/18ffd694edfbf4ce3992a5fba8699406

jhflorey commented 1 week ago

The result of running the script in dev

[Image: result of running the script in dev]

jhflorey commented 1 week ago

The result of running the script in stage

[Image: result of running the script in stage]

mnataraj92 commented 1 week ago

Tested in stage, ready for prod!

Summarizing here that the changes made were:

If any of the following variables are yes then d_451953807 (any refusal or withdrawal) = yes (353358909) and d_912301837 (participation status) = refused some activities (622008261)

• Refused 2024 Connect Experience Survey (HdRef_2024ConExpSrv_v1r0; CID d_685002411_d_101763809)
• Refused All future Connect Experience surveys (HdRef_AllFutConExpSrv_v1r0; CID d_685002411_d_525277409)
• Refused QOL survey 3-mo (HdRef_3moQOLsurv_v1r0; CID d_685002411_d_936015433)
• Refused all future QOL surveys (HdRef_AllQOLsurv_v1r0; CID d_685002411_d_688142378)

d_451953807 is now being retained as a stub record variable, and = yes when any refusal or withdrawal variable = yes (including revoked HIPAA, withdrew consent, data destruction requested, data destroyed).
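
As a rough illustration of the derivation summarized here, the sketch below sets 451953807 to yes when any listed trigger is yes. It is not the deployed code. It assumes that a CID written as d_685002411_d_101763809 corresponds to a nested object (`data['685002411']['101763809']`), lists only concept IDs that appear in this thread (the full set of refusal and withdrawal variables is longer), and uses a hypothetical function name.

```javascript
const YES = 353358909; // "yes"
const ANY_REFUSAL_OR_WITHDRAWAL = "451953807";

// Top-level triggers named in this thread (not the complete production list).
const TOP_LEVEL_TRIGGERS = ["747006172", "773707518", "831041022"];

// Survey refusals, assumed to be nested under the parent concept 685002411.
const REFUSAL_PARENT = "685002411";
const SURVEY_REFUSALS = ["101763809", "525277409", "936015433", "688142378"];

// Returns a copy with the derived flag set to "yes" when any trigger is "yes";
// otherwise returns the input unchanged.
function deriveAnyRefusalOrWithdrawal(data) {
  const nested = data[REFUSAL_PARENT] ?? {};
  const anyYes =
    TOP_LEVEL_TRIGGERS.some((cid) => data[cid] === YES) ||
    SURVEY_REFUSALS.some((cid) => nested[cid] === YES);
  return anyYes ? { ...data, [ANY_REFUSAL_OR_WITHDRAWAL]: YES } : data;
}

const destroyed = deriveAnyRefusalOrWithdrawal({ "831041022": YES });
const refusedQol = deriveAnyRefusalOrWithdrawal({ "685002411": { "936015433": YES } });
const untouched = deriveAnyRefusalOrWithdrawal({ Connect_ID: 1 });
console.log(destroyed[ANY_REFUSAL_OR_WITHDRAWAL], refusedQol[ANY_REFUSAL_OR_WITHDRAWAL]);
```

The function never clears an existing value, so running it again over already-derived records leaves them unchanged.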

jhflorey commented 1 week ago

The result of running the script in prod

[Image: result of running the script in prod]

jacobmpeters commented 3 days ago

The QC Rules for this issue are implemented in our automated Recruitment QC Report. Madhuri and I just did a manual check as well and noted that these issues appear to be fixed in BQ.