robertsamm closed this issue 1 year ago
Let's extend this issue to cover the SMDB components of what we agreed to resolve for Data Destruction.
Required functionality for the SMDB
- Search for participants (allow sites to search their own participants)
- Retain Query.##Name arrays for SMDB Search
- Do Not retain Query.emails/phone arrays
For these items
query.firstName
query.lastName
Connect_ID
token
state.studyId
699625233 - Autogenerated flag when User Profile submitted
query.allEmails
query.allPhoneNo
371067537 - dob
This requires addition of “Health care provider”
For this item we have to keep 827220437 - Health care provider
- Participant Summary Form
For this item we have to keep 130371375 - Payment Round
- All other variables/derivations will become “null” or “blank”, etc.
- Rows will still appear on the page but no longer populated
- If possible: set non-stub record rows to "Data deleted" when the new "destroyed?" CID (to be added) is "yes"
My understanding for these items is that we will not remove the non-stub-record fields. Instead, we will set them all to null or blank.
The PWA / SMDB, on the other hand, will still show the non-stub-record rows on the page, but they will no longer be populated.
In Firestore, add a field such as Data deleted for participants whose data has been destroyed, with the value 353358909 - yes.
- Print forms for participants (Consent, HIPAA, HIPAA revocation, and Data Destruction)
It seems to belong to the PWA page.
- Should only see data retained in stub record data
- This should use the “health care provider” for the version of consent/HIPAA
I do not fully understand these items. Could you please explain them in more detail? Thanks.
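The retention rules discussed above could be sketched roughly as follows. This is only an illustration, not the actual FaaS code: the `dataDestroyed` key is a placeholder because the "destroyed?" CID has not been assigned yet, the helper names are made up, and the retained list follows the bullet that says not to retain the email/phone arrays, so it should be checked against the agreed Excel list.

```python
from typing import Any, Dict, List

YES = 353358909  # "yes" concept ID, as quoted in this thread
# Hypothetical key: the real "destroyed?" CID is still to be added.
DATA_DESTROYED_KEY = "dataDestroyed"

# Dotted paths kept in the stub record (assumed from this thread,
# not the final Excel list of retained variables).
RETAINED_PATHS: List[str] = [
    "query.firstName",
    "query.lastName",
    "Connect_ID",
    "token",
    "state.studyId",
    "699625233",  # autogenerated flag when User Profile submitted
    "371067537",  # dob
    "827220437",  # health care provider
    "130371375",  # payment round
]


def get_path(doc: Dict[str, Any], path: str) -> Any:
    """Return the value at a dotted path, or None if a segment is missing."""
    node: Any = doc
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def set_path(doc: Dict[str, Any], path: str, value: Any) -> None:
    """Set the value at a dotted path, creating nested dicts as needed."""
    keys = path.split(".")
    node = doc
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


def build_stub_record(participant: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the retained paths and mark the record as destroyed."""
    stub: Dict[str, Any] = {}
    for path in RETAINED_PATHS:
        value = get_path(participant, path)
        if value is not None:
            set_path(stub, path, value)
    stub[DATA_DESTROYED_KEY] = YES
    return stub
```

Everything not named in `RETAINED_PATHS` is simply absent from the stub, which matches the idea that the rows still appear on the page but are no longer populated.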
This requires addition of “Health care provider”
For this item we have to keep 827220437 - Health care provider
Yes, please add this to the Excel list of variables to be retained in the stub record
- Participant Summary Form
For this item we have to keep 130371375 - Payment Round
Yes, please add this to the stub record variable list too
- Print forms for participants (Consent, HIPAA, HIPAA revocation, and Data Destruction)
It seems to belong to the PWA page.
This functionality is on both the PWA and the SMDB
For the other items, it would probably be clearer if we looked at the Participant Summary page together on a brief call to walk through what is needed.
This is the list of stub records.
- Print forms for participants (Consent, HIPAA, HIPAA revocation, and Data Destruction)
It seems to belong to the PWA page.
This functionality is on both the PWA and the SMDB
Hi @brotzmanmj, is this what you mean in the SMDB? (We will disable Participant Withdrawal for participants whose data has been destroyed.)
Hi Jessica, yes, that's correct.
@brotzmanmj I have already added Health care provider and Payment Round. I also uploaded the file to https://nih.app.box.com/file/1255095111396
Confirmed, thanks! I believe we also decided to retain the user profile name changes in the history so can you add those variables too?
Yes, it looks like query.firstName and query.lastName should be included in the list mentioned above. This will enable proper searching in the SMDB.
@brotzmanmj @Davinkjohnson Already updated.
Adding a note about changes to the query.firstName and query.lastName data structures just to make sure there's no conflict moving forward.
Possibly related issue: https://github.com/episphere/connect/issues/654
Currently: They are written as strings - PWA signup writes string, PWA edit writes string, SMDB edit writes string.
New update (PR coming Monday 7/10): These will be sometimes strings and sometimes arrays - PWA signup writes as string, PWA edit writes as array, SMDB edit writes as array.
Future: These will be arrays. PWA will be adapted to write new signup fields query.firstName and query.lastName as arrays. Existing participant data will also be updated to arrays.
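While both shapes coexist, any code that reads these fields (for example the SMDB search) probably needs to accept either a string or an array. A minimal sketch of that idea is below. The function names are illustrative only, and the case-insensitive exact match is an assumption, not how the SMDB search is known to work.

```python
from typing import List, Optional, Sequence, Union

NameField = Optional[Union[str, Sequence[str]]]


def as_name_list(value: NameField) -> List[str]:
    """Normalize a name field to a list, whether stored as string or array."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value] if value else []
    # Already an array: drop empty entries.
    return [name for name in value if name]


def name_matches(stored: NameField, search_term: str) -> bool:
    """True if the search term equals any stored name (assumed case-insensitive)."""
    term = search_term.strip().lower()
    return any(term == name.strip().lower() for name in as_name_list(stored))
```

With this, a record written by PWA signup (`"smith"`) and one written by an SMDB edit (`["smith", "jones"]`) can be searched the same way.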
@JoeArmani thanks for the information.
Based on the conversation from last week, here are the details we talked through and agreed to. (Note for Jessica: please hold back on the work in this issue, the SMDB components, until after all July release tasks have been completed.)
(For all data where fields/rows are requested to be "N/A" or "Data Destroyed", Firestore should contain no data. Handle the missing data in the SMDB by displaying the requested text when the new "Data Destroyed" CID = 353358909 - yes.)
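The display rule in the note above might look something like this on the SMDB side. It is a sketch only: `dataDestroyed` is a placeholder key because the new CID is not yet assigned, and `display_value` is a hypothetical helper, not an existing SMDB function.

```python
from typing import Any, Dict

YES = 353358909  # "yes" concept ID, as quoted in this thread
# Hypothetical key standing in for the new "Data Destroyed" CID.
DATA_DESTROYED_KEY = "dataDestroyed"


def display_value(
    participant: Dict[str, Any],
    field: str,
    placeholder: str = "Data Destroyed",
) -> str:
    """Return the text to show for a field on the summary page."""
    value = participant.get(field)
    if value is not None and value != "":
        return str(value)
    # Nothing stored: show the requested text only for destroyed participants.
    if participant.get(DATA_DESTROYED_KEY) == YES:
        return placeholder
    return ""
```

Passing `placeholder="N/A"` covers the rows where "N/A" was requested instead of "Data Destroyed".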
Participant Summary Page
Header Items
Details Section
Participant Details Page
It looks like the data is still in BQ after the destruction
@KELSEYDOWLING7 Can you confirm the following?
@Davinkjohnson Yes, the tables ran this morning and I also reran them before searching for this participant. Just reran them again to triple check.
Examples of data still in BQ for this participant:
I also found in the SOP the following requirement. @jhflorey this will also have to get added into the FaaS
Post Data Destruction: The Connect API to refuse future receipt of data on this participant from all data sources (surveys, EHR, etc.).
(Presumably this would be in both submitParticipantsData and updateParticipantData, where it would check for any data update NOT from the SMDB, if the participant has the data destroyed flag = yes then return some error code.)
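A rough sketch of the kind of check described above is shown here. The `dataDestroyed` key, the `"smdb"` source label, and the 403 status are all assumptions for illustration; the SOP only says the API should refuse the data, and the comment above only says "some error code".

```python
from typing import Any, Dict, Tuple

YES = 353358909  # "yes" concept ID, as quoted in this thread
# Hypothetical key standing in for the data destroyed flag.
DATA_DESTROYED_KEY = "dataDestroyed"
# Hypothetical label for requests that come from the SMDB.
SMDB_SOURCE = "smdb"


def check_update_allowed(
    participant: Dict[str, Any], source: str
) -> Tuple[int, str]:
    """Return (status code, message) for an incoming data update."""
    destroyed = participant.get(DATA_DESTROYED_KEY) == YES
    if destroyed and source != SMDB_SOURCE:
        # 403 is an assumed choice of error code.
        return 403, "Data destroyed: updates are not accepted for this participant."
    return 200, "OK"
```

Both submitParticipantsData and updateParticipantData could call a check like this before writing anything.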
@Davinkjohnson @KELSEYDOWLING7 This is already fixed in PR https://github.com/episphere/connectFaas/pull/395. We can try again in dev.
Thanks @jhflorey . Kaitlyn is going to submit data destruction for a couple of records this afternoon. We'll let you know which Connect IDs and what data they have beforehand.
The two Connect IDs we have chosen to submit data destruction for are: 3994600604 (has modules 1, 3, 4 submitted and module 2 started but not submitted, plus menstrual survey, covid survey, SSN, research blood, urine, and mouthwash submitted) and 3231286166 (has modules 1-4, covid survey, SSN, clinical blood and urine submitted).
Good morning. After the data destruction at 1am, all the raw table queries ran at 4:30am and again at 10:30am, and all flattened table queries ran at 9:30am and again manually at 11am. I am still seeing data for all of these modules/surveys/specimens/shipments for both participants.
The only one I am unsure of is the menstrual cycle survey for 3994600604 because there are 2 records, but they both have today's date listed under the date variable. One has data and one does not.
@KELSEYDOWLING7 Do we have a log from the run after 1am?
@jhflorey Sorry, I'm not sure what you mean by a log
It looks like the image below after running the table queries
Hi @KELSEYDOWLING7 could you show me how you ran all the raw table queries?
Is that log on Firestore or BQ? It's not something I'm familiar with.
I just run the queries in BQ from the scheduled queries tab (the magnifying glass on the left-hand side of BQ, then select scheduled queries). The raw tables start with participants and the flattened tables start with FlatConnect.
@jhflorey have you confirmed that all sources of data including the survey module data were entirely deleted from Firestore? Kelsey is looking at BQ, not at Firestore.
@brotzmanmj @KELSEYDOWLING7 I just discussed this with Davin. My bad, I will update the code now.
@brotzmanmj Jessica has to update the code to remove the data from all the other collections/tables (there was a misunderstanding on the requirement here). However, when we were reviewing the tables to remove data from, she asked about the boxes table. Did we decide whether that data should be destroyed? (There's no data in it that would tie back to a specific participant once the other data are destroyed, but it is technically their data.)
@brotzmanmj After reviewing, we listed these collections: bioSurvey_v1, clinicalBioSurvey_v1, covid19Survey_v1, menstrualSurvey_v1, module1_v1, module1_v2, module2_v1, module2_v2, module3_v1, module4_v1, ssn, biospecimen.
We will delete every document in these collections whose Connect_ID equals the Connect_ID of the participant whose data has been destroyed.
Please correct me if I am wrong or missing any collection.
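The deletion rule described above can be sketched like this. To keep the example runnable without a Firestore project, plain Python lists stand in for the collections; the real change would use Firestore queries and deletes instead. `purge_participant` is a made-up name, and each document is assumed to carry a top-level `Connect_ID` field.

```python
from typing import Any, Dict, List

# Collections named in this thread.
COLLECTIONS_TO_PURGE: List[str] = [
    "bioSurvey_v1", "clinicalBioSurvey_v1", "covid19Survey_v1",
    "menstrualSurvey_v1", "module1_v1", "module1_v2", "module2_v1",
    "module2_v2", "module3_v1", "module4_v1", "ssn", "biospecimen",
]

# In-memory stand-in for the database: collection name -> list of documents.
Database = Dict[str, List[Dict[str, Any]]]


def purge_participant(db: Database, connect_id: int) -> Dict[str, int]:
    """Delete the participant's documents; return deleted counts per collection."""
    deleted: Dict[str, int] = {}
    for name in COLLECTIONS_TO_PURGE:
        docs = db.get(name, [])
        kept = [doc for doc in docs if doc.get("Connect_ID") != connect_id]
        deleted[name] = len(docs) - len(kept)
        if name in db:
            db[name] = kept
    return deleted
```

Only the collections in the list are touched, so anything not listed is left as it is. The returned counts give a simple record of what was removed for each destroyed participant.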
@Davinkjohnson I was wondering about the Boxes data that included their samples as that gets a little complicated. If we don't delete the data from the Boxes table, will that mess anything up as the tubes are associated with the box they were shipped in? If we do delete their data from the Boxes, what happens if those were the only tubes in the Box vs a Box with other people's tubes in it? That's why we tested one of each type, to see what would happen. We might need guidance on this from Nicole.
@jhflorey I think that encompasses all the tables of data that exist and the different survey versions but @KELSEYDOWLING7 can you also confirm?
@brotzmanmj Yes, those are all the tables we have for now for surveys and specimens
@brotzmanmj @Davinkjohnson @KELSEYDOWLING7 I'll update the code now to delete bioSurvey_v1, clinicalBioSurvey_v1, covid19Survey_v1, menstrualSurvey_v1, module1_v1, module1_v2, module2_v1, module2_v2, module3_v1, module4_v1, ssn, biospecimen. I will work on the boxes table once we have a final decision.
@jhflorey The decision has been made to KEEP the boxes table data since it will not directly link back to the participant and the data would be messy to attempt to delete. Please proceed with deleting all the other above called out data for each data destroyed participant.
@KELSEYDOWLING7 I just tested my code changes locally. Could you help check the participant with Connect_ID = 3994600604?
@jhflorey Yes, thank you. It looks like the raw tables haven't refreshed yet, and I don't believe that's something our team can manually push. So I will check first thing Monday morning
@jhflorey @Davinkjohnson The data is deleted! None in mods 1-4, BUM, Covid, or Menstrual Surveys. The SSN survey flag is null, as are the flags for a partial or full social.
@KELSEYDOWLING7 Does it mean it meets our expectations?
@jhflorey Yes, the data has been destroyed
@jhflorey Will you be testing the code change with the second participant as well (Connect_ID = 3231286166)?
@KELSEYDOWLING7 Not yet, I have only tried it on 3994600604. Would you like me to do the same thing for 3231286166?
Reviewed the stub record in stage for Connect ID 1475895409 and found a few remaining issues. After the stub record was pushed, the same thing happened in the SMDB: I can't view the Participant Summary page for this participant anymore. I'm not sure if that is intentional, but I don't see it in the SOP. Looking at this record in the SMDB and the CIDs in the array Jessica sent me, here are the few remaining issues I found: