FredHutch / wiki

SciWiki: Collective KnowledgeBase for Scientific Data and Use
https://sciwiki.fredhutch.org

What is the best practice for long term storage of data containing PHI #479

Open Chilliwack opened 4 years ago

Chilliwack commented 4 years ago

Proposed Domain Specify where the new content should go (Data Generation or Scientific Computing), including a recommended subsection.

Data Generation - https://sciwiki.fredhutch.org/generation/gen_index/

This came out of a discussion Lauren W was having with the Reid Lab. Where to store data: What is the best practice for long term storage of data containing PHI?

Content Summary Summarize the topic you'd like to be added, or include a URL with content you would like to be expanded.

Again, continuing the discussion with the Reid Lab: major members of the lab are retiring, so they are focusing on archiving their datasets. Where to store data: What is the best practice for long term storage of data containing PHI?

Local Content Expert(s) Suggest any Fred Hutch based experts who we might ask to contribute (GitHub ID is preferred, but name of someone and/or desired expertise is ok too).

Someone with experience and institutional knowledge of the best/supported practices at FH for long term cold storage of data containing PHI. Can these best/supported practices be detailed in the sciwiki please for lab members to reference?

esilgard commented 4 years ago

I'll add that we may want to include some guidelines on what "long term" means and on the expected need to access that data in the future.

zyd14 commented 4 years ago

I think most PHI has to be stored for at least 6 years. This would definitely be of interest to my group. From my research it seems like cloud storage is probably the most secure, particularly with regard to redundancy, physical security and ease of encryption. Of course there are potentially large costs associated with that (particularly for larger datasets), although these can be mitigated to a degree by choosing managed reduced-availability storage options.

I'd be happy to contribute what I've learned so far but it'd be great to get someone from Hutch Data Governance or something along those lines to weigh in.

sgglick commented 4 years ago

Hi, there. Susan Glick from Data Governance.

Research data retention is guided by the federal funding agency, FH policy, or the private funding source. If there is a conflict, select the longer retention period. The retention period begins after the research has been closed. Thus, if someone wants to archive information while the research is still active, the retention period countdown does not start until the research is closed.
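To make the rule above concrete, here is a minimal sketch in Python. The function name `retention_end` and the year values in the usage note are hypothetical and for illustration only; the actual required periods come from the funding agency, FH policy, or the private funder, not from this code.

```python
from datetime import date
from typing import Optional, Sequence


def retention_end(closed_on: Optional[date], required_years: Sequence[int]) -> Optional[date]:
    """Return the date the retention period ends, or None if it has not started.

    closed_on: the date the research was closed (None while it is still active).
    required_years: retention periods, in years, from each applicable source
        (funding agency, institutional policy, private funder). Hypothetical inputs.
    """
    if closed_on is None:
        # Research is still active, so the countdown has not started,
        # even if the data has already been archived.
        return None
    # If the sources conflict, the longest retention period wins.
    years = max(required_years)
    try:
        return closed_on.replace(year=closed_on.year + years)
    except ValueError:
        # closed_on is Feb 29 and the target year is not a leap year.
        return closed_on.replace(year=closed_on.year + years, day=28)
```

For example, with illustrative periods of 3 and 6 years and a closure date of 2020-05-02, `retention_end(date(2020, 5, 2), [3, 6])` uses the 6-year period, and `retention_end(None, [3, 6])` returns `None` because the research has not been closed.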

The retention of PHI for 6 years is for the purpose of providing an accounting of PHI uses/disclosures. A patient has the right to request an accounting of all the times their PHI has been used or disclosed ("released") for 6 years prior to their request. A "release" means, for example, a researcher has used the patient's data on a protocol with a Waiver of Consent. A "waiver of consent" means that the IRB has approved the use of the PHI in place of a patient signing a consent/authorization. For a researcher who uses a patient's PHI under the direct consent or authorization of that patient, no PHI "release" needs to be accounted for because the patient authorized the release.

Practically/operationally speaking, for research using PHI under a waiver of consent, not only does the PHI need to be retained for six years, but it also needs to be searchable over a rolling 6 year retention period in case the patient submits a request for an accounting of their PHI usage. If the PHI is used for research with the direct consent of the patient, the 6 year rolling retention does not apply. Glad to answer questions when I can.
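As a rough illustration of what "searchable on a rolling 6 year period" could mean in practice, the sketch below filters a log of PHI uses for one patient. The `PhiUse` record, its field names, and the `"waiver"` / `"authorization"` labels are assumptions made up for this example, not an FH system or schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List


@dataclass(frozen=True)
class PhiUse:
    patient_id: str
    used_on: date
    protocol: str
    basis: str  # "waiver" or "authorization" (hypothetical labels)


def years_before(day: date, years: int) -> date:
    """Return the same calendar day `years` earlier."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        # day is Feb 29 and the target year is not a leap year.
        return day.replace(year=day.year - years, day=28)


def accounting_for(patient_id: str, uses: Iterable[PhiUse],
                   requested_on: date, window_years: int = 6) -> List[PhiUse]:
    """List the uses to report for one patient's accounting request.

    Only uses under a waiver of consent count; uses the patient directly
    authorized are skipped. The window rolls with the request date.
    """
    window_start = years_before(requested_on, window_years)
    matches = (
        u for u in uses
        if u.patient_id == patient_id
        and u.basis == "waiver"
        and window_start <= u.used_on <= requested_on
    )
    return sorted(matches, key=lambda u: u.used_on)
```

Because the window is computed from the request date, a use drops out of the accounting once it is more than six years old, which is why the archive has to stay searchable rather than just stored.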

zyd14 commented 4 years ago

That's super helpful, thanks!

I've been reading some seemingly conflicting information regarding how protected (if at all) deidentified samples and genomic data are. From my first reading through the Common Rule and some summaries / recommendations about it, my takeaway was that genomic data was considered inherently identifiable and fell under PHI / human subjects research. But then I found Attachment C (https://www.hhs.gov/ohrp/sachrp-committee/recommendations/attachment-c-faqs-recommendations-and-glossary-informed-consent-and-research-use-of-biospecimens-and-associated-data/index.html), which appears to say that deidentified samples do not constitute human subjects research; of course there are strict definitions of what is 'deidentified'.

Would you be able to provide any clarification on this?

sgglick commented 4 years ago

Would this be a better discussion on the phone next week? I have seen our FH IRB confirm genomic data combined with de-identified patient records as non-human subjects research. On an HDC CenterNet page (which still exists) there is a page called "data and compliance and security". There is some relevant information there. Human subjects protection and PHI are close but not the same thing.




laderast commented 2 months ago

Needs some rethinking because of post-merger issues - restructure