datacarpentry / cloud-genomics

Introduction to Cloud Computing for Genomics
https://datacarpentry.org/cloud-genomics/

Data security issue when using cloud computing #73

Open jingcruk opened 5 years ago

jingcruk commented 5 years ago

Dear all,

I think it is important to mention a data protection issue in the "disadvantage" section here:

https://datacarpentry.org/cloud-genomics/01-why-cloud-computing/index.html

People need to be aware that some genomics data, e.g. individual patient data and large consortium databases including TCGA and ICGC, are protected to preserve the privacy of the individuals involved. Anyone who wants to submit these types of data for analysis through cloud computing needs to check that they have permission to do so, for data security reasons.

Thanks,

Jing

ACharbonneau commented 5 years ago

We should have some discussion of data restrictions, but I'm not sure it belongs in the disadvantage list. Amazon, for instance, has done a lot of work to get some of their services approved for use with restricted human data, and with the right settings those services can be used for dbGaP data and similar. There's a lot of nuance in which platforms can be used, for what kind of data, and with which restrictions.

I think we would want a longer discussion of it here: https://datacarpentry.org/cloud-genomics/05-which-cloud/index.html, with the caveat that there's no way we can cover every case.

nicjar commented 5 years ago

Here's a suggestion for an addition about human data and security: https://github.com/datacarpentry/cloud-genomics/pull/84