broadinstitute / seqr

web-based analysis tool for rare disease genomics
GNU Affero General Public License v3.0

Feature request: be able to upload files (e.g. pdf) to specific projects/samples #1760

Open jxchong opened 3 years ago

jxchong commented 3 years ago

We'd like to be able to upload attachments/files (at least PDFs, and maybe images too, although worst case those can be converted to PDF) and link them at the Project and/or Individual level. The alternative would be operating a second database in parallel to seqr to store this information.

Some example use cases of each:

Project:

Individual:

hanars commented 3 years ago

One important thing to note is that seqr is not legally allowed to store any PHI/PII. I would worry about adding MRI images, as medical images often have embedded metadata with PHI in them, even if the image itself doesn't look like it contains anything identifying. Similarly, I'm not sure we can support pictures of phenotypic features, as faces could be considered identifiable. @anneodonnell I feel like we have had the conversation on photos before - is that what we decided in the past? Or did we figure out a way to support pictures?
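To illustrate the embedded-metadata concern, here is a minimal sketch that checks whether a JPEG carries an EXIF segment. It is only an illustration: the function name `jpeg_has_exif` is hypothetical, it is not part of seqr, and it covers JPEG/EXIF only (medical formats such as DICOM embed metadata differently and are not handled here).

```python
import struct


def jpeg_has_exif(data: bytes) -> bool:
    """Return True if the JPEG bytes contain an APP1 segment holding EXIF data.

    Hypothetical helper for illustration only; not part of seqr.
    """
    # Every JPEG starts with the SOI marker 0xFFD8.
    if not data.startswith(b"\xff\xd8"):
        return False
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return False  # not at a marker boundary: treat as malformed
        marker = data[pos + 1]
        if marker == 0xFF:
            pos += 1  # skip fill byte
            continue
        if marker in (0xDA, 0xD9):
            return False  # start of scan / end of image: no EXIF found
        # Segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if length < 2:
            return False
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return True
        pos += 2 + length
    return False
```

A check like this can only flag that metadata is present; it cannot tell whether that metadata actually contains PHI.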

jxchong commented 3 years ago

FWIW we handled this when using our prior system by having our staff screen the uploads and remove those that shouldn't be uploaded.

jxchong commented 3 years ago

There are non-face photos of phenotypic features (e.g. hands and feet, back, x-rays) that are valuable too

hanars commented 3 years ago

> FWIW we handled this when using our prior system by having our staff screen the uploads and remove those that shouldn't be uploaded.

seqr doesn't have the bandwidth to do that, so we would instead need to rely on users not to upload inappropriate information in the first place. To be clear, I am willing to add this functionality with some warnings telling users not to upload PHI. I just wanted to let you know that there will probably be language on the form explicitly banning medical images.

hanars commented 3 years ago

I am closing out an old issue we had to track this request so we can track this work in just one place: https://github.com/broadinstitute/seqr/issues/195

Description from the original request:

I can see how this could be misused, but I think we could address that with a warning at the time of attachment saying that no names, identifiers, or PHI should be on any uploads. The types of upload I was thinking about are important PDFs for relevant papers, and at the moment it would be handy to be able to add screenshots from the TGG viewer for CNV and RNAseq data - though as these hopefully eventually get built into seqr, there would be less need for them. We'd probably want to put some file size limit in place so people don't store large files in seqr. I also know you have a lot of development work going on, so this isn't essential.
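As a rough sketch of the checks discussed in this thread (a restricted set of file types, a size limit, and a PHI warning shown at upload time), something like the following could work. Everything here is an assumption for illustration: the names `ALLOWED_EXTENSIONS`, `MAX_UPLOAD_BYTES`, `PHI_WARNING`, and `validate_upload` are hypothetical, the 10 MiB limit is arbitrary, and none of it reflects seqr's actual code.

```python
from pathlib import Path
from typing import List

# Assumptions for illustration; the thread does not specify these values.
ALLOWED_EXTENSIONS = {".pdf", ".png"}
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # arbitrary 10 MiB cap
PHI_WARNING = "Do not upload files containing names, identifiers, or PHI."


def validate_upload(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems with an upload; an empty list means it passes."""
    errors = []
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        errors.append(f"File type '{extension or 'unknown'}' is not allowed")
    if size_bytes > MAX_UPLOAD_BYTES:
        errors.append(f"File exceeds the {MAX_UPLOAD_BYTES} byte limit")
    return errors
```

For example, `validate_upload("paper.pdf", 1024)` returns an empty list, while a file with a disallowed extension or an oversized file returns one message per failed check. An extension check says nothing about file contents, so the warning text would still need to be shown to the user.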