genomic-medicine-sweden / gms-artic

A Nextflow pipeline with a GMS touch for running the ARTIC network's fieldbioinformatics tools (https://github.com/artic-network/fieldbioinformatics).
GNU Affero General Public License v3.0

Rename files before upload to AWS nodes #37

Open pbiology opened 2 years ago

pbiology commented 2 years ago

What needs to be done: To avoid sending sensitive information to the cloud, we could add a pre-step that runs with a local executor. If we rename the files locally, keep a dictionary in memory, and rename the files back once they return from the cloud nodes, we avoid sending labIDs to the cloud.

Suggestions on how to get it done: Before the rest of the workflow is executed, run a file-renaming step with a local executor. We may need to be careful not to use up all the resources on the login node.
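
A rough sketch of the rename-and-restore idea in plain Python, not tied to how gms-artic or Nextflow would actually wire it in. The function names (`split_name`, `pseudonymize`, `restore`), the `KNOWN_SUFFIXES` list, and the assumption that the labID is the file name minus its extension are all hypothetical. It only touches file names, gives each file its own alias, and keeps the alias-to-labID dictionary in memory on the local side.

```python
import uuid
from pathlib import Path

# Assumed read-file extensions; adjust to whatever the pipeline really consumes.
KNOWN_SUFFIXES = (".fastq.gz", ".fq.gz", ".fastq", ".fq")


def split_name(filename):
    """Split a file name into (lab_id, extension) using KNOWN_SUFFIXES."""
    for suffix in KNOWN_SUFFIXES:
        if filename.endswith(suffix):
            return filename[: -len(suffix)], suffix
    return filename, ""


def pseudonymize(directory):
    """Rename every file in `directory` to a random alias.

    Returns a dict {alias: lab_id} that stays on the local machine.
    """
    mapping = {}
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        lab_id, suffix = split_name(path.name)
        alias = "s" + uuid.uuid4().hex  # opaque, carries no lab information
        path.rename(path.with_name(alias + suffix))
        mapping[alias] = lab_id
    return mapping


def restore(directory, mapping):
    """Put the labID back into every file name that contains an alias."""
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        for alias, lab_id in mapping.items():
            if alias in path.name:
                path.rename(path.with_name(path.name.replace(alias, lab_id)))
                break
```

Calling `mapping = pseudonymize("input_dir")` before the cloud steps and `restore("results_dir", mapping)` afterwards (both directory names are placeholders) would cover the round trip, provided the cloud steps keep the alias somewhere in their output file names.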

What are the arguments for getting it done: Hopefully we can side-step some of the legal issues.

Task is considered finished when: