Open jbz2 opened 4 years ago
The goal is to have some rules about the data ready to provide to the consortium before next week's call. These will go into a document/PowerPoint (this worked for Justin), which will also list who to contact in the DCC as folks get started.
Box data
• It appears that Box is being used to share data for work in progress
• For example, RNA-Seq files uploaded by UNC
• Issues: no pipeline or metadata
• Therefore, should users be directed to these files, or should the files remain as they are now, i.e., a way to share work in progress?
• Some data are duplicated in PSC and Box. For example, Toshi and Ben are uploading to both Box and PSC, but Box is not always in sync with what's at PSC. For example, only some CN files are on Box, but not the clonality/cellularity or variant-calling files. Similarly, Toshi's files on Box are old. Therefore, it is not clear how to control what goes up on Box.
• One strategy: we are responsible only for PSC, and Box is "use at your own discretion"
Create a readme file in Patient Directory for users
• Define what files can be found in Box
• Define what files can be found in the PSC
• Perhaps a Venn diagram illustrating file location
• How dups are handled for file types
• What files are used in analysis
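One possible way to get the numbers behind the Venn diagram and the duplicate handling is to compare a file manifest from each location. The sketch below is only an illustration: it assumes each location can be summarized as a mapping from file name to checksum, and the function name `compare_locations`, the manifest format, and the example file names are all hypothetical, not an existing DCC tool or layout.

```python
from typing import Dict, NamedTuple, Set


class LocationReport(NamedTuple):
    box_only: Set[str]      # files found only on Box
    psc_only: Set[str]      # files found only at PSC
    in_sync: Set[str]       # files in both places with matching checksums
    out_of_sync: Set[str]   # files in both places whose checksums differ


def compare_locations(box: Dict[str, str], psc: Dict[str, str]) -> LocationReport:
    """Compare two manifests mapping file name -> checksum (assumed format)."""
    box_names, psc_names = set(box), set(psc)
    shared = box_names & psc_names
    # A shared file counts as in sync only if both checksums agree.
    in_sync = {name for name in shared if box[name] == psc[name]}
    return LocationReport(
        box_only=box_names - psc_names,
        psc_only=psc_names - box_names,
        in_sync=in_sync,
        out_of_sync=shared - in_sync,
    )


# Hypothetical manifests, for illustration only.
box_manifest = {"sampleA.cn.txt": "aaa", "sampleA.rnaseq.bam": "bbb"}
psc_manifest = {"sampleA.cn.txt": "zzz", "sampleA.variants.vcf": "ccc"}

report = compare_locations(box_manifest, psc_manifest)
for field, names in report._asdict().items():
    print(f"{field}: {sorted(names)}")
```

The four sets map directly onto the regions of a two-circle Venn diagram (with the overlap split into matching and stale copies), so the readme could list them per file type. Under the "PSC is authoritative" strategy, anything in `out_of_sync` would be treated as a stale Box copy.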