broadinstitute / image-profiling-workflow-template

Data processing workflows for image-based profiling experiments
BSD 3-Clause "New" or "Revised" License

Proposed Template Layout #2

Open gwaybio opened 4 years ago

gwaybio commented 4 years ago

We can discuss here what the template should include and how it should be structured.

gwaybio commented 4 years ago

Proposed structure:

profile-name_repository (e.g. 2015_10_05_DrugRepurposing_AravindSubramanian_GolubLab_Broad)
├── 0.download-data
│   ├── data
│   │   └── raw_sqlite_profiles_downloaded_from_figshare
│   ├── download-data.ipynb (notebook that will download and describe the data)
│   ├── download.sh (bash script that will execute the data download)
│   ├── README.md (instructions on how to execute the download)
│   └── upload-data.ipynb (example notebook, not to be executed, that records how the data was uploaded)
├── 1.process-profiles
│   ├── profiles
│   │   ├── batch_1
│   │   │   ├── plate_1
│   │   │   └── plate_2
│   │   └── batch_2
│   │       ├── plate_1
│   │       └── plate_2
│   ├── barcode_platemap.csv
│   ├── metadata
│   │   ├── batch_1
│   │   │   ├── plate_1
│   │   │   │   └── platemap.csv
│   │   │   └── plate_2
│   │   │       └── platemap.csv
│   │   └── batch_2
│   │       ├── plate_1
│   │       │   └── platemap.csv
│   │       └── plate_2
│   │           └── platemap.csv
│   ├── process-profiles.ipynb (notebook that will perform the standard processing pipeline; it can also create the directory structure above)
│   ├── process.sh (bash script that will execute profile processing)
│   └── README.md (instructions on how to execute processing)
├── environment.yml (conda file for package versions)
├── config.yml (very important to define; will include constants, e.g. project name, batch names, and other important config terms)
├── LICENSE.md (this may be an important consideration: should we include a license in the template, or should each project get its own?)
└── README.md (overall project README describing the experiment)
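
As a rough sketch of how process-profiles.ipynb could create the directory structure above, something like the following might work. The `BATCHES` mapping and the `scaffold` function are hypothetical names used only for illustration; in practice the batch and plate names would presumably come from config.yml rather than being hard-coded.

```python
from pathlib import Path

# Hypothetical constants for illustration; the proposal is to keep
# values like these in config.yml instead of hard-coding them.
BATCHES = {
    "batch_1": ["plate_1", "plate_2"],
    "batch_2": ["plate_1", "plate_2"],
}


def scaffold(root, batches):
    """Create the proposed folder layout under root.

    Returns the list of directories that were created or already existed.
    """
    root = Path(root)
    process_dir = root / "1.process-profiles"

    # Folder that holds the raw downloaded profiles.
    targets = [root / "0.download-data" / "data"]

    # One profiles folder and one metadata folder per batch/plate pair.
    for batch, plates in batches.items():
        for plate in plates:
            targets.append(process_dir / "profiles" / batch / plate)
            targets.append(process_dir / "metadata" / batch / plate)

    for target in targets:
        # exist_ok makes the call safe to repeat on an existing repository.
        target.mkdir(parents=True, exist_ok=True)
    return targets


if __name__ == "__main__":
    for path in scaffold("profile-name_repository", BATCHES):
        print(path)
```

Only directories are created here; files such as platemap.csv and barcode_platemap.csv would still need to be supplied per experiment.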

gwaybio commented 4 years ago

I'm wondering if we should also add another folder, 2.profiling-audit.

This would store plate layout analyses and replicate reproducibility backbones.
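
To make the replicate reproducibility idea a bit more concrete, here is a minimal sketch of one possible audit: the median pairwise Pearson correlation between replicate profiles of the same perturbation. This metric is an assumption on my part, not something settled in this thread, and the function names and input format (a dict mapping a perturbation name to a list of feature vectors) are hypothetical.

```python
from itertools import combinations
from statistics import median


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes neither sequence is constant (a constant sequence has zero
    variance and would cause a division by zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def replicate_correlation(profiles):
    """Median pairwise correlation across replicates, per perturbation.

    profiles maps a perturbation name to a list of replicate feature
    vectors. Perturbations with fewer than two replicates are skipped.
    """
    scores = {}
    for name, replicates in profiles.items():
        pairs = [pearson(a, b) for a, b in combinations(replicates, 2)]
        if pairs:
            scores[name] = median(pairs)
    return scores
```

A real audit notebook would more likely read the processed profiles with a dataframe library, but the grouping and pairwise comparison would follow the same shape.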