cogent3 / Cogent3Workshop

Materials for the Phylomania workshop
BSD 3-Clause "New" or "Revised" License

Flowchart wrangling and checking data #6

Closed by khiron 11 months ago

khiron commented 11 months ago

image

khiron commented 11 months ago
graph LR
    QC[Quality\nControl]
    download[Download\nRaw Data]
    gb[Genbank]
    build[Build\nResource]
    online(Published\ndata set)
    gbid([IDs])
    QC --> wrangled((Wrangled & \nChecked!))
    gbid --> gb --> download
    online --> download
    online --> gbid
    download --> build
    ensembl(Ensembl) --> download
    build --> QC
KatherineCaley commented 11 months ago

richard-merman

khiron commented 11 months ago

I think we need to flesh out the QC process with sample steps for checking that the data is consistent and appropriate for use, and then show each point in this pipeline where cogent3 can be used.

e.g.:

graph LR
    gbid((Published Genbank IDs)) --> gb[Genbank] --> startQC
    online((published online data set)) --> downloaded[downloaded data set] --> startQC
    ensembl((Ensembl)) --> build --> sample --> startQC
    endQC --> analysis((Analysis))
    subgraph "Quality control"
    startQC[begin quality control] 
    endQC[wrangled & checked data]
    consistent{consistent?} -->|no|cogent3
    consistent -->|yes|checked
    checked{checked?} -->|no|cogent3
    checked -->|yes|wrangled
    wrangled{wrangled?} -->|no|cogent3
    wrangled -->|yes|endQC
    cogent3 -->endQC
    startQC --> consistent
    end
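
The three decisions in the "Quality control" subgraph (consistent, checked, wrangled) could be sketched in plain Python roughly as follows. This is only an illustration under stated assumptions: every function name, the specific checks, and the allowed DNA character set are hypothetical and are not cogent3's API. In the workshop, each step would presumably be replaced by whatever cogent3 call fits that point in the pipeline.

```python
# Hypothetical sketch of the QC decisions in the flowchart.
# None of these names come from cogent3; sequences are a plain
# dict mapping sequence name -> sequence string.

DNA_CHARS = set("ACGTN-?")  # assumed set of acceptable characters


def is_consistent(seqs):
    """True if there is at least one sequence and all have the same length."""
    lengths = {len(s) for s in seqs.values()}
    return len(lengths) == 1


def is_checked(seqs):
    """True if every sequence uses only the assumed allowed characters."""
    return all(set(s.upper()) <= DNA_CHARS for s in seqs.values())


def wrangle(seqs):
    """Return a cleaned copy: names stripped, sequences upper-cased,
    internal whitespace removed."""
    return {name.strip(): "".join(s.split()).upper() for name, s in seqs.items()}


def is_wrangled(seqs):
    """True if the data is already in the cleaned form."""
    return wrangle(seqs) == seqs


def quality_control(seqs):
    """Return the names of the QC decisions the data fails, in flowchart order."""
    stages = [
        ("consistent", is_consistent),
        ("checked", is_checked),
        ("wrangled", is_wrangled),
    ]
    return [name for name, passed in stages if not passed(seqs)]


raw = {" seq1": "acgt", "seq2": "ACGT"}
print(quality_control(raw))           # lists the decisions that answer "no"
print(quality_control(wrangle(raw)))  # an empty list means the data reached endQC
```

An empty result corresponds to following the "yes" branch at every decision; any name in the list marks a point where the flowchart routes to cogent3 for fixing.
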
GavinHuttley commented 11 months ago

thanks @khiron !