YeoLab / outrigger

Create a *de novo* alternative splicing database, validate splicing events, and quantify percent spliced-in (Psi) from RNA-seq data
http://yeolab.github.io/outrigger/
BSD 3-Clause "New" or "Revised" License

Remember, flat is better than nested #79

Open olgabot opened 7 years ago

olgabot commented 7 years ago

Description

Right now, the outrigger output is a big nested mess:

$ tree outrigger_output
outrigger_output..........................................................index
├── index.................................................................index
│   ├── gtf...............................................................index
│   │   ├── gencode.vM10.annotation.gtf...................................index
│   │   ├── gencode.vM10.annotation.gtf.db................................index
│   │   └── novel_exons.gtf...............................................index
│   ├── exon_direction_junction_triples.csv...............................index
│   ├── mxe...............................................................index
│   │   ├── event.bed.....................................................index
│   │   ├── events.csv....................................................index
│   │   ├── exon1.bed.....................................................index
│   │   ├── exon2.bed.....................................................index
│   │   ├── exon3.bed.....................................................index
│   │   ├── exon4.bed.....................................................index
│   │   ├── intron.bed....................................................index
│   │   ├── splice_sites.csv...........................................validate
│   │   └── validated..................................................validate
│   │       └── events.csv.............................................validate
│   └── se................................................................index
│       ├── event.bed.....................................................index
│       ├── events.csv....................................................index
│       ├── exon1.bed.....................................................index
│       ├── exon2.bed.....................................................index
│       ├── exon3.bed.....................................................index
│       ├── intron.bed....................................................index
│       ├── splice_sites.csv...........................................validate
│       └── validated..................................................validate
│           └── events.csv.............................................validate
├── junctions.............................................................index
│   ├── metadata.csv......................................................index
│   └── reads.csv.........................................................index
└── psi.....................................................................psi
    ├── mxe.................................................................psi
    │   ├── psi.csv.........................................................psi
    │   └── summary.csv.....................................................psi
    ├── outrigger_psi.csv...................................................psi
    └── se..................................................................psi
        ├── psi.csv.........................................................psi
        └── summary.csv.....................................................psi

10 directories, 26 files

Proposal

Here's a proposed, flat structure of the outrigger_output folder:

$ ls outrigger_output
all_events.psi.csv
all_events.summary.csv
gencode.vM10.annotation.gtf
gencode.vM10.annotation.gtf.db
gencode.vM10.annotation.novel_exons.gtf
mxe.event.bed
mxe.events.csv
mxe.events.validated.csv
mxe.exon1.bed
mxe.exon2.bed
mxe.exon3.bed
mxe.exon4.bed
mxe.intron.bed
mxe.psi.csv
mxe.summary.csv
mxe.splice_sites.csv
se.event.bed
se.events.csv
se.events.validated.csv
se.exon1.bed
se.exon2.bed
se.exon3.bed
se.intron.bed
se.psi.csv
se.summary.csv
se.splice_sites.csv
junctions.csv.......................................[replacing `junctions/metadata.csv`]
junction_reads.csv
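
To make the renaming rule concrete, here is a minimal sketch of how the current nested paths could map onto the flat names listed above. The names `flat_name`, `SPECIAL_CASES`, `EVENT_TYPES`, and the `annotation_stem` parameter are hypothetical and are not part of outrigger. Two files are left out on purpose: `index/exon_direction_junction_triples.csv` has no flat name in the listing, and `all_events.summary.csv` has no nested counterpart.

```python
from pathlib import PurePosixPath

# Hypothetical lookup for nested paths whose flat name is not rule-based.
SPECIAL_CASES = {
    "junctions/metadata.csv": "junctions.csv",
    "junctions/reads.csv": "junction_reads.csv",
    "psi/outrigger_psi.csv": "all_events.psi.csv",
}

# Event types that appear in the listings above.
EVENT_TYPES = {"se", "mxe"}


def flat_name(nested_path, annotation_stem="gencode.vM10.annotation"):
    """Return the proposed flat filename for a path relative to outrigger_output."""
    if nested_path in SPECIAL_CASES:
        return SPECIAL_CASES[nested_path]

    parts = PurePosixPath(nested_path).parts

    # index/gtf/<file>: keep the name, but prefix novel exons with the annotation
    if len(parts) == 3 and parts[:2] == ("index", "gtf"):
        if parts[2] == "novel_exons.gtf":
            return f"{annotation_stem}.novel_exons.gtf"
        return parts[2]

    # index/<event>/validated/<name>.<ext> -> <event>.<name>.validated.<ext>
    if (len(parts) == 4 and parts[0] == "index"
            and parts[1] in EVENT_TYPES and parts[2] == "validated"):
        filename = PurePosixPath(parts[3])
        return f"{parts[1]}.{filename.stem}.validated{filename.suffix}"

    # index/<event>/<file> or psi/<event>/<file> -> <event>.<file>
    if len(parts) == 3 and parts[0] in ("index", "psi") and parts[1] in EVENT_TYPES:
        return f"{parts[1]}.{parts[2]}"

    # Anything else has no flat name in the proposal.
    raise ValueError(f"No flat name proposed for {nested_path!r}")
```

For example, `flat_name("index/mxe/validated/events.csv")` gives `mxe.events.validated.csv`, and `flat_name("psi/se/psi.csv")` gives `se.psi.csv`.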

@alaindomissy what do you think?

Versions

This would greatly change usability and break existing output paths, so it would be a 2.0.0 release.

EDIT: No exon4 for SE events; that was a copy/paste mistake.

olgabot commented 7 years ago

I guess the validated exons should also be explicit:

$ ls outrigger_output
all_events.psi.csv
all_events.summary.csv
gencode.vM10.annotation.gtf
gencode.vM10.annotation.gtf.db
gencode.vM10.annotation.novel_exons.gtf
mxe.event.bed
mxe.event.validated.bed
mxe.events.csv
mxe.events.validated.csv
mxe.exon1.bed
mxe.exon1.validated.bed
mxe.exon2.bed
mxe.exon2.validated.bed
mxe.exon3.bed
mxe.exon3.validated.bed
mxe.exon4.bed
mxe.exon4.validated.bed
mxe.intron.bed
mxe.intron.validated.bed
mxe.psi.csv
mxe.summary.csv
mxe.splice_sites.csv
se.event.bed
se.event.validated.bed
se.events.csv
se.events.validated.csv
se.exon1.bed
se.exon1.validated.bed
se.exon2.bed
se.exon2.validated.bed
se.exon3.bed
se.exon3.validated.bed
se.intron.bed
se.psi.csv
se.summary.csv
se.splice_sites.csv
junctions.csv.......................................[replacing `junctions/metadata.csv`]
junction_reads.csv
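
The naming pattern for the validated files in this listing can be sketched as a one-line rule: insert `.validated` before the final extension. The helper name `validated_name` is hypothetical and not part of outrigger, and the sketch does not decide which files actually get a validated counterpart.

```python
from pathlib import PurePosixPath


def validated_name(flat_filename):
    """Insert '.validated' before the final extension of a flat filename."""
    path = PurePosixPath(flat_filename)
    # stem keeps everything before the last dot, e.g. "mxe.exon1" for "mxe.exon1.bed"
    return f"{path.stem}.validated{path.suffix}"
```

With this rule, `mxe.exon1.bed` becomes `mxe.exon1.validated.bed` and `se.events.csv` becomes `se.events.validated.csv`, matching the names above.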