openvax / neoantigen-vaccine-pipeline

Bioinformatics pipeline for selecting patient-specific cancer neoantigen vaccines
Apache License 2.0

Alternative config format for sequencing data #118

Open iskandr opened 6 years ago

iskandr commented 6 years ago

Move type: paired-end to a top-level sequencing-read-type key, or some similar name. It could also optionally be set per normal/tumor/RNA in case those end up being different for some reason, but it seems unnecessary to specify it for each "fragment".

Also, anything we do to hide the term "fragment" from the user would be good, since it's confusing in the context of sequencing.

One possibility, instead of

rna:
    - fragment_id: L001
      type: paired-end
      r1: /inputs/pgv001-018/TD00410_PGV001_018_RNA/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L001/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L001_R1_001.fastq.gz
      r2: /inputs/pgv001-018/TD00410_PGV001_018_RNA/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L001/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L001_R2_001.fastq.gz
    - fragment_id: L002
      type: paired-end
      r1: /inputs/pgv001-018/TD00410_PGV001_018_RNA/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L002/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L002_R1_001.fastq.gz
      r2: /inputs/pgv001-018/TD00410_PGV001_018_RNA/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L002/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L002_R2_001.fastq.gz

have:

rna:
  type: paired-end
  files: 
     L001:
        - /inputs/pgv001-018/TD00410_PGV001_018_RNA/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L001/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L001_R1_001.fastq.gz
        - /inputs/pgv001-018/TD00410_PGV001_018_RNA/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L001/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L001_R2_001.fastq.gz
     L002:
        - /inputs/pgv001-018/TD00410_PGV001_018_RNA/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L002/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L002_R1_001.fastq.gz
        - /inputs/pgv001-018/TD00410_PGV001_018_RNA/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L002/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L002_R2_001.fastq.gz

or:

sequencing-read-type: paired-end
rna:
   L001:
      - /inputs/pgv001-018/TD00410_PGV001_018_RNA/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L001/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L001_R1_001.fastq.gz
      - /inputs/pgv001-018/TD00410_PGV001_018_RNA/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L001/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L001_R2_001.fastq.gz
   L002:
      - /inputs/pgv001-018/TD00410_PGV001_018_RNA/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L002/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L002_R1_001.fastq.gz
      - /inputs/pgv001-018/TD00410_PGV001_018_RNA/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L002/PGV001_018_TU_RNA_01_01_CAGATC_HKTYNBCX2_L002_R2_001.fastq.gz
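
To make the mapping between the two layouts concrete, here is a rough sketch of how an existing per-fragment list could be converted into the proposed type/files form (the second option above). It works on plain Python dicts as they would look after YAML parsing. The function name convert_sample and the short file names in the usage example are hypothetical, not part of the pipeline, and only the paired-end case shown in this issue is handled.

```python
def convert_sample(fragments):
    """Convert the current list-of-fragments layout into the proposed
    {"type": ..., "files": {lane_id: [r1, r2]}} layout.

    Hypothetical helper for illustration only; not pipeline code.
    """
    # The proposal assumes one read type per sample, so reject mixed input.
    read_types = {fragment["type"] for fragment in fragments}
    if len(read_types) != 1:
        raise ValueError(
            "expected exactly one read type per sample, got: %s"
            % sorted(read_types)
        )
    read_type = read_types.pop()

    # Only the paired-end case appears in the issue; other types are
    # deliberately left unhandled rather than guessed at.
    if read_type != "paired-end":
        raise NotImplementedError("unsupported read type: %s" % read_type)

    files = {}
    for fragment in fragments:
        lane_id = fragment["fragment_id"]
        if lane_id in files:
            raise ValueError("duplicate fragment_id: %s" % lane_id)
        # R1 first, then R2, matching the order in the proposed config.
        files[lane_id] = [fragment["r1"], fragment["r2"]]

    return {"type": read_type, "files": files}


# Usage with shortened, made-up file names.
old_rna = [
    {"fragment_id": "L001", "type": "paired-end",
     "r1": "L001_R1.fastq.gz", "r2": "L001_R2.fastq.gz"},
    {"fragment_id": "L002", "type": "paired-end",
     "r1": "L002_R1.fastq.gz", "r2": "L002_R2.fastq.gz"},
]
new_rna = convert_sample(old_rna)
```

Keying the files by lane ID keeps the user-facing config free of the word "fragment" while preserving the same information, and the R1/R2 roles become implicit in list order.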