UCSF-Costello-Lab / LG3_Pipeline

The original LG3 pipeline
https://github.com/UCSF-Costello-Lab/LG3_Pipeline

TESTS/DEMO: How to run merge + re-recal? #38

Open HenrikBengtsson opened 6 years ago

HenrikBengtsson commented 6 years ago

What are the instructions for running _run_Merge and _run_Recal_pass2? Where is the (new?) input/raw data?

HenrikBengtsson commented 6 years ago

Which set of steps is meant to be used when merging?

  1. _run_Trim
  2. _run_Align_gz
  3. _run_Merge
  4. _run_Recal_pass2
  5. _run_Pindel && _run_MutDet

or

  1. _run_Trim
  2. _run_Align_gz
  3. _run_Recal <== difference
  4. _run_Merge
  5. _run_Recal_pass2
  6. _run_Pindel && _run_MutDet

Also, you said you prepared FASTQ files to run these tests - if so, where are they?

ivan108 commented 6 years ago

The second sequence is correct. I don't have prepared FASTQ files for this test; I just used the same samples from Patient157t. In the _run_Merge step I merged the recalibrated Z00600t + Z00601t samples and named the resulting files Y00600t.* Then I replaced the conversion file with runs_demo/patient_ID_conversions.test2.tsv and proceeded with _run_Recal_pass2 using Z00599t (as Normal) and Y00600t. After that, mutation calling runs as usual.

shuntsman-ucsf commented 6 years ago

So is the above "sequence 2" still the correct order, and is it ready for me to try? Is merge_qc not used? I currently have all files through “_trim” and am about to run “_align”.

ivan108 commented 6 years ago

The scenario with merging was tested, but not as extensively as the main pipeline.

Let's consider an example patient with three runs: S1, S2a, and S2b, where S2a and S2b come from the same sample and are the ones we want to merge.

In phase 1 we treat all runs as independent samples and run the following steps on S1, S2a, and S2b:

_run_Trim
_run_Align_gz
_run_Recal

Then we run either _run_Merge or _run_Merge_QC, which merges S2a + S2b => S2. _run_Merge_QC does the same merging as _run_Merge and additionally performs some QC analysis; the choice is yours.

In phase 2 we run the following steps on S1 and S2:

_run_Recal_pass2
_run_Pindel
_run_MutDet
_run_PostMut
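
The two phases described here can be sketched as a small planning helper. This is only an illustration, not part of the pipeline: the function `plan_merge_workflow` and its arguments are hypothetical, it does not call any of the `_run_*` scripts, and it only records which step names from this comment apply to which samples.

```python
from typing import Dict, List

# Step names as listed in this thread; how each script is invoked is not shown here.
PHASE1_STEPS = ["_run_Trim", "_run_Align_gz", "_run_Recal"]
PHASE2_STEPS = ["_run_Recal_pass2", "_run_Pindel", "_run_MutDet", "_run_PostMut"]


def plan_merge_workflow(
    runs: List[str],
    merges: Dict[str, List[str]],
    merge_step: str = "_run_Merge",  # or "_run_Merge_QC"
) -> dict:
    """Hypothetical helper: lay out which samples each step operates on.

    runs   -- all sequencing runs, treated as independent samples in phase 1
    merges -- merged sample name -> runs that are merged into it
    """
    merged_inputs = {run for parts in merges.values() for run in parts}
    missing = merged_inputs - set(runs)
    if missing:
        raise ValueError(f"merge inputs not among runs: {sorted(missing)}")

    # Phase 2 uses the unmerged runs plus the merged samples.
    phase2_samples = [r for r in runs if r not in merged_inputs] + list(merges)

    return {
        "phase1": [(step, list(runs)) for step in PHASE1_STEPS],
        "merge": [(merge_step, list(parts), target) for target, parts in merges.items()],
        "phase2": [(step, phase2_samples) for step in PHASE2_STEPS],
    }


plan = plan_merge_workflow(["S1", "S2a", "S2b"], {"S2": ["S2a", "S2b"]})
for phase in ("phase1", "merge", "phase2"):
    for entry in plan[phase]:
        print(phase, *entry)
```

For the example patient, phase 1 covers S1, S2a and S2b, the merge step turns S2a + S2b into S2, and every phase 2 step sees only S1 and S2.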

Important! Phase 2 needs a modified "conversion" file in which the patient's S2a and S2b entries are replaced by a single entry for the merged sample S2.
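
As a rough sketch of that conversion-file change: the snippet below collapses the rows of the merged runs into one row for the merged sample. The layout of the real conversion file is not described in this thread, so everything about the format is an assumption: a tab-separated file with a header row, and a sample-ID column whose name is passed in as `sample_col`. The function name `collapse_merged_samples` and the demo columns are hypothetical. The other columns of the merged row are simply copied from the first merged run, which may not be right for the real file, so check the result by hand.

```python
import csv
import io
from typing import Dict, List


def collapse_merged_samples(
    tsv_text: str, merges: Dict[str, List[str]], sample_col: str
) -> str:
    """Replace rows for merged runs with one row per merged sample.

    Assumes a tab-separated table with a header row (an assumption about
    the conversion file, not something confirmed in this thread).
    """
    part_to_target = {p: target for target, parts in merges.items() for p in parts}

    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()

    already_written = set()
    for row in reader:
        target = part_to_target.get(row[sample_col])
        if target is not None:
            if target in already_written:
                continue  # a row for this merged sample was already emitted
            already_written.add(target)
            # Keep the other columns from the first merged run (assumption).
            row = {**row, sample_col: target}
        writer.writerow(row)
    return out.getvalue()


# Made-up two-column table; the real file's columns may differ.
demo = "sample\tpatient\nS1\tP1\nS2a\tP1\nS2b\tP1\n"
result = collapse_merged_samples(demo, {"S2": ["S2a", "S2b"]}, sample_col="sample")
print(result)
```

Writing the result to a new file, rather than overwriting the phase 1 conversion file, keeps both versions available.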