Clinical-Genomics / demultiplexing

To keep scripts associated with execution of the Illumina demultiplexing pipeline

Add reverse complementing index 2 in samplesheet based on Novaseq Control Software version and reagent kit version #123

Closed barrystokman closed 3 years ago

barrystokman commented 3 years ago

This PR adds construction of a NovaSeq sample sheet based on two new run parameters: the NovaSeq Control Software version and the reagent kit version. Depending on these, index 2 is reverse complemented in the sample sheet.

This PR solves issue https://github.com/Clinical-Genomics/demultiplexing/issues/111
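
A minimal sketch of the behaviour described above, for readers who have not opened the diff. The function names and both threshold parameters are hypothetical placeholders, not taken from this PR; the actual conditions are defined in the code under review.

```python
# Hypothetical sketch: names and thresholds are assumptions, not this PR's code.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


def parse_version(version: str) -> tuple:
    """Turn a dotted version string into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


def needs_reverse_complement(
    control_software_version: str,
    reagent_kit_version: str,
    min_software_version: str,  # placeholder threshold supplied by the caller
    required_kit_version: str,  # placeholder value supplied by the caller
) -> bool:
    """Decide from the two run parameters whether index 2 must be reversed."""
    return (
        parse_version(control_software_version) >= parse_version(min_software_version)
        and reagent_kit_version == required_kit_version
    )


def index2_for_sheet(index2: str, reverse: bool) -> str:
    """Return index 2 as it should be written to the sample sheet."""
    return reverse_complement(index2) if reverse else index2
```

Passing the thresholds in as arguments keeps the sketch neutral about the real cut-off values, which are not stated in this thread.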

barrystokman commented 3 years ago

TC1:

(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ demux sheet fetch -a nova -p -L HGJJKDSXY > /home/hiseq.clinical/STAGE/novaseq/runs/201111_A00621_0305_BHGJJKDSXY/TC1_samplesheet.csv
(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ less TC1_samplesheet.csv | head
[Data]
FCID,Lane,SampleID,SampleRef,index,index2,SampleName,Control,Recipe,Operator,Project
HGJJKDSXY,1,ACC7198A3,hg19,CGTTAGAAAT,TTCAGGTCAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A16,hg19,GACCTGAAAT,TTGGTGAGAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A18,hg19,TCTCTACTAT,CGCGGTTCAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A7,hg19,GATTCTGCAT,GACGAGAGAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A20,hg19,CTCTCGTCAT,TATAACCTAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A9,hg19,TCGTAGTGAT,AGACTTGGAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A13,hg19,TAAGTGGTAT,CTTAAGCCAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A15,hg19,CGGACAACAT,TCCGGATTAC,300186,N,R1,script,300186
(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ diff HGJJKDSXY_prod_samplesheet.csv TC1_samplesheet.csv 

:heavy_check_mark:
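
The empty `diff` output above means the generated sheet is identical to the production one. For a column-aware comparison, something like the following could be used. This is only a sketch: the helper names are hypothetical, and it assumes each sheet consists of a `[Data]` line followed by a CSV header and rows, as in the output above.

```python
import csv
import io


def read_data_rows(sheet_text: str) -> list:
    """Parse the rows below the [Data] line into dicts keyed by column name."""
    lines = [line.strip() for line in sheet_text.strip().splitlines()]
    start = lines.index("[Data]") + 1  # the CSV header follows the section marker
    return list(csv.DictReader(io.StringIO("\n".join(lines[start:]))))


def differing_rows(first: str, second: str) -> list:
    """Return (row_a, row_b) pairs that differ between two sheets."""
    rows_a, rows_b = read_data_rows(first), read_data_rows(second)
    if len(rows_a) != len(rows_b):
        raise ValueError("sheets have a different number of rows")
    return [(a, b) for a, b in zip(rows_a, rows_b) if a != b]
```

An empty list from `differing_rows` corresponds to the empty `diff` output shown in this test case.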

barrystokman commented 3 years ago

TC2:

(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ demux sheet fetch -a nova -p -L HGJJKDSXY > /home/hiseq.clinical/STAGE/novaseq/runs/201111_A00621_0305_BHGJJKDSXY/TC2_samplesheet.csv
(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ less TC2_samplesheet.csv | head
[Data]
FCID,Lane,SampleID,SampleRef,index,index2,SampleName,Control,Recipe,Operator,Project
HGJJKDSXY,1,ACC7198A3,hg19,CGTTAGAAAT,TTCAGGTCAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A16,hg19,GACCTGAAAT,TTGGTGAGAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A18,hg19,TCTCTACTAT,CGCGGTTCAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A7,hg19,GATTCTGCAT,GACGAGAGAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A20,hg19,CTCTCGTCAT,TATAACCTAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A9,hg19,TCGTAGTGAT,AGACTTGGAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A13,hg19,TAAGTGGTAT,CTTAAGCCAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A15,hg19,CGGACAACAT,TCCGGATTAC,300186,N,R1,script,300186
(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ diff HGJJKDSXY_prod_samplesheet.csv TC2_samplesheet.csv 

:heavy_check_mark:

barrystokman commented 3 years ago

TC3:

(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ demux sheet fetch -a nova -p -L HGJJKDSXY > /home/hiseq.clinical/STAGE/novaseq/runs/201111_A00621_0305_BHGJJKDSXY/TC3_samplesheet.csv
(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ less TC3_samplesheet.csv | head
[Data]
FCID,Lane,SampleID,SampleRef,index,index2,SampleName,Control,Recipe,Operator,Project
HGJJKDSXY,1,ACC7198A3,hg19,CGTTAGAAAT,GACCTGAAGT,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A16,hg19,GACCTGAAAT,CTCACCAAGT,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A18,hg19,TCTCTACTAT,GAACCGCGGT,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A7,hg19,GATTCTGCAT,CTCTCGTCGT,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A20,hg19,CTCTCGTCAT,AGGTTATAGT,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A9,hg19,TCGTAGTGAT,CCAAGTCTGT,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A13,hg19,TAAGTGGTAT,GGCTTAAGGT,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A15,hg19,CGGACAACAT,AATCCGGAGT,300186,N,R1,script,300186

:heavy_check_mark:
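
Reading the TC3 output against TC1 and TC4: the first eight bases of each index2 match the unpadded, reverse-complemented values in TC4, followed by `GT`, whereas TC1 and TC2 end in `AC` and index 1 ends in `AT`. This is consistent with padding being appended after reverse complementing. A small sketch of that padding step follows; the function name is hypothetical, and the pad sequences are read off the outputs in this thread, not taken from the source code.

```python
def pad_index(index: str, pad_sequence: str, target_length: int) -> str:
    """Append bases from pad_sequence until index reaches target_length."""
    missing = max(target_length - len(index), 0)
    return index + pad_sequence[:missing]


# Values copied from the TC1, TC3 and TC4 outputs for sample ACC7198A3.
padded_index1 = pad_index("CGTTAGAA", "AT", 10)
padded_index2_forward = pad_index("TTCAGGTC", "AC", 10)
padded_index2_reversed = pad_index("GACCTGAA", "GT", 10)
```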

barrystokman commented 3 years ago

TC4:

(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ demux sheet fetch -a nova -p -L HGJJKDSXY > /home/hiseq.clinical/STAGE/novaseq/runs/201111_A00621_0305_BHGJJKDSXY/TC4_samplesheet.csv
(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ less TC4_samplesheet.csv | head
[Data]
FCID,Lane,SampleID,SampleRef,index,index2,SampleName,Control,Recipe,Operator,Project
HGJJKDSXY,1,ACC7198A3,hg19,CGTTAGAA,GACCTGAA,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A16,hg19,GACCTGAA,CTCACCAA,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A18,hg19,TCTCTACT,GAACCGCG,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A7,hg19,GATTCTGC,CTCTCGTC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A20,hg19,CTCTCGTC,AGGTTATA,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A9,hg19,TCGTAGTG,CCAAGTCT,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A13,hg19,TAAGTGGT,GGCTTAAG,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A15,hg19,CGGACAAC,AATCCGGA,300186,N,R1,script,300186

:heavy_check_mark:

barrystokman commented 3 years ago

TC5:

(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ demux sheet fetch -a nova -L HGJJKDSXY
[Data]
FCID,Lane,SampleID,SampleRef,index,index2,SampleName,Control,Recipe,Operator,Project
HGJJKDSXY,1,ACC7198A3,hg19,CGTTAGAA,TTCAGGTC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A16,hg19,GACCTGAA,TTGGTGAG,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A18,hg19,TCTCTACT,CGCGGTTC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A7,hg19,GATTCTGC,GACGAGAG,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A20,hg19,CTCTCGTC,TATAACCT,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A9,hg19,TCGTAGTG,AGACTTGG,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A13,hg19,TAAGTGGT,CTTAAGCC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A15,hg19,CGGACAAC,TCCGGATT,300186,N,R1,script,300186

:heavy_check_mark:

barrystokman commented 3 years ago

TC6:

(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ demux sheet fetch -a nova -p HGJJKDSXY
Please specify an index length when using the pad option! Use --longest or --indexlength.
Aborted!

:heavy_check_mark:
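
The check exercised by TC6 can be sketched as a plain validation function. This is an assumption about the shape of the logic, not the CLI's actual implementation; the function and parameter names are hypothetical, and only the error message is taken from the output above.

```python
from typing import Optional


def validate_pad_options(pad: bool, longest: bool, index_length: Optional[int]) -> None:
    """Reject padding when no target index length can be determined."""
    if pad and not longest and index_length is None:
        raise ValueError(
            "Please specify an index length when using the pad option! "
            "Use --longest or --indexlength."
        )
```

Padding without a target length has no defined result, so failing early is safer than guessing a length.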

barrystokman commented 3 years ago

TC7:

(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ demux sheet fetch -a nova HGJJKDSXY
[Data]
FCID,Lane,SampleID,SampleRef,index,index2,SampleName,Control,Recipe,Operator,Project
HGJJKDSXY,1,ACC7198A3,hg19,CGTTAGAA,TTCAGGTC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A16,hg19,GACCTGAA,TTGGTGAG,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A18,hg19,TCTCTACT,CGCGGTTC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A7,hg19,GATTCTGC,GACGAGAG,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A20,hg19,CTCTCGTC,TATAACCT,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A9,hg19,TCGTAGTG,AGACTTGG,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A13,hg19,TAAGTGGT,CTTAAGCC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A15,hg19,CGGACAAC,TCCGGATT,300186,N,R1,script,300186

:heavy_check_mark:

barrystokman commented 3 years ago

To be deployed together with https://github.com/Clinical-Genomics/servers/pull/461

barrystokman commented 3 years ago

Tests and test fixtures are WIP

barrystokman commented 3 years ago

RETESTS:

TC1:

(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ demux-stage sheet fetch -a nova -p -L HGJJKDSXY > TC1_samplesheet_RETEST.csv
(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ less TC1_samplesheet_RETEST.csv | head
[Data]
FCID,Lane,SampleID,SampleRef,index,index2,SampleName,Control,Recipe,Operator,Project
HGJJKDSXY,1,ACC7198A3,hg19,CGTTAGAAAT,TTCAGGTCAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A16,hg19,GACCTGAAAT,TTGGTGAGAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A18,hg19,TCTCTACTAT,CGCGGTTCAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A7,hg19,GATTCTGCAT,GACGAGAGAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A20,hg19,CTCTCGTCAT,TATAACCTAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A9,hg19,TCGTAGTGAT,AGACTTGGAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A13,hg19,TAAGTGGTAT,CTTAAGCCAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A15,hg19,CGGACAACAT,TCCGGATTAC,300186,N,R1,script,300186
(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ diff TC1_samplesheet.csv TC1_samplesheet_RETEST.csv 

:heavy_check_mark:

TC2:

(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ demux-stage sheet fetch -a nova -p -L HGJJKDSXY > TC2_samplesheet_RETEST.csv
(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ less TC2_samplesheet_RETEST.csv | head
[Data]
FCID,Lane,SampleID,SampleRef,index,index2,SampleName,Control,Recipe,Operator,Project
HGJJKDSXY,1,ACC7198A3,hg19,CGTTAGAAAT,TTCAGGTCAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A16,hg19,GACCTGAAAT,TTGGTGAGAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A18,hg19,TCTCTACTAT,CGCGGTTCAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A7,hg19,GATTCTGCAT,GACGAGAGAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A20,hg19,CTCTCGTCAT,TATAACCTAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A9,hg19,TCGTAGTGAT,AGACTTGGAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A13,hg19,TAAGTGGTAT,CTTAAGCCAC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A15,hg19,CGGACAACAT,TCCGGATTAC,300186,N,R1,script,300186
(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ diff TC2_samplesheet.csv TC2_samplesheet_RETEST.csv 

:heavy_check_mark:

TC3:

(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ demux-stage sheet fetch -a nova -p -L HGJJKDSXY > TC3_samplesheet_RETEST.csv
(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ less TC3_samplesheet_RETEST.csv | head
[Data]
FCID,Lane,SampleID,SampleRef,index,index2,SampleName,Control,Recipe,Operator,Project
HGJJKDSXY,1,ACC7198A3,hg19,CGTTAGAAAT,GACCTGAAGT,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A16,hg19,GACCTGAAAT,CTCACCAAGT,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A18,hg19,TCTCTACTAT,GAACCGCGGT,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A7,hg19,GATTCTGCAT,CTCTCGTCGT,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A20,hg19,CTCTCGTCAT,AGGTTATAGT,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A9,hg19,TCGTAGTGAT,CCAAGTCTGT,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A13,hg19,TAAGTGGTAT,GGCTTAAGGT,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A15,hg19,CGGACAACAT,AATCCGGAGT,300186,N,R1,script,300186
(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ diff TC3_samplesheet.csv TC3_samplesheet_RETEST.csv 

:heavy_check_mark:

TC4:

(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ demux-stage sheet fetch -a nova -p -L HGJJKDSXY > TC4_samplesheet_RETEST.csv
(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ less TC4_samplesheet_RETEST.csv | head
[Data]
FCID,Lane,SampleID,SampleRef,index,index2,SampleName,Control,Recipe,Operator,Project
HGJJKDSXY,1,ACC7198A3,hg19,CGTTAGAA,GACCTGAA,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A16,hg19,GACCTGAA,CTCACCAA,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A18,hg19,TCTCTACT,GAACCGCG,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A7,hg19,GATTCTGC,CTCTCGTC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A20,hg19,CTCTCGTC,AGGTTATA,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A9,hg19,TCGTAGTG,CCAAGTCT,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A13,hg19,TAAGTGGT,GGCTTAAG,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A15,hg19,CGGACAAC,AATCCGGA,300186,N,R1,script,300186
(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ diff TC4_samplesheet.csv TC4_samplesheet_RETEST.csv 

TC5:

(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ demux-stage sheet fetch -a nova -L HGJJKDSXY > TC5_samplesheet_RETEST.csv
(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ less TC5_samplesheet_RETEST.csv | head
[Data]
FCID,Lane,SampleID,SampleRef,index,index2,SampleName,Control,Recipe,Operator,Project
HGJJKDSXY,1,ACC7198A3,hg19,CGTTAGAA,TTCAGGTC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A16,hg19,GACCTGAA,TTGGTGAG,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A18,hg19,TCTCTACT,CGCGGTTC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A7,hg19,GATTCTGC,GACGAGAG,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A20,hg19,CTCTCGTC,TATAACCT,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A9,hg19,TCGTAGTG,AGACTTGG,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A13,hg19,TAAGTGGT,CTTAAGCC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A15,hg19,CGGACAAC,TCCGGATT,300186,N,R1,script,300186

:heavy_check_mark:

TC6:

(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ demux-stage sheet fetch -a nova -p HGJJKDSXY
Please specify an index length when using the pad option! Use --longest or --index_length.
Aborted!

:heavy_check_mark:

TC7:

(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ demux-stage sheet fetch -a nova HGJJKDSXY > TC7_samplesheet_RETEST.csv
(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ less TC7_samplesheet_RETEST.csv
(stage)[hiseq.clinical@thalamus 201111_A00621_0305_BHGJJKDSXY]$ less TC7_samplesheet_RETEST.csv | head
[Data]
FCID,Lane,SampleID,SampleRef,index,index2,SampleName,Control,Recipe,Operator,Project
HGJJKDSXY,1,ACC7198A3,hg19,CGTTAGAA,TTCAGGTC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A16,hg19,GACCTGAA,TTGGTGAG,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A18,hg19,TCTCTACT,CGCGGTTC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A7,hg19,GATTCTGC,GACGAGAG,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A20,hg19,CTCTCGTC,TATAACCT,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A9,hg19,TCGTAGTG,AGACTTGG,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A13,hg19,TAAGTGGT,CTTAAGCC,300186,N,R1,script,300186
HGJJKDSXY,1,ACC7198A15,hg19,CGGACAAC,TCCGGATT,300186,N,R1,script,300186

:heavy_check_mark:

barrystokman commented 3 years ago

Bumped: image

Deployed on thalamus: image