Open vlakhujani opened 4 years ago
(1) we use version 3 of the GATK software from Broad Institute to compute the Depth of Coverage on a given bam file.
@theodorc
I am looking at the usage doc. Where is the GATK v3 file used as input?
Additionally, how do I create the panel file?
Exon_Target Gene_Exon Call_CNV RefSeq
1:1220087-1220186 SNP_1 N rs2144440
1:3083663-3083762 SNP_2 N rs2651899
1:3611843-3611942 SNP_3 N rs3765731
1:6279321-6279420 RNF207-001_18 N rs846111
1:8487274-8487373 SNP_4 N rs301797
1:11850737-11850955 MTHFR-001_11 Y NM_005957_cds_0
1:11851264-11851383 MTHFR-001_10 Y NM_005957_cds_1
1:11852335-11852436 MTHFR-001_9 Y NM_005957_cds_2
1:11853964-11854146 MTHFR-001_8 Y NM_005957_cds_3
What does the Gene_Exon column contain: SNP IDs or gene/exon IDs? Also, does the "RefSeq" column contain dbSNP rs IDs? Is that correct? I also see NM IDs (transcript IDs).
And finally, the Call_CNV column contains yes/no values. How do I make that decision?
Sorry for the late response. I hope the comments below help.
For GATK, see the config file. It has a variable that specifies the directory (and file name format) where you keep the GATK Depth of Coverage files: GATKDIR=GATK_DoC/[SAMPLE_FCLBC].DATA.sample_interval_summary
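To make the file name format concrete, here is a minimal Python sketch of how one expected Depth of Coverage path per sample could be derived from that GATKDIR value. It assumes the `[SAMPLE_FCLBC]` placeholder is simply replaced by a sample identifier; the helper names and the sample identifier are hypothetical, and the tool's own substitution logic may differ.

```python
from pathlib import Path

# Template copied from the config variable quoted above.
GATKDIR = "GATK_DoC/[SAMPLE_FCLBC].DATA.sample_interval_summary"

# Assumption: the tool swaps this placeholder for each sample's identifier.
PLACEHOLDER = "[SAMPLE_FCLBC]"


def expected_doc_path(sample_id: str, template: str = GATKDIR) -> Path:
    """Return the path where the coverage file for sample_id is expected."""
    if PLACEHOLDER not in template:
        raise ValueError(f"template has no {PLACEHOLDER} placeholder")
    return Path(template.replace(PLACEHOLDER, sample_id))


def missing_doc_files(sample_ids, template: str = GATKDIR):
    """List the expected coverage files that do not exist on disk."""
    paths = (expected_doc_path(s, template) for s in sample_ids)
    return [p.as_posix() for p in paths if not p.is_file()]


# "SAMPLE1_FC1_L1_BC1" is a made-up identifier used only for illustration.
example_path = expected_doc_path("SAMPLE1_FC1_L1_BC1").as_posix()
```

Running `missing_doc_files` over your sample list before starting the analysis is a quick way to check that every GATK output sits where the config expects it.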
You create the panel file yourself in your favorite editor. It is usually based on the capture design you used for the sequencing. For example, a cancer panel will contain cancer genes and their exon target coordinates, etc.
The Gene_Exon column is the name of the target exon. In the example, I used the gene name MTHFR, -001 for the transcript ID, and _11 for the exon. The same idea applies to the RefSeq column.
Finally, the Call_CNV column designates whether you want to include the given target in the analysis. Usually you say N if you know the target is not reliable when the data is produced (i.e. the target is too small or the data is known to be noisy).
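If you would rather generate the panel file than type it by hand, the column layout described above can be sketched in Python as follows. This is only an illustration: the tab delimiter is an assumption (check it against the example panel shipped with the tool), the function names are hypothetical, and the Y/N choice is passed in explicitly because it depends on what you know about each target.

```python
import csv
import io

# Column names taken from the example panel file earlier in this thread.
HEADER = ["Exon_Target", "Gene_Exon", "Call_CNV", "RefSeq"]


def panel_row(chrom, start, end, gene_exon, refseq, call_cnv):
    """Build one panel line; call_cnv=False marks a target to leave out."""
    if start > end:
        raise ValueError(f"start {start} is after end {end}")
    target = f"{chrom}:{start}-{end}"
    return [target, gene_exon, "Y" if call_cnv else "N", refseq]


def render_panel(rows, delimiter="\t"):
    """Return the panel file content as text (delimiter is an assumption)."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter, lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerows(rows)
    return buf.getvalue()


# Two targets copied from the example above: one SNP target that is
# excluded (N) and one MTHFR exon that is included (Y).
rows = [
    panel_row("1", 1220087, 1220186, "SNP_1", "rs2144440", call_cnv=False),
    panel_row("1", 11850737, 11850955, "MTHFR-001_11", "NM_005957_cds_0", call_cnv=True),
]
panel_text = render_panel(rows)
```

Writing `panel_text` to a file gives a header line plus one line per target, in the same column order as the example.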
Where can I find more information on how to create the panel and the sample files?
I went through the paper and it says
The first file is not mentioned in the GitHub README.
I am really confused. Please help.