Closed georgiiprovisor closed 3 years ago
Oh, it totally not work with hg38, i think. SNPs not matching with CNV when I choose hg38, but with hg19 all looks fine Is it possible to run CNVFilteR on samples with hg38 reference?
Hi @georgiiprovisor
Thanks for reporting the issue. You are right, the function plotAllCNVs()
only works for the "hg19" genome. Let me do some checks and I will upload an update to make it work on any genome version.
Thank you! You will update only plotAllCNVs function or all functions?
After finishing the checks, I will update any function that did not work with the hg38 genome. I expect to do so during next week.
Hi @georgiiprovisor
I have aligned the AK1-genome sample against the HG38 reference. After multiple checks, the only function that didn't work was plotAllCNVs()
. I will update it soon so CNVs will be plotted as expected:
So, all the other functions worked for the HG38 genome. Please, remember to set genome = "hg38"
in the loadCNVcalls()
and loadVCFs()
functions because "hg19" is the default value.
Hello!
I run this command
cnvs060126NORMAL <- loadCNVcalls(cnvs.file = '/home/georgii/Nsc/CNV/CNvfilter_tests/060126_normal.tsv', chr.column = "chromosome", start.column = "start", end.column = "end", cnv.column = "CNV", deletion = "Del", duplication = "Dup", genome = "hg38", sample.name = '06_0126_normal', ignore.unexpected.rows = TRUE)
vcfs_060126_normal <- loadVCFs('/home/georgii/Nsc/CNV/CNvfilter_tests/06-0126_normal_2nd_ed_recal.vcf', cnvs.gr = cnvs060126NORMAL, vcf.source = "HaplotypeCaller", genome = "hg38")
results06_0126 <- filterCNVs(cnvs060126NORMAL, vcfs_060126_normal)
But in results06_0126[["cnvs"]]@elementMetadata
no SNPs connected with CNV
DataFrame with 230 rows and 10 columns
cnv sample cnv.id filter n.total.variants n.hm.variants n.ht.discard.CNV n.ht.confirm.CNV ht.pct score
Please, could you provide an example of a CNVs file and a VCF file for which the function does not work as expected? In my HG38 example SNPs overlap correctly with the CNVs, but I wonder if I am missing something that I cannot see with my example.
(By the way, remember that the loadVCFs()
function will discard SNVs with a min total depth < 10, by default).
Thanks :)
060126_normal.tsv.gz 06-0126_normal_2nd_ed_recal__.vcf.gz
I attach full files (vcf and tsv with CNV). Few SNPs with DP > 10 should overlap 1-st Del from tsv file (for example chr1:14653 ; chr1:873251 ; chr1:931131)
Thank you for answering :)
Hi,
I discovered what is happening. In your loadCNVcalls()
call you indicated that the sample.name
is 06_0126_normal
. However, when loading the VCF (loadVCFs()
) CNVfilteR looks into the the VCF and extract the sample name: TCGA-06-0126-10A-01D-1490-08
is the one that appears in that file. So, if you indicate the correct sample name (TCGA-06-0126-10A-01D-1490-08) when loading the CNVs, it should work. I will modify the code to show a Warning message when the expected sample name was not found. This is the code I used:
cnvs.gr <- loadCNVcalls(cnvs.file = cnvsPath, chr.column = "chromosome", start.column = "start", end.column = "end", cnv.column = "CNV", deletion = "Del", duplication = "Dup", genome = "hg38", sample.name = 'TCGA-06-0126-10A-01D-1490-08', ignore.unexpected.rows = TRUE)
vcf <- loadVCFs(vcfPath, cnvs.gr = cnvs.gr, vcf.source = "HaplotypeCaller", genome = "hg38")
res <- filterCNVs(cnvs.gr, vcf, verbose = T)
And if you plot first CNV by doing plotVariantsForCNV(res, "1")
, you should obtain:
Remember that you can explore the results in res$cnvs
and res$variantsForEachCNV
.
Hope this helps.
José Marcos.
Oh, it was so stupid fail from my side. Sorry :) Thanks a lot!
Hi, CNVfilteR pacakge has been updated to version 1.5.1. Among other fixes, plotAllCNVs()
allows the genome
parameter.
Ошибка в mergeNamedAtomicVectors(genome(x), genome(y), what = c("sequence", : sequences chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY have incompatible genomes: