JEFworks-Lab / HoneyBADGER

HMM-integrated Bayesian approach for detecting CNV and LOH events from single-cell RNA-seq data
http://jef.works/HoneyBADGER/
GNU General Public License v3.0

refCount and altCount matrixes are everywhere zero #11

Closed elimereu closed 5 years ago

elimereu commented 5 years ago

Hi Jean,

I'm still at the initial file-preparation stage, and strangely, for several chromosomes I'm not getting any counts (counts = 0) when running getSnpMats. This happens in both the ref and alt matrices, even though the coverage is not always zero. Do you think that is possible?

My VCF file is from whole-genome sequencing, so I expect it to contain many more variants than are covered in the BAM files. I wanted to restrict it to exons only, but running these commands

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
exons <- exons(txdb)

withinRange <- function(rng) function(x) x
filters <- FilterRules(list(isSNV = isSNV, withinRange = withinRange(exons)))

filt.vcf <- filterVcf(vcfFile, "hg19", "HaplotypeCaller_PASS_dbSNP_annot.vcf.bgz", index = TRUE, filters = filters, verbose = TRUE)

didn't work on my VCF file. Since I don't usually work with VCF files, I ended up using the entire VCF.

Any suggestions are greatly appreciated.

Bests,

Elisabetta

JEFworks commented 5 years ago

Hi Elisabetta,

Hum, that’s strange. Can you please double check that the VCF file from your WGS is also aligned to hg19? Same with your bams?

When you say ‘didn’t work’ on your VCF files, do you mean there is an error or that the filt.vcf object is empty? Can you double check that your VCF file has SNVs annotated such that your filter rule is applicable? Some variant callers that output VCFs specifically throw away SNVs since people tend to be interested in deleterious mutations rather than common variants.

You can also try looking at the pileup of your bams in IGV or a similar viewer to double check that the SNVs in your VCF file are indeed present in the reads.
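As a rough illustration of the first check, here is a minimal sketch that compares the chromosome names used by a VCF and a BAM file. The example files shipped with VariantAnnotation and Rsamtools are stand-ins (an assumption, not the files from this issue), and matching names only rule out a naming mismatch such as "chr1" versus "1"; they do not prove that both files use the same genome build.

```r
library(VariantAnnotation)
library(Rsamtools)

# Assumption: packaged example files stand in for the real VCF and BAM.
vcfFile <- system.file("extdata", "chr22.vcf.gz", package = "VariantAnnotation")
bamFile <- system.file("extdata", "ex1.bam", package = "Rsamtools")

# Chromosome names as recorded in each file.
vcf <- readVcf(vcfFile, "hg19")
vcfChr <- seqlevels(vcf)
bamChr <- seqlevels(seqinfo(BamFile(bamFile)))

# Names present in both files; SNVs on any other chromosome
# cannot be matched to reads and would yield zero counts.
shared <- intersect(vcfChr, bamChr)

if (length(shared) == 0) {
  message("No chromosome names in common between VCF and BAM")
} else {
  message("Shared chromosomes: ", paste(shared, collapse = ", "))
}
```

With real data, any chromosome that appears in the VCF but not in `shared` would be a natural first place to look when counts come back as zero.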

Hope that helps, Jean


elimereu commented 5 years ago

Hi Jean,

Thanks for these suggestions. I double-checked with IGV and found that the problem was in how I created the GenomicRanges object: the metadata from the VCF (ALT, REF, etc.) was missing.
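For anyone hitting the same symptom, here is a minimal sketch of building a ranges object that keeps the REF and ALT columns. It assumes the VariantAnnotation package and uses its packaged example VCF as a stand-in for the real file; it is not necessarily how the object was built in this case.

```r
library(VariantAnnotation)

# Assumption: the packaged example VCF stands in for the real file.
vcfFile <- system.file("extdata", "chr22.vcf.gz", package = "VariantAnnotation")
vcf <- readVcf(vcfFile, "hg19")

# Keep only SNVs that have a single alternate allele.
keep <- isSNV(vcf, singleAltOnly = TRUE)
vcf <- vcf[keep, ]

# rowRanges() returns a GRanges whose metadata columns include REF and ALT,
# unlike a GRanges built by hand from chromosome and position alone.
snps <- rowRanges(vcf)

# Fail early if the allele columns are missing.
stopifnot(all(c("REF", "ALT") %in% colnames(mcols(snps))))

print(head(mcols(snps)[, c("REF", "ALT")]))
```

The resulting `snps` object is the kind of GRanges that would then be passed on to getSnpMats.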

Thanks a lot,

Elisabetta