Closed honzee closed 3 years ago
Dear Jay,
I have the same results with your vcf file. I have looked into it a bit.
What RNAseqCNV does is that it filters the detected SNVs based on their presence in dbSNP database at one point. For your file, this excludes the vast majority of SNVs and thus no MAF density plots are drawn.
I think this may be due to different reference genome versions. Do you know which genome reference did you use for generating this vcf file?
Thank you, Jan
Dear Jan, I have used hg38 for generating the vcf file.
Dear Jay,
I am sorry that I have not clearly stated, that this package is built upon GRch37. Therefore, for optimal analysis, the reference used should be GRch37. That said, there are a few options to go around this.
First, you can download your own dbSNP database version for hg38 and use this as a vector input for RNAseqCNV wrapper function. The format should look like this:
head(keepSNP) [1] "1-11012" "1-13110" "1-13116" "1-13118" "1-13273" "1-14464" RNAseqCNV_wrapper(config = "C:/Users/xaxa/config.txt", metadata = "C:/Users/xaxa/metadata", snv_format = "vcf", keptSNP = keepSNP)
Secondly, I updated the RNAseqCNV again in a way that it accepts argument: keptSNP = FALSE. In this way filtering through dbSNP database is avoided. Unfortunately, the quality of SNVs goes down significantly. To compensate for this, you should probably increase the minimum depth of SNVs that are accepted and lower the accepted MAF range. Use approximately this command after updating RNAseqCNV:
RNAseqCNV_wrapper(config = "C:/Users/xaxa/config.txt", metadata = "C:/Users/xaxa/metadata", snv_format = "vcf", keptSNP = FALSE, mafRange = c(0.25, 0.75), minDepth = 100)
I tried this for your file and it produced results, that could be worked with. However, the accuracy of prediction by RNAseqCNV will be lower and the quality of MAF density graphs will suffer also.
Anyway, looking forward, I am planning to implement also hg38.
Thank you for your patience Jay and I hope that RNAseqCNV will be helpful in some way for you. If you have any other questions, please do not hesitate to ask.
Best, Jan
Dear Jan, Thanks for your explanation but I didn't understood this-
First, you can download your own dbSNP database version for hg38 and use this as a vector input for RNAseqCNV wrapper function. The format should look like this:
head(keepSNP)
[1] "1-11012" "1-13110" "1-13116" "1-13118" "1-13273" "1-14464"
RNAseqCNV_wrapper(config = "C:/Users/xaxa/config.txt", metadata = "C:/Users/xaxa/metadata", snv_format = "vcf", keptSNP = keepSNP)
Can you explain it further?
Thanks Jay
Yes. The RNAseqCNV_wrapper function has a parameter called keptSNP. This parameter accepts a character vector of SNP identifiers e.g.: "1-11012" "1-13110" "1-13116"..
The SNPs have different identifiers based on the genome version. You can download the hg38 dbSNP database through the instructions here: http://genome.ucsc.edu/FAQ/FAQdownloads.html#snp And subsequently, extract the SNPs into a vector for input for RNAseqCNV_wrapper.
I hope this helps. Best, Jan
Dear Jan Sorry for a late reply. I will download the hg38 dbSNP database and let you know if I get any issue.
Thanks, Jay
The issue is being solved in: https://github.com/honzee/RNAseqCNV/issues/7#issue-792763374
From JAYRJPT:
Dear Jan, Thanks for the update. I am now able to run the package. However I like to inform you that in the final figure lower panel of the graph( MAF frequency) is mising. Is this the issue with my file or with the package? I have sent you the same file. Did you also faced the same issue? I have also got the following warning-
Warning messages: 1: In dir.create(path = chr_dir) : '/media/deepak/Deepak4T/NEW_SAMPLE/SAMPLE_5/RNASEQCNV/RNASEQCNV/output_dir/BALL5' already exists 2: Graphs cannot be vertically aligned unless the axis parameter is set. Placing graphs unaligned. 3: In max(peak_max) : no non-missing arguments to max; returning -Inf 4: Position guide is perpendicular to the intended axis. Did you mean to specify a different guide position?
Regards, Jay