Open charbel2998 opened 1 year ago
Hi @charbel2998 ,
For the following
vcfR object includes more than one chromosome (CHROM) Subsetting to the first chromosome
please see the following link.
https://knausb.github.io/vcfR_documentation/subset_data_to_1chrom.html
The chromR object is intended for a single chromosome (sequence).
For the WARNING
In create.chromR(name = "S17SNP", vcf = vcf, seq = genome, ann = ref, : Names in variant data and annotation data do not match perfectly. If you choose to proceed, we'll do our best to match the data. But prepare yourself for unexpected results.
Please see below.
https://knausb.github.io/vcfR_documentation/chromR_object.html
It has been my experience that the names do not always match. Sometimes we download them like this. Sometime a processing step we use creates the issue. For example, I believe bwa (and other aligners) tends to truncate FASTA sequence names on whitespace, resulting in sequence names you may not have expected. You can validate this by comparing the names in your FASTA (e.g., grep "^>" my_data.FASTA) with the [BS]AM header (e.g., samtools view -H my_data.BAM). This is a WARNING, meaning we're trying to make you aware of this issue. If you are sure you understand this you should be able to proceed. If you are incorrect and proceed anyway, you may be creating problems for yourself downstream. I suggest you take a minute to validate that the input order is identical and if so, proceed with caution.
Does that provide you with a solution? Brian
Hello, I am writing this code:
library(vcfR) library(ape)
input vcf Data S17 RES SNPs
vcf = read.vcfR("~/Desktop/Final_Table/RES_S17_readyAnalysis_SNP.vcf") genome = ape::read.dna("~/Desktop/Resource/hg38.fa", format = "fasta") ref = read.table("~/Desktop/GRCh38_latest_genomic.gff", sep = '\t', quote = "")
creating a ChromeR object
chrom <- create.chromR(name="S17SNP", vcf=vcf, seq=genome, ann=ref, verbose=TRUE) plot(chrom)
I'm getting this error that I can't explain: vcfR object includes more than one chromosome (CHROM) Subsetting to the first chromosome Warning message: In create.chromR(name = "S17SNP", vcf = vcf, seq = genome, ann = ref, : Names in variant data and annotation data do not match perfectly. If you choose to proceed, we'll do our best to match the data. But prepare yourself for unexpected results.
the fasta file is the same one I used to do the alignment and the annotation.
Does anyone have any input?