k-florek closed this issue 7 years ago
Hello,
I personally don't work with VCF files that have no gt section, so I don't have much experience running them through vcfR. But if you're willing to help me...
This is not actually an issue with reading the VCF data into R. I like to think I have that under control and have unit tests to help with that. You demonstrate above that you successfully read your data into R. The issues you report are in downstream processing with the functions head() and proc.chromR(). Your interpretation of the issue appears correct in that the functions are trying to access portions of the gt region that do not exist in your data.
I have never used lofreq, so could you please validate something for me? You report the following works.
> vcf <- read.vcfR("file.vcf", verbose = TRUE)
Let's validate the state of the gt slot. You should get the following.
vcf@gt
<0 x 0 matrix>
The initial state of the gt slot should be a matrix with zero rows and zero columns. If this is what you are seeing, then I can modify functions to check for this before trying to manipulate that slot, and that should give you functionality. Can you try that and let me know how it went? Thanks!
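For illustration, the kind of check described above might look roughly like the following. This is a minimal base R sketch, not vcfR's actual code: `has_gt()` and `count_samples()` are hypothetical helpers, and the sketch assumes the first column of a populated gt matrix is FORMAT.

```r
# Hypothetical helper (not part of vcfR): TRUE only when the gt matrix has content.
has_gt <- function(gt) {
  is.matrix(gt) && nrow(gt) > 0 && ncol(gt) > 0
}

# A 0 x 0 matrix, the state of the gt slot reported in this thread.
empty_gt <- matrix(nrow = 0, ncol = 0)

# A small populated gt matrix (assumed layout: FORMAT column, then samples).
full_gt <- matrix(c("GT", "GT", "0/1", "1/1"), nrow = 2,
                  dimnames = list(NULL, c("FORMAT", "sample1")))

# Hypothetical downstream function that skips genotype work when gt is empty.
count_samples <- function(gt) {
  if (!has_gt(gt)) {
    return(0L)
  }
  ncol(gt) - 1L  # subtract the FORMAT column
}

count_samples(empty_gt)
count_samples(full_gt)
```

The point is only that the guard runs before any indexing into gt, so a VCF without genotypes takes the early return instead of failing.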
The functionality of vcfR is great so I'll definitely try to help the best I can!
The state of the gt slot is a 0x0 matrix.
vcf@gt
<0 x 0 matrix>
Thanks for validating that. This suggests that I should be able to reproduce the behaviour you reported as follows.
data("vcfR_example")
vcf@gt <- matrix(nrow=0, ncol=0)
I think I've addressed the head() issue in my development version. But I can't reproduce the proc.chromR() issue.
myChrom <- proc.chromR(vcf)
Error: class(x) == "chromR" is not TRUE
This is because one of the first things proc.chromR() does is check the class of the object you gave it and bail out if it isn't chromR. So I think you've left out a step. My guess would be as follows.
data("vcfR_example")
vcf@gt <- matrix(nrow=0, ncol=0)
class(vcf)
myChrom <- create.chromR(vcf, verbose = FALSE)
myChrom <- proc.chromR(myChrom)
Error in x@vcf@gt[x@var.info$mask, , drop = FALSE] :
(subscript) logical subscript too long
In addition: Warning messages:
1: In proc.chromR(myChrom) : seq slot is NULL.
2: In proc.chromR(myChrom) : annotation slot has no rows.
3: In proc.chromR(myChrom) :
seq slot is NULL, chromosome representation not made (seq2rects).
4: In proc.chromR(myChrom) :
seq slot is NULL, chromosome representation not made (seq2rects, chars=n).
This is an error that appears to be associated with the lack of a gt slot, and I'll work on addressing it. But it does not appear to be the same as what you have reported. Could you please review the steps that led you to your error? Can you provide me with an example that will help me reproduce this behaviour on my end? You can either use the data I presented above or send me your own.
Thanks!
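The "logical subscript too long" error above can be reproduced without vcfR, which may help show why an empty gt matrix triggers it. The following is a base R sketch under stated assumptions: `mask` stands in for a per-variant mask like `x@var.info$mask`, and `subset_gt()` is a hypothetical guard, not the fix that went into the package.

```r
# An empty gt matrix, as produced by a VCF with no genotype columns.
gt <- matrix(nrow = 0, ncol = 0)

# A per-variant logical mask: one entry per variant, here three variants.
mask <- c(TRUE, FALSE, TRUE)

# Indexing zero rows with a length-3 logical vector is an error in R.
res <- tryCatch(
  gt[mask, , drop = FALSE],
  error = function(e) conditionMessage(e)
)
print(res)

# Hypothetical guard: return the empty matrix untouched, otherwise subset by mask.
subset_gt <- function(gt, mask) {
  if (nrow(gt) == 0) {
    return(gt)
  }
  gt[mask, , drop = FALSE]
}

dim(subset_gt(gt, mask))
```

The mask has one element per variant, but the gt matrix has no rows at all, so the subscript is longer than the dimension it indexes.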
In my description of the issue I left out the step where I created the chromR object since it didn't return any errors.
The reason our error messages are different is because in the create.chromR() function I am specifying the vcf file, reference sequence, and annotations. Here is my full script. I changed the name of the variables when I submitted my initial ticket to reduce confusion.
library(ape)
library(vcfR)
mag_pure <- read.vcfR("alignments/PNUSAS024723.vcf",verbose = TRUE)
hand <- read.vcfR("alignments/PNUSAS024723H.vcf",verbose = TRUE)
reference <- read.dna("senterica_thompson_atcc8391.fna",format = "fasta")
annotations <- read.table("senterica_thompson_atcc8391.gff",sep="\t",quote="")
chrom_mag_pure <- create.chromR(name="Magna_Pure", vcf=mag_pure,seq=reference,ann=annotations)
chrom_hand <- create.chromR(name="Hand_purification",vcf=hand,seq=reference,ann=annotations)
#look at variants
plot(chrom_mag_pure)
plot(chrom_hand)
#process chromR object
chrom_mag_pure <- proc.chromR(chrom_mag_pure, verbose = TRUE)
chrom_hand <- proc.chromR(chrom_hand,verbose = TRUE)
chromoqc(chrom_mag_pure, dp.alpha=20)
chromoqc(chrom_hand, dp.alpha=20)
Hopefully this will help! I've attached one of the vcf files in case it's helpful. PNUSAS024723.vcf.zip
I think that's just what I needed! I now have the following.
library(vcfR)
data("vcfR_example")
vcf@gt <- matrix(nrow=0, ncol=0)
myChrom <- create.chromR(vcf=vcf, seq = dna, ann = gff, verbose = FALSE)
myChrom <- proc.chromR(myChrom)
And it now throws no errors. In hindsight, the error you reported and the one I generated appear to be the same. The difference in output was due to the lack of seq and ann.
This now appears to work and plots a chromoqc() as well.
These changes are now on the master branch at GitHub. If you'd like to try it you can use the following.
devtools::install_github(repo="knausb/vcfR")
To be honest, when I started this project I did not realize you could have a valid VCF that had no gt. But this is supported by the VCF specification, so I feel obligated to support it. If you notice anything else please let me know.
Thanks!
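For anyone unsure what a VCF with no gt looks like, here is a small made-up example. It is a sketch only: the two records are invented, the file is written to a temporary path, and it is parsed with base R `read.table()` rather than vcfR so that it runs without extra packages.

```r
# A minimal sites-only VCF: the eight fixed columns, no FORMAT or sample columns.
# The records below are invented for illustration.
vcf_lines <- c(
  "##fileformat=VCFv4.2",
  "##INFO=<ID=DP,Number=1,Type=Integer,Description=\"Raw depth\">",
  "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
  "chr1\t100\t.\tA\tG\t50\tPASS\tDP=20",
  "chr1\t250\t.\tC\tT\t99\tPASS\tDP=35"
)
vcf_path <- tempfile(fileext = ".vcf")
writeLines(vcf_lines, vcf_path)

# Skip the meta and header lines (they start with '#') and read the records as text.
records <- read.table(vcf_path, sep = "\t", comment.char = "#", quote = "",
                      colClasses = "character")

# Columns beyond the eight fixed ones would be FORMAT plus one column per sample.
n_samples <- max(ncol(records) - 9L, 0L)

ncol(records)
n_samples
```

With only the eight fixed columns present there is nothing to put in a gt matrix, which is consistent with the 0 x 0 gt slot seen earlier in this thread.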
Works beautifully! Thanks a bunch, I'll let you know if anything else breaks.
I'm trying to use vcfR to read in a VCF file generated by lofreq and I seem to be having issues when trying to use the resulting vcfR object. I think it might be related to the missing gt region in the VCF from lofreq, but I'm not sure.
The output of the class is:
and if I try to look at the header:
similarly when I try to run proc.chromR I get another error:
here is my sessionInfo: