Closed mattbareno closed 1 year ago
You could try ?vcfR::maf()
to get a vector of frequencies and filter on that. Does that accomplish your goal?
In order to filter using the maf function, i had to convert the maf(chrom) to a dataframe, but how do i replace the values of the original object with this filtered dataframe?
For example, to attempt to filter based on 3 conditions (minimum read depth, mapping quality, and frequency) i used the following code:
But i cannot figure out this last part.
p.s. im relatively new to R so sorry for obvious things i may be confusing
Best,
Hi, when in doubt, simplify. Try one thing at a time. How about the following for MAF?
library(vcfR)
#>
#> ***** *** vcfR *** *****
#> This is vcfR 1.14.0
#> browseVignettes('vcfR') # Documentation
#> citation('vcfR') # Citation
#> ***** ***** ***** *****
data(vcfR_example)
vcf
#> ***** Object of Class vcfR *****
#> 18 samples
#> 1 CHROMs
#> 2,533 variants
#> Object size: 3.2 Mb
#> 8.497 percent missing data
#> ***** ***** *****
my_maf <- maf(vcf)
my_maf[1:3, ]
#> nAllele NA Count Frequency
#> Supercontig_1.50_2 24 6 1 0.04166667
#> Supercontig_1.50_246 22 7 5 0.22727273
#> Supercontig_1.50_549 6 15 2 0.33333333
vcf <- vcf[my_maf[, 'Frequency'] > 0.014, ]
vcf
#> ***** Object of Class vcfR *****
#> 18 samples
#> 1 CHROMs
#> 2,511 variants
#> Object size: 3.2 Mb
#> 8 percent missing data
#> ***** ***** *****
Created on 2023-03-21 with reprex v2.0.2
For DP and MQ I'd suggest something like the following.
https://knausb.github.io/vcfR_documentation/censoring_data.html
This will allow you to censor individual genotypes where the masker function works on row (variant)-wise summaries. I prefer working on the individual genotypes.
Does this provide you with a solution?
Yes thank you so much!
Hey! I was curious if there was an option to filter for minor allele frequency. Using the masker function i successfully filtered for a read depth and mapping quality threshold. The manual doesn't say anything about a maf option, even though a publication claimed to use vcfR for maf filtering (https://apsjournals.apsnet.org/doi/10.1094/MPMI-05-19-0131-R ; see methods).
Could you help me through this? Thank you,