Closed Truongphikt closed 3 months ago
The af can accept a tab separated text file with five columns with header: chr pos ref alt af
The af can accept a tab separated text file with five columns with header:
chr pos ref alt af
After tried a while when I additional AF file to the repo's test set, I got an error
Error in `[.data.frame`(af, , "chr") : undefined columns selected
Calls: vcfcomp ... tryCatchOne -> <Anonymous> -> paste0 -> [ -> [.data.frame
Execution halted
The command:
rawvcf = "to_the_path/20201028_CCDG_14151_B01_GRM_WGS_2020-08-05_chr21.recalibrated_variants.vcf.gz"
phasedvcf= "to_the_path/1kGP_high_coverage_Illumina.chr21.filtered.SNV_INDEL_SV_phased_panel.vcf.gz"
maf_file= "to_the_path/21_maf_seperate_chr.txt"
res <- vcfcomp(test = rawvcf, truth = phasedvcf, info=FALSE,
stats = "nrc", region = "chr21:1-5100000",
af=maf_file,
formats = c("GT","GT"))
My "af file" is something like (maf_file):
$ cat 21_maf_seperate_chr.txt | head
chr21 5030578 C T 0.0115553
chr21 5030588 T C 0.00827608
chr21 5030596 A G 0.000312305
chr21 5030641 C T 0
chr21 5030673 G A 0.000312305
chr21 5030912 CA C 0.00218613
You need to add a header line. See my comment above.
@Zilong-Li Thanks for your help.
A little note for other people with the same use case as me is that the format of the chr column
is the Prefix Format (chr1, chr2, etc ..). If you use Numerical Format (1, 2, 3, ...), you will meet an error:
Error in tapply(1:length(x), x, function(w) { :
arguments must have same length
Calls: vcfcomp -> concordance_by_freq -> tapply
Execution halted
The idea is that we have to use the consistent format to match the id in af file to VCF file. I'll update the docs at some point.
When I read docs at
vcfcomp
, I see theaf
argument.Could someone give me more info about the "allele frequency text file" and "RDS" format? Maybe It can help me use my "exterior MAF file" to overwrite MAF in true VCF.
My "exterior MAF" format: