xihaoli / STAARpipeline-Tutorial

The tutorial for performing single-/multi-trait association analysis of whole-genome/whole-exome sequencing (WGS/WES) studies using FAVORannotator, STAARpipeline and STAARpipelineSummary
GNU General Public License v3.0

Errors when running Sliding_Window() #6

Closed alohasiqi closed 1 year ago

alohasiqi commented 1 year ago

Hello,

I'm testing 2k variant sites among 1373 individuals using Sliding_Window(). I first set sliding_window_length to 6k; it seems to run twice but reports different numbers of samples and variants each time, then stops with a "rare variant number less than 2" error (see below). I then set sliding_window_length to 2k, and this time it throws the error "Error in if (sum(is.in) >= 2) { : missing value where TRUE/FALSE needed".

Why does it return different errors for different window lengths? Why didn't it read 2000 variants and 1373 individuals initially in my first test? Is it a formatting issue with genofile? What do you think? Thanks

Sliding_Window(chr=3,start_loc=11919,end_loc=1239497,sliding_window_length=6000, genofile=genofile,obj_nullmodel=obj_nullmodel)
# of selected samples: 1,372
# of selected variants: 0
# of selected samples: 1,373
# of selected variants: 2,000
Error in Sliding_Window_Single(chr = chr, start_loc = start_loc, end_loc = end_loc, : Number of rare variant in the set is less than 2!

Sliding_Window(chr=3,start_loc=11919,end_loc=1239497,sliding_window_length=2000, genofile=genofile,obj_nullmodel=obj_nullmodel)
# of selected samples: 1,372
# of selected variants: 0
Error in if (sum(is.in) >= 2) { : missing value where TRUE/FALSE needed
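
For readers who hit the second message: in base R, `if` raises "missing value where TRUE/FALSE needed" whenever its condition evaluates to `NA`. The sketch below reproduces that behaviour in isolation. The vector `is.in` here is a made-up stand-in; how STAARpipeline actually computes it is not shown in this thread.

```r
# Hypothetical stand-in for the logical vector named in the error message.
# A single NA (for example, from a comparison against a missing value) is enough.
is.in <- c(TRUE, TRUE, NA)

# sum() propagates NA unless na.rm = TRUE is given.
total <- sum(is.in)
print(total)

# if () cannot branch on NA, so R stops with the message seen above.
# tryCatch() captures the message instead of aborting the script.
result <- tryCatch(
  {
    if (total >= 2) "enough variants" else "too few variants"
  },
  error = function(e) conditionMessage(e)
)
print(result)

# Dropping NAs first gives a usable count.
print(sum(is.in, na.rm = TRUE))
```

This only shows the general mechanism; it does not establish where the `NA` came from in the run above.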

xihaoli commented 1 year ago

Hi,

We have not encountered this issue before, so we are wondering if this issue is related to the GDS file. Could you please paste the information when you run the command genofile in R?

Best, Xihao

alohasiqi commented 1 year ago

Here it is. Let me know if you need other information. Thanks!

genofile
Object of class "SeqVarGDSClass"
File: 1374pt_merged_chr3_maffilt.gds (413.7K)

xihaoli commented 1 year ago

Hi @alohasiqi,

Thanks for sharing the information about your AGDS file. One thing I noticed is that the second dimension of the genotype data (2000 in your case) should be the same as the length of each functional annotation (1993 in your case). The order of variants in the genotype data and the functional annotations should also be aligned. Given that the dimensions are mismatched, could you please double-check how you generated this AGDS file? FAVORannotator provides a workflow to automatically annotate the GDS file and generate an AGDS file with aligned genotype and functional annotations.
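
The length comparison described here can be sketched in base R. The function name `find_length_mismatches` and the annotation names in the example are hypothetical; the counts 2000 and 1993 are the ones quoted in this reply. With a real file, the lengths would first have to be read from the GDS nodes (for instance, `length(SeqArray::seqGetData(genofile, "position"))` for the variant count).

```r
# Return the names of annotation vectors whose length differs from the
# number of variants. `annotation_lengths` is a named integer vector.
find_length_mismatches <- function(n_variants, annotation_lengths) {
  names(annotation_lengths)[annotation_lengths != n_variants]
}

# Counts quoted in this reply; the annotation names are placeholders.
n_variants <- 2000L
annotation_lengths <- c(annotation_a = 1993L, annotation_b = 1993L)

mismatched <- find_length_mismatches(n_variants, annotation_lengths)
if (length(mismatched) > 0) {
  message("Length differs from the variant count for: ",
          paste(mismatched, collapse = ", "))
}
```

Equal lengths alone do not show that the variants are in the same order, which this reply also asks for; that needs a separate comparison of the variant identifiers.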

Best, Xihao

alohasiqi commented 1 year ago

Hi Xihao,

Thanks for the instructions! I have double-checked my input files and corrected them but still have

Error in Sliding_Window_Multiple(chr = chr, start_loc = start_loc, end_loc = end_loc, : Number of rare variant in the set is less than 2!

when running

results <- try(Sliding_Window(chr="chr22",start_loc=start_loc_sub,end_loc=end_loc_sub,
                                  sliding_window_length=sliding_window_length,type="multiple",
                                  genofile=genofile,obj_nullmodel=obj_nullmodel,
                                  rare_maf_cutoff=1,rv_num_cutoff=0, Annotation_name=Annotation_name))
# of selected samples: 178
# of selected variants: 0
# of selected samples: 178
# of selected variants: 13,341

Can you help me figure out if I need to pay attention to anything else? I'm also not sure if this has to do with the Annotation_name_catalog and Annotation_name in my sliding_window arguments as I only put a subset of annotations from the FAVORannotator. Thanks!

Here is my gds file for your reference.

genofile
Object of class "SeqVarGDSClass"
File: /illumina_chr22_hg38_liftover.gds (16.0M)
+    [  ] *
|--+ description   [  ] *
|--+ sample.id   { Str8 178 LZMA_ra(34.1%), 493B } *
|--+ variant.id   { Int32 13341 LZMA_ra(9.18%), 4.8K } *
|--+ position   { Int32 13341 LZMA_ra(42.4%), 22.1K } *
|--+ chromosome   { Str8 13341 LZMA_ra(0.45%), 189B } *
|--+ allele   { Str8 13341 LZMA_ra(19.6%), 12.0K } *
|--+ genotype   [  ] *
|  |--+ data   { Bit2 2x178x13473 LZMA_ra(10.7%), 125.3K } *
|  |--+ extra.index   { Int32 3x0 LZMA_ra, 18B } *
|  \--+ extra   { Int16 0 LZMA_ra, 18B }
|--+ phase   [  ]
|  |--+ data   { Bit1 178x13341 LZMA_ra(0.06%), 197B } *
|  |--+ extra.index   { Int32 3x0 LZMA_ra, 18B } *
|  \--+ extra   { Bit1 0 LZMA_ra, 18B }
|--+ annotation   [  ]
|  |--+ id   { Str8 13341 LZMA_ra(0.97%), 137B } *
|  |--+ qual   { Float32 13341 LZMA_ra(76.0%), 39.6K } *
|  |--+ filter   { Int32,factor 13341 LZMA_ra(0.30%), 165B } *
|  |--+ info   [  ]
|  |  |--+ AC   { Int32 14055 LZMA_ra(18.0%), 9.9K } *
|  |  |--+ AF   { Float32 14055 LZMA_ra(24.4%), 13.4K } *
|  |  |--+ AN   { Int32 13341 LZMA_ra(2.90%), 1.5K } *
|  |  |--+ BaseQRankSum   { Float32 13341 LZMA_ra(46.5%), 24.2K } *
|  |  |--+ ClippingRankSum   { Float32 13341 LZMA_ra(0.60%), 329B } *
|  |  |--+ DP   { Int32 13341 LZMA_ra(45.9%), 23.9K } *
|  |  |--+ DS   { Bit1 13341 LZMA_ra(5.64%), 101B } *
|  |  |--+ END   { Int32 13341 LZMA_ra(0.30%), 165B } *
|  |  |--+ ExcessHet   { Float32 13341 LZMA_ra(32.5%), 17.0K } *
|  |  |--+ FS   { Float32 13341 LZMA_ra(42.5%), 22.2K } *
|  |  |--+ HaplotypeScore   { Float32 13341 LZMA_ra(0.30%), 165B } *
|  |  |--+ InbreedingCoeff   { Float32 13341 LZMA_ra(33.4%), 17.4K } *
|  |  |--+ MLEAC   { Int32 14055 LZMA_ra(18.2%), 10.0K } *
|  |  |--+ MLEAF   { Float32 14055 LZMA_ra(24.3%), 13.4K } *
|  |  |--+ MQ   { Float32 13341 LZMA_ra(15.2%), 7.9K } *
|  |  |--+ MQRankSum   { Float32 13341 LZMA_ra(13.1%), 6.8K } *
|  |  |--+ NEGATIVE_TRAIN_SITE   { Bit1 13341 LZMA_ra(38.2%), 645B } *
|  |  |--+ POSITIVE_TRAIN_SITE   { Bit1 13341 LZMA_ra(98.9%), 1.6K } *
|  |  |--+ QD   { Float32 13341 LZMA_ra(42.3%), 22.0K } *
|  |  |--+ RAW_MQ   { Float32 13341 LZMA_ra(0.30%), 165B } *
|  |  |--+ ReadPosRankSum   { Float32 13341 LZMA_ra(45.7%), 23.8K } *
|  |  |--+ ReverseComplementedAlleles   { Bit1 13341 LZMA_ra(5.88%), 105B } *
|  |  |--+ SOR   { Float32 13341 LZMA_ra(41.3%), 21.5K } *
|  |  |--+ SwappedAlleles   { Bit1 13341 LZMA_ra(5.64%), 101B } *
|  |  |--+ VQSLOD   { Float32 13341 LZMA_ra(46.2%), 24.1K } *
|  |  |--+ culprit   { Str8 13341 LZMA_ra(1.07%), 1.3K } *
|  |  \--+ FunctionalAnnotation   [  ]
|  |     \--+ FAVORannotator   [ data.frame ] *
|  |        |--+ CHR   { Str8 13341 LZMA_ra(0.45%), 189B }
|  |        |--+ POS   { Str8 13341 LZMA_ra(19.7%), 23.2K }
|  |        |--+ REF   { Str8 13341 LZMA_ra(22.0%), 6.5K }
|  |        |--+ ALT   { Str8 13341 LZMA_ra(22.4%), 7.1K }
|  |        |--+ vid   { Float64 13341 LZMA_ra(27.6%), 28.8K }
|  |        |--+ variant_vcf   { Str8 13341 LZMA_ra(19.5%), 39.8K }
|  |        |--+ variant_annovar   { Str8 13341 LZMA_ra(14.8%), 46.8K }
|  |        |--+ start_position   { Str8 13341 LZMA_ra(20.5%), 23.2K }
|  |        |--+ end_position   { Str8 13341 LZMA_ra(20.5%), 23.2K }
|  |        |--+ ref_annovar   { Str8 13341 LZMA_ra(21.8%), 6.0K }
|  |        |--+ alt_annovar   { Str8 13341 LZMA_ra(21.2%), 5.6K }
|  |        |--+ aloft_value   { Str8 13341 LZMA_ra(10.0%), 1.5K }
|  |        |--+ aloft_description   { Str8 13341 LZMA_ra(4.40%), 705B }
|  |        |--+ apc_conservation   { Float64 13341 LZMA_ra(85.9%), 89.6K }
|  |        |--+ apc_conservation_v2   { Float64 13341 LZMA_ra(85.9%), 89.5K }
|  |        |--+ apc_epigenetics_active   { Float64 13341 LZMA_ra(76.1%), 79.4K }
|  |        |--+ apc_epigenetics   { Float64 13341 LZMA_ra(84.6%), 88.2K }
|  |        |--+ apc_epigenetics_repressed   { Float64 13341 LZMA_ra(62.4%), 65.0K }
|  |        |--+ apc_epigenetics_transcription   { Float64 13341 LZMA_ra(66.6%), 69.5K }
|  |        |--+ apc_local_nucleotide_diversity   { Float64 13341 LZMA_ra(9.70%), 10.1K }
|  |        |--+ apc_local_nucleotide_diversity_v2   { Float64 13341 LZMA_ra(82.9%), 86.4K }
|  |        |--+ apc_local_nucleotide_diversity_v3   { Float64 13341 LZMA_ra(83.5%), 87.0K }
|  |        |--+ apc_mappability   { Float64 13341 LZMA_ra(24.2%), 25.3K }
|  |        |--+ apc_micro_rna   { Float64 13341 LZMA_ra(10.2%), 10.7K }
|  |        |--+ apc_mutation_density   { Float64 13341 LZMA_ra(82.9%), 86.4K }
|  |        |--+ apc_protein_function   { Float64 13341 LZMA_ra(18.0%), 18.7K }
|  |        |--+ apc_protein_function_v2   { Float64 13341 LZMA_ra(18.2%), 19.0K }
|  |        |--+ apc_protein_function_v3   { Float64 13341 LZMA_ra(18.1%), 18.9K }
|  |        |--+ apc_proximity_to_coding   { Float64 13341 LZMA_ra(46.7%), 48.7K }
|  |        |--+ apc_proximity_to_coding_v2   { Float64 13341 LZMA_ra(37.6%), 39.2K }
|  |        |--+ apc_proximity_to_tsstes   { Float64 13341 LZMA_ra(81.8%), 85.3K }
|  |        |--+ apc_transcription_factor   { Float64 13341 LZMA_ra(19.9%), 20.7K }
|  |        |--+ bravo_an   { Float64 13341 LZMA_ra(1.67%), 1.7K }
|  |        |--+ bravo_af   { Float64 13341 LZMA_ra(56.3%), 58.6K }
|  |        |--+ filter_status   { Str8 13341 LZMA_ra(4.81%), 3.2K }
|  |        |--+ cage_enhancer   { Str8 13341 LZMA_ra(1.20%), 169B }
|  |        |--+ cage_promoter   { Str8 13341 LZMA_ra(15.8%), 5.1K }
|  |        |--+ cage_tc   { Str8 13341 LZMA_ra(16.8%), 9.9K }
|  |        |--+ clnsig   { Str8 13341 LZMA_ra(6.51%), 2.1K }
|  |        |--+ clnsigincl   { Str8 13341 LZMA_ra(1.39%), 193B }
|  |        |--+ clndn   { Str8 13341 LZMA_ra(8.20%), 5.8K }
|  |        |--+ clndnincl   { Str8 13341 LZMA_ra(1.42%), 197B }
|  |        |--+ clnrevstat   { Str8 13341 LZMA_ra(2.89%), 2.5K }
|  |        |--+ origin   { Str8 13341 LZMA_ra(8.17%), 1.2K }
|  |        |--+ clndisdb   { Str8 13341 LZMA_ra(5.46%), 6.0K }
|  |        |--+ clndisdbincl   { Str8 13341 LZMA_ra(1.59%), 221B }
|  |        |--+ geneinfo   { Str8 13341 LZMA_ra(10.2%), 3.5K }
|  |        |--+ polyphen2_hdiv_score   { Float64 13341 LZMA_ra(5.63%), 5.9K }
|  |        |--+ polyphen2_hvar_score   { Float64 13341 LZMA_ra(5.88%), 6.1K }
|  |        |--+ mutation_taster_score   { Float64 13341 LZMA_ra(4.18%), 4.4K }
|  |        |--+ mutation_assessor_score   { Float64 13341 LZMA_ra(5.67%), 5.9K }
|  |        |--+ metasvm_pred   { Str8 13341 LZMA_ra(11.0%), 1.7K }
|  |        |--+ fathmm_xf   { Float64 13341 LZMA_ra(81.7%), 85.1K }
|  |        |--+ funseq_value   { Str8 13341 LZMA_ra(18.2%), 3.1K }
|  |        |--+ funseq_description   { Str8 13341 LZMA_ra(3.73%), 4.0K }
|  |        |--+ genecode_comprehensive_category   { Str8 13341 LZMA_ra(4.74%), 5.2K }
|  |        |--+ genecode_comprehensive_info   { Str8 13341 LZMA_ra(12.8%), 23.4K }
|  |        |--+ genecode_comprehensive_exonic_category   { Str8 13341 LZMA_ra(5.14%), 4.0K }
|  |        |--+ genecode_comprehensive_exonic_info   { Str8 13341 LZMA_ra(13.6%), 75.9K }
|  |        |--+ genehancer   { Str8 13341 LZMA_ra(2.75%), 34.8K }
|  |        |--+ af_total   { Float64 13341 LZMA_ra(79.6%), 82.9K }
|  |        |--+ af_asj_female   { Float64 13341 LZMA_ra(30.2%), 31.5K }
|  |        |--+ af_eas_female   { Float64 13341 LZMA_ra(26.0%), 27.1K }
|  |        |--+ af_afr_male   { Float64 13341 LZMA_ra(59.1%), 61.6K }
|  |        |--+ af_female   { Float64 13341 LZMA_ra(74.8%), 78.0K }
|  |        |--+ af_fin_male   { Float64 13341 LZMA_ra(48.8%), 50.8K }
|  |        |--+ af_oth_female   { Float64 13341 LZMA_ra(31.2%), 32.5K }
|  |        |--+ af_ami   { Float64 13341 LZMA_ra(22.9%), 23.9K }
|  |        |--+ af_oth   { Float64 13341 LZMA_ra(38.9%), 40.6K }
|  |        |--+ af_male   { Float64 13341 LZMA_ra(75.1%), 78.3K }
|  |        |--+ af_ami_female   { Float64 13341 LZMA_ra(19.1%), 20.0K }
|  |        |--+ af_afr   { Float64 13341 LZMA_ra(66.2%), 69.0K }
|  |        |--+ af_eas_male   { Float64 13341 LZMA_ra(27.3%), 28.5K }
|  |        |--+ af_sas   { Float64 13341 LZMA_ra(41.5%), 43.2K }
|  |        |--+ af_nfe_female   { Float64 13341 LZMA_ra(64.2%), 67.0K }
|  |        |--+ af_asj_male   { Float64 13341 LZMA_ra(29.6%), 30.9K }
|  |        |--+ af_raw   { Float64 13341 LZMA_ra(75.7%), 78.9K }
|  |        |--+ af_oth_male   { Float64 13341 LZMA_ra(31.4%), 32.8K }
|  |        |--+ af_nfe_male   { Float64 13341 LZMA_ra(62.3%), 64.9K }
|  |        |--+ af_asj   { Float64 13341 LZMA_ra(37.1%), 38.6K }
|  |        |--+ af_amr_male   { Float64 13341 LZMA_ra(51.5%), 53.7K }
|  |        |--+ af_amr_female   { Float64 13341 LZMA_ra(49.0%), 51.0K }
|  |        |--+ af_sas_female   { Float64 13341 LZMA_ra(24.0%), 25.0K }
|  |        |--+ af_fin   { Float64 13341 LZMA_ra(50.9%), 53.1K }
|  |        |--+ af_afr_female   { Float64 13341 LZMA_ra(60.5%), 63.1K }
|  |        |--+ af_sas_male   { Float64 13341 LZMA_ra(39.2%), 40.8K }
|  |        |--+ af_amr   { Float64 13341 LZMA_ra(57.9%), 60.3K }
|  |        |--+ af_nfe   { Float64 13341 LZMA_ra(69.6%), 72.6K }
|  |        |--+ af_eas   { Float64 13341 LZMA_ra(31.7%), 33.1K }
|  |        |--+ af_ami_male   { Float64 13341 LZMA_ra(18.6%), 19.4K }
|  |        |--+ af_fin_female   { Float64 13341 LZMA_ra(34.9%), 36.4K }
|  |        |--+ linsight   { Float64 13341 LZMA_ra(39.9%), 41.6K }
|  |        |--+ gc   { Float64 13341 LZMA_ra(12.4%), 13.0K }
|  |        |--+ cpg   { Float64 13341 LZMA_ra(7.96%), 8.3K }
|  |        |--+ min_dist_tss   { Float64 13341 LZMA_ra(23.8%), 24.8K }
|  |        |--+ min_dist_tse   { Float64 13341 LZMA_ra(23.2%), 24.2K }
|  |        |--+ sift_cat   { Str8 13341 LZMA_ra(6.30%), 2.1K }
|  |        |--+ sift_val   { Float64 13341 LZMA_ra(4.44%), 4.6K }
|  |        |--+ polyphen_cat   { Str8 13341 LZMA_ra(6.91%), 2.3K }
|  |        |--+ polyphen_val   { Float64 13341 LZMA_ra(5.98%), 6.2K }
|  |        |--+ priphcons   { Float64 13341 LZMA_ra(17.3%), 18.0K }
|  |        |--+ mamphcons   { Float64 13341 LZMA_ra(11.5%), 12.0K }
|  |        |--+ verphcons   { Float64 13341 LZMA_ra(10.3%), 10.8K }
|  |        |--+ priphylop   { Float64 13341 LZMA_ra(17.5%), 18.3K }
|  |        |--+ mamphylop   { Float64 13341 LZMA_ra(24.7%), 25.7K }
|  |        |--+ verphylop   { Float64 13341 LZMA_ra(25.9%), 27.0K }
|  |        |--+ bstatistic   { Float64 13341 LZMA_ra(4.94%), 5.2K }
|  |        |--+ chmm_e1   { Float64 13341 LZMA_ra(2.29%), 2.4K }
|  |        |--+ chmm_e2   { Float64 13341 LZMA_ra(2.07%), 2.2K }
|  |        |--+ chmm_e3   { Float64 13341 LZMA_ra(2.45%), 2.6K }
|  |        |--+ chmm_e4   { Float64 13341 LZMA_ra(3.41%), 3.6K }
|  |        |--+ chmm_e5   { Float64 13341 LZMA_ra(2.23%), 2.3K }
|  |        |--+ chmm_e6   { Float64 13341 LZMA_ra(3.09%), 3.2K }
|  |        |--+ chmm_e7   { Float64 13341 LZMA_ra(5.49%), 5.7K }
|  |        |--+ chmm_e8   { Float64 13341 LZMA_ra(5.15%), 5.4K }
|  |        |--+ chmm_e9   { Float64 13341 LZMA_ra(3.03%), 3.2K }
|  |        |--+ chmm_e10   { Float64 13341 LZMA_ra(3.51%), 3.7K }
|  |        |--+ chmm_e11   { Float64 13341 LZMA_ra(3.54%), 3.7K }
|  |        |--+ chmm_e12   { Float64 13341 LZMA_ra(3.65%), 3.8K }
|  |        |--+ chmm_e13   { Float64 13341 LZMA_ra(2.63%), 2.7K }
|  |        |--+ chmm_e14   { Float64 13341 LZMA_ra(2.98%), 3.1K }
|  |        |--+ chmm_e15   { Float64 13341 LZMA_ra(5.83%), 6.1K }
|  |        |--+ chmm_e16   { Float64 13341 LZMA_ra(2.51%), 2.6K }
|  |        |--+ chmm_e17   { Float64 13341 LZMA_ra(2.65%), 2.8K }
|  |        |--+ chmm_e18   { Float64 13341 LZMA_ra(2.60%), 2.7K }
|  |        |--+ chmm_e19   { Float64 13341 LZMA_ra(2.84%), 3.0K }
|  |        |--+ chmm_e20   { Float64 13341 LZMA_ra(2.48%), 2.6K }
|  |        |--+ chmm_e21   { Float64 13341 LZMA_ra(3.88%), 4.1K }
|  |        |--+ chmm_e22   { Float64 13341 LZMA_ra(3.39%), 3.5K }
|  |        |--+ chmm_e23   { Float64 13341 LZMA_ra(3.06%), 3.2K }
|  |        |--+ chmm_e24   { Float64 13341 LZMA_ra(3.41%), 3.6K }
|  |        |--+ chmm_e25   { Float64 13341 LZMA_ra(2.20%), 2.3K }
|  |        |--+ gerp_rs   { Float64 13341 LZMA_ra(10.3%), 10.8K }
|  |        |--+ gerp_rs_pval   { Float64 13341 LZMA_ra(17.6%), 18.4K }
|  |        |--+ gerp_n   { Float64 13341 LZMA_ra(18.6%), 19.4K }
|  |        |--+ gerp_s   { Float64 13341 LZMA_ra(23.8%), 24.8K }
|  |        |--+ encodeh3k4me1_sum   { Float64 13341 LZMA_ra(20.3%), 21.1K }
|  |        |--+ encodeh3k4me2_sum   { Float64 13341 LZMA_ra(20.6%), 21.5K }
|  |        |--+ encodeh3k4me3_sum   { Float64 13341 LZMA_ra(20.1%), 21.0K }
|  |        |--+ encodeh3k9ac_sum   { Float64 13341 LZMA_ra(20.2%), 21.1K }
|  |        |--+ encodeh3k9me3_sum   { Float64 13341 LZMA_ra(18.3%), 19.0K }
|  |        |--+ encodeh3k27ac_sum   { Float64 13341 LZMA_ra(20.6%), 21.5K }
|  |        |--+ encodeh3k27me3_sum   { Float64 13341 LZMA_ra(19.8%), 20.6K }
|  |        |--+ encodeh3k36me3_sum   { Float64 13341 LZMA_ra(21.7%), 22.6K }
|  |        |--+ encodeh3k79me2_sum   { Float64 13341 LZMA_ra(20.5%), 21.3K }
|  |        |--+ encodeh4k20me1_sum   { Float64 13341 LZMA_ra(20.1%), 20.9K }
|  |        |--+ encodeh2afz_sum   { Float64 13341 LZMA_ra(20.0%), 20.8K }
|  |        |--+ encode_dnase_sum   { Float64 13341 LZMA_ra(14.1%), 14.7K }
|  |        |--+ encodetotal_rna_sum   { Float64 13341 LZMA_ra(14.9%), 15.5K }
|  |        |--+ grantham   { Float64 13341 LZMA_ra(4.23%), 4.4K }
|  |        |--+ freq100bp   { Float64 13341 LZMA_ra(3.54%), 3.7K }
|  |        |--+ rare100bp   { Float64 13341 LZMA_ra(4.67%), 4.9K }
|  |        |--+ sngl100bp   { Float64 13341 LZMA_ra(8.27%), 8.6K }
|  |        |--+ freq1000bp   { Float64 13341 LZMA_ra(5.01%), 5.2K }
|  |        |--+ rare1000bp   { Float64 13341 LZMA_ra(6.19%), 6.5K }
|  |        |--+ sngl1000bp   { Float64 13341 LZMA_ra(11.9%), 12.4K }
|  |        |--+ freq10000bp   { Float64 13341 LZMA_ra(6.41%), 6.7K }
|  |        |--+ rare10000bp   { Float64 13341 LZMA_ra(7.97%), 8.3K }
|  |        |--+ sngl10000bp   { Float64 13341 LZMA_ra(13.0%), 13.6K }
|  |        |--+ remap_overlap_tf   { Float64 13341 LZMA_ra(8.31%), 8.7K }
|  |        |--+ remap_overlap_cl   { Float64 13341 LZMA_ra(8.97%), 9.4K }
|  |        |--+ cadd_rawscore   { Float64 13341 LZMA_ra(68.9%), 71.8K }
|  |        |--+ cadd_phred   { Float64 13341 LZMA_ra(26.8%), 28.0K }
|  |        |--+ k24_bismap   { Float64 13341 LZMA_ra(8.56%), 8.9K }
|  |        |--+ k24_umap   { Float64 13341 LZMA_ra(3.47%), 3.6K }
|  |        |--+ k36_bismap   { Float64 13341 LZMA_ra(4.00%), 4.2K }
|  |        |--+ k36_umap   { Float64 13341 LZMA_ra(3.30%), 3.5K }
|  |        |--+ k50_bismap   { Float64 13341 LZMA_ra(3.38%), 3.5K }
|  |        |--+ k50_umap   { Float64 13341 LZMA_ra(3.13%), 3.3K }
|  |        |--+ k100_bismap   { Float64 13341 LZMA_ra(3.24%), 3.4K }
|  |        |--+ k100_umap   { Float64 13341 LZMA_ra(3.00%), 3.1K }
|  |        |--+ nucdiv   { Float64 13341 LZMA_ra(8.25%), 8.6K }
|  |        |--+ rdhs   { Str8 13341 LZMA_ra(9.06%), 5.7K }
|  |        |--+ recombination_rate   { Float64 13341 LZMA_ra(20.1%), 21.0K }
|  |        |--+ refseq_category   { Str8 13341 LZMA_ra(0.97%), 137B }
|  |        |--+ refseq_info   { Str8 13341 LZMA_ra(0.97%), 137B }
|  |        |--+ refseq_exonic_category   { Str8 13341 LZMA_ra(5.13%), 3.7K }
|  |        |--+ refseq_exonic_info   { Str8 13341 LZMA_ra(15.0%), 60.6K }
|  |        |--+ super_enhancer   { Str8 13341 LZMA_ra(3.38%), 8.6K }
|  |        |--+ tg_afr   { Float64 13341 LZMA_ra(14.0%), 14.6K }
|  |        |--+ tg_all   { Float64 13341 LZMA_ra(21.2%), 22.1K }
|  |        |--+ tg_amr   { Float64 13341 LZMA_ra(13.0%), 13.6K }
|  |        |--+ tg_eas   { Float64 13341 LZMA_ra(10.0%), 10.5K }
|  |        |--+ tg_eur   { Float64 13341 LZMA_ra(13.8%), 14.4K }
|  |        |--+ tg_sas   { Float64 13341 LZMA_ra(13.0%), 13.5K }
|  |        |--+ ucsc_category   { Str8 13341 LZMA_ra(4.49%), 5.9K }
|  |        |--+ ucsc_info   { Str8 13341 LZMA_ra(5.71%), 34.1K }
|  |        |--+ ucsc_exonic_category   { Str8 13341 LZMA_ra(5.14%), 4.0K }
|  |        \--+ ucsc_exonic_info   { Str8 13341 LZMA_ra(11.0%), 76.0K }
|  \--+ format   [  ]
|     |--+ AD   [  ] *
|     |  \--+ data   { VL_Int 178x27396 LZMA_ra(31.0%), 1.7M } *
|     |--+ DP   [  ] *
|     |  \--+ data   { VL_Int 178x13341 LZMA_ra(41.7%), 1.2M } *
|     |--+ GQ   [  ] *
|     |  \--+ data   { VL_Int 178x13341 LZMA_ra(8.35%), 371.1K } *
|     |--+ MIN_DP   [  ] *
|     |  \--+ data   { VL_Int 178x0 LZMA_ra, 18B } *
|     |--+ PGT   [  ] *
|     |  \--+ data   { Str8 178x3083 LZMA_ra(3.98%), 27.3K } *
|     |--+ PID   [  ] *
|     |  \--+ data   { Str8 178x3083 LZMA_ra(3.91%), 45.8K } *
|     |--+ PL   [  ] *
|     |  \--+ data   { VL_Int 178x42637 LZMA_ra(31.0%), 3.7M } *
|     |--+ RGQ   [  ] *
|     |  \--+ data   { VL_Int 178x0 LZMA_ra, 48B } *
|     \--+ SB   [  ] *
|        \--+ data   { VL_Int 178x0 LZMA_ra, 131B } *
\--+ sample.annotation   [  ]

xihaoli commented 1 year ago

Hi @alohasiqi,

Thanks for letting me know. I can confirm that these messages are not errors in your script. They indicate that certain sliding windows do not contain at least 2 rare variants and so cannot form a variant set. Please feel free to ignore these messages.
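
To see why some windows produce this message, here is a minimal base R sketch that splits a region into consecutive, non-overlapping windows and counts the variants in each. It is an illustration only: the positions are invented, `count_per_window` is a hypothetical helper, and STAARpipeline's own windowing (including any overlap between windows and its rare-variant filter) is not reproduced here.

```r
# Count how many variant positions fall into each consecutive window of
# `window_length` base pairs between start_loc and end_loc (inclusive).
count_per_window <- function(positions, start_loc, end_loc, window_length) {
  starts <- seq(start_loc, end_loc, by = window_length)
  ends <- pmin(starts + window_length - 1, end_loc)
  counts <- vapply(
    seq_along(starts),
    function(i) sum(positions >= starts[i] & positions <= ends[i]),
    integer(1)
  )
  data.frame(start = starts, end = ends, n_variants = counts)
}

# Invented positions: several in the first window, sparse afterwards.
positions <- c(1200, 1850, 2900, 9500)
windows <- count_per_window(positions, start_loc = 1000, end_loc = 10999,
                            window_length = 2000)

# Flag windows with fewer than 2 variants, the threshold named in the message.
windows$too_few <- windows$n_variants < 2
print(windows)
```

In this toy region only the first window holds 2 or more variants, so the other four would each trigger the "less than 2" message while the analysis as a whole remains valid.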

Regarding your annotated GDS file, it seems that you used the FAVOR Full Database to annotate it. However, it is recommended to use the FAVOR Essential Database to annotate the GDS file in Step 2 of FAVORannotator.

In addition, it seems that some of the dimensions do not match. For example, the genotype field in your GDS file indicates that there are 13,473 variants in your data, whereas the position field indicates that there are 13,341. These discrepancies should be fixed before running FAVORannotator.

Best, Xihao