xihaoli / STAARpipeline-Tutorial

The tutorial for performing single-/multi-trait association analysis of whole-genome/whole-exome sequencing (WGS/WES) studies using FAVORannotator, STAARpipeline and STAARpipelineSummary
GNU General Public License v3.0

Null outputs from Sliding_Window() #45

Closed by pelinunal 4 months ago

pelinunal commented 5 months ago

Dear @xihaoli,

My dataset has 11,021 samples. The other STAARpipeline steps run without errors or problems; only the Sliding_Window step fails. Each arrayid repeats the same messages continuously, and eventually R crashes.

I tried re-installing each R package, and I also tried various values for sliding_window_length and for the MAF cut-off.

A small example from the errors/warnings:

[1] 1
# of selected samples: 11,021
# of selected variants: 73
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
# of selected samples: 11,021
# of selected variants: 1,307,186
[1] 2
# of selected samples: 11,021
# of selected variants: 46
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
Error in STAAR(G, obj_nullmodel, phred_sub, rare_maf_cutoff = rare_maf_cutoff,  : 
  Number of rare variant in the set is less than 2!
.
.

I should also mention that the Dynamic_Window analysis gave no significant results. Is there any possible reason for the NULL outputs and the crashes?

Thank you for your time! Bests, Pelin

xihaoli commented 5 months ago

Hi @pelinunal,

Thank you for your question. This is technically not an error. Rather, it indicates that your sliding windows do not contain enough variants to form a valid variant set, so the results are NULL. At a quick look, could I ask whether your data come from a Whole-Genome Sequencing (WGS) study containing variants from both coding and noncoding regions? Your sample size (11,021) is large, but the number of variants on this chromosome (1.3 million) is relatively small for a sample of that size, which is why the results are NULL.

Best, Xihao
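
To make the explanation above concrete, here is a minimal sketch in base R of how one might count rare variants per window and see which windows would fall below the "at least 2 rare variants" requirement reported in the log. It is an illustration under stated assumptions, not STAARpipeline's implementation: `count_rare_per_window()` is a hypothetical helper, the positions and MAFs are made-up toy values, windows are non-overlapping for simplicity, and a variant is assumed to count as rare when 0 < MAF < `rare_maf_cutoff`.

```r
# Hypothetical helper (not part of STAAR/STAARpipeline): count rare variants
# in consecutive, non-overlapping windows along a chromosome.
count_rare_per_window <- function(position, maf, window_length = 2000,
                                  rare_maf_cutoff = 0.01) {
  # Assumption: "rare" means polymorphic and below the MAF cutoff
  is_rare <- maf > 0 & maf < rare_maf_cutoff
  rare_pos <- position[is_rare]
  # Window start positions covering the observed range
  starts <- seq(min(position), max(position), by = window_length)
  counts <- vapply(starts, function(s) {
    sum(rare_pos >= s & rare_pos < s + window_length)
  }, integer(1))
  data.frame(start = starts, end = starts + window_length - 1, n_rare = counts)
}

# Toy data, invented for illustration only
position <- c(100, 450, 1900, 2100, 5200, 5300, 9800)
maf      <- c(0.002, 0.20, 0.004, 0.15, 0.001, 0.003, 0.008)

res <- count_rare_per_window(position, maf)
res$testable <- res$n_rare >= 2  # windows with fewer than 2 rare variants cannot be tested
print(res)
```

In this toy example only two of the five windows contain two or more rare variants; the remaining windows would be the ones expected to trigger the "less than 2" message and return NULL.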

pelinunal commented 5 months ago

Dear @xihaoli,

Indeed, the number of variants on the chromosome is small. This is the result of filtering on INFO score > 0.9 as a post-QC step after imputation of the WGS data. Thank you for clarifying the problem. I will try a less strict INFO score filter.

Bests, Pelin
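
As a rough illustration of the plan above, the following base R sketch counts how many rare variants survive different INFO score cut-offs. Everything in it is an assumption made for illustration: `count_retained_rare()` is a hypothetical helper, the INFO scores and MAFs are invented toy values, and the toy data simply assume that the rare variants happen to have lower INFO scores than the common ones.

```r
# Toy data, invented for illustration: INFO scores and MAFs of eight imputed variants
info <- c(0.95, 0.55, 0.92, 0.70, 0.85, 0.40, 0.98, 0.88)
maf  <- c(0.20, 0.002, 0.15, 0.004, 0.003, 0.001, 0.30, 0.006)

# Hypothetical helper (not a STAARpipeline function): number of rare variants
# that pass a given INFO cut-off, assuming rare means 0 < MAF < rare_maf_cutoff
count_retained_rare <- function(info, maf, info_cutoff, rare_maf_cutoff = 0.01) {
  keep <- info > info_cutoff
  sum(keep & maf > 0 & maf < rare_maf_cutoff)
}

thresholds <- c(0.9, 0.8, 0.5, 0.3)
retained <- vapply(thresholds,
                   function(t) count_retained_rare(info, maf, t),
                   integer(1))
print(data.frame(info_cutoff = thresholds, n_rare_retained = retained))
```

With these toy values no rare variant passes INFO > 0.9, while looser cut-offs retain progressively more. Whether a looser filter is appropriate for real data is a QC decision that depends on the study.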