xihaoli / STAARpipeline-Tutorial

The tutorial for performing single-/multi-trait association analysis of whole-genome/whole-exome sequencing (WGS/WES) studies using FAVORannotator, STAARpipeline and STAARpipelineSummary
GNU General Public License v3.0

Questions about individual analysis results #22

Closed: alohasiqi closed this issue 1 year ago

alohasiqi commented 1 year ago

Hello,

I have some questions about the output of the individual analysis.

  1. What does "# of selected variants: 23" mean? My input has 14,055 variants, so I assume the "selected variants" are not selected by the user-given arguments (e.g., mac_cutoff, which I set to 0) but are instead the final variants remaining after the individual analysis. If so, are the intermediate results for the unselected variants stored somewhere, and why were they not selected (perhaps by some p-value threshold)?

  2. Does the error message "(!all(CHR == chr))" mean that all 22 chromosomes are required for the analysis? I only provided one chromosome.

> start_loc <- 1
> end_loc <- start_loc + 10e7 - 1
> results_individual_analysis<-Individual_Analysis(chr=22,start_loc=start_loc,end_loc=end_loc,
                                                     genofile=genofile,obj_nullmodel=obj_nullmodel,mac_cutoff=0, variant_type=variant_type,
                                                     geno_missing_imputation=geno_missing_imputation)
# of selected samples: 178
# of selected variants: 23
Error in if (!all(CHR == chr)) { : missing value where TRUE/FALSE needed
In addition: Warning message:
In Individual_Analysis(chr = 22, start_loc = start_loc, end_loc = end_loc,  :
  NAs introduced by coercion

Thanks and I hope I'm not bothering you too much.

xihaoli commented 1 year ago

Hi @alohasiqi,

For your new questions, please feel free to email us. We can take a closer look and discuss them in more detail.

Best, Xihao

alohasiqi commented 1 year ago

Thank you so much, and SOLVED! In case anyone has the same question, a brief answer to my first question: those 23 variants are the final results after running STAARpipeline, meaning they passed both the user-given arguments and the statistical significance calculated by STAARpipeline. All ~14k input variants were analyzed (not just those 23).

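The second error message is caused by alternate (alt) chromosome contigs, e.g., chr22_KI270879v1_alt. A label like this cannot be converted to a chromosome number, which would explain the "NAs introduced by coercion" warning: the comparison in !all(CHR == chr) then yields NA instead of TRUE/FALSE. Such contigs have to be fixed (e.g., removed) before making the GDS file, especially when the GDS file is converted from a VCF.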
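
For anyone who hits the same error, here is a minimal base-R sketch of what seems to be going on and one way to clean the input. It is not STAARpipeline code: the helper name `drop_alt_contigs`, the `keep` list, and the toy VCF lines are all made up for illustration, and I am assuming the chromosome labels get coerced to numbers somewhere along the way (which is what the warning suggests).

```r
# Toy chromosome labels: one canonical, one alt contig (illustrative values only).
chr_labels <- c("22", "chr22_KI270879v1_alt")

# Coercing a non-numeric label yields NA; this is what triggers
# the "NAs introduced by coercion" warning (suppressed here).
chr_numeric <- suppressWarnings(as.numeric(chr_labels))

# all() over a comparison that contains NA (and no FALSE) returns NA,
# and if (NA) fails with "missing value where TRUE/FALSE needed".
check <- all(chr_numeric == 22)
print(check)

# Hypothetical helper: keep header lines plus records on canonical chromosomes.
drop_alt_contigs <- function(vcf_lines,
                             keep = c(1:22, "X", "Y",
                                      paste0("chr", c(1:22, "X", "Y")))) {
  is_header <- startsWith(vcf_lines, "#")
  chrom <- sub("\t.*$", "", vcf_lines)  # first tab-separated field (CHROM)
  vcf_lines[is_header | chrom %in% keep]
}

# Toy VCF content (made up): one canonical record, one alt-contig record.
toy_vcf <- c(
  "##fileformat=VCFv4.2",
  "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
  "chr22\t16050075\t.\tA\tG\t.\tPASS\t.",
  "chr22_KI270879v1_alt\t1200\t.\tC\tT\t.\tPASS\t."
)

filtered <- drop_alt_contigs(toy_vcf)
print(filtered)
```

The filtered lines could then be written back with writeLines() before the usual VCF-to-GDS conversion. For real WGS-sized VCFs, reading the whole file into R like this is probably impractical, so a dedicated VCF tool would be the better choice; the sketch is only meant to show the idea.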