Closed ryshi06 closed 1 month ago
Hi @ryshi06,
Thanks for checking. These messages are not errors from your scripts, but they serve as indications of certain functional categories of genes that do not have at least 2 rare variants to form a variant set. You can proceed normally.
Best, Xihao
Hello Dr. Li,
Thanks for your timely responses! I have another quick question, my aGDS files have more samples than the QC-passed phenotype file, will the program run the analysis based on overlapped samples only?
Best, Ruyu
Hi @ryshi06,
No problem. Typically, one would run the analysis for samples with both phenotype and genotype data available. If you would like to include more samples, then some "missing data" handling would be needed, and this paper could be an option.
Best, Xihao
Thank you!
Sure, I'll close this issue for now, but please feel free to reopen it if you have any other questions.
Hi there, I followed the entire pipeline to generate the aGDS file and fit the null model without a kinship matrix (by assigning kins = NULL) and everything went smoothly. However, when I tried to run the non-coding centric analysis, it gives me the error saying
Error in STAAR(Geno, obj_nullmodel, Anno.Int.PHRED.sub, rare_maf_cutoff = rare_maf_cutoff, : genotype is not a matrix!
. I am not sure why this occurs if all above steps ran smoothly. Could you please help me on this, thanks!`# run model for (arrayid in 1:22){
gene number in job
gene_num_in_array <- 50 group.num.allchr <- ceiling(table(genes_info[,2])/gene_num_in_array) sum(group.num.allchr)
chr <- which.max(arrayid <= cumsum(group.num.allchr)) group.num <- group.num.allchr[chr]
if (chr == 1){ groupid <- arrayid }else{ groupid <- arrayid - cumsum(group.num.allchr)[chr-1] }
genes_info_chr <- genes_info[genes_info[,2]==chr,] sub_seq_num <- dim(genes_info_chr)[1]
if(groupid < group.num) { sub_seq_id <- ((groupid - 1)gene_num_in_array + 1):(groupidgene_num_in_array) }else { sub_seq_id <- ((groupid - 1)*gene_num_in_array + 1):sub_seq_num }
exclude large noncoding masks
jobid_exclude <- c(21,39,44,45,46,53,55,83,88,103,114,127,135,150,154,155,163,164,166,180,189,195,200,233,280,285,295,313,318,319,324,327,363,44,45,54) sub_seq_id_exclude <- c(1009,1929,182,214,270,626,741,894,83,51,611,385,771,493,671,702,238,297,388,352,13,303,600,170,554,207,724,755,1048,319,324,44,411,195,236,677)
for(i in 1:length(jobid_exclude)) { if(arrayid==jobid_exclude[i]) { sub_seq_id <- setdiff(sub_seq_id,sub_seq_id_exclude[i]) } }
aGDS file
agds.path <- agds_dir[chr] genofile <- seqOpen(agds.path)
results_noncoding <- c() for(kk in sub_seq_id) { print(kk) gene_name <- genes_info_chr[kk,1] results <- Gene_Centric_Noncoding(chr=chr,gene_name=gene_name,genofile=genofile,obj_nullmodel=obj_nullmodel, rare_maf_cutoff=0.05,rv_num_cutoff=2, QC_label=QC_label,variant_type=variant_type,geno_missing_imputation=geno_missing_imputation, Annotation_dir=Annotation_dir,Annotation_name_catalog=Annotation_name_catalog, Use_annotation_weights=Use_annotation_weights,Annotation_name=Annotation_name) results_noncoding <- append(results_noncoding,results) }
save(results_noncoding,file=paste0(output_path,output_filename,"",arrayid,".Rdata"))
seqClose(genofile) } `