xihaoli / STAARpipeline-Tutorial

The tutorial for performing single-/multi-trait association analysis of whole-genome/whole-exome sequencing (WGS/WES) studies using FAVORannotator, STAARpipeline and STAARpipelineSummary
GNU General Public License v3.0

"Coding mask" AND "Noncoding mask" #17

Closed DamienTan closed 1 year ago

DamienTan commented 1 year ago

Dear Dr. Li, I am confused by the "Coding mask" and "Noncoding mask" settings in the R scripts STAARpipelineSummary_Gene_Centric_Coding_Annotation.r and STAARpipelineSummary_Gene_Centric_Noncoding_Annotation.r:

    ## Chr
    chr_seq <- c(1,7,19,19,9)
    ## Gene name
    gene_name_seq <- c("PCSK9","NPC1L1","LDLR","APOE","RNF20")
    ## Coding mask
    category_seq <- c("plof","missense","missense","missense","synonymous")

Why did you choose these five genes? If I want to run these annotation steps, should I choose other genes, and if so, how should I choose them? Looking forward to your reply.

xihaoli commented 1 year ago

Hi Damien,

Thanks for your question. These five genes and their categories (masks) are the ones that were significant in our analysis of lipid traits, so we listed them here as an example. For your case, please feel free to replace them with your own list of gene masks of interest. Note that chr_seq, gene_name_seq, and category_seq must have the same length, since their entries are matched by position. In our case, we were interested in querying the plof rare variants of PCSK9 (on chr 1); missense rare variants of NPC1L1 (on chr 7); missense rare variants of LDLR (on chr 19); missense rare variants of APOE (on chr 19); and synonymous rare variants of RNF20 (on chr 9).
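As a minimal sketch of the same-length requirement described above, the three vectors can be checked and laid out side by side before the script is run. The vector values come from the script quoted in this thread; the `stopifnot` check and the `mask_table` data frame are illustrative additions and are not part of STAARpipelineSummary.

```r
## Chr
chr_seq <- c(1, 7, 19, 19, 9)
## Gene name
gene_name_seq <- c("PCSK9", "NPC1L1", "LDLR", "APOE", "RNF20")
## Coding mask
category_seq <- c("plof", "missense", "missense", "missense", "synonymous")

# The i-th entries of the three vectors together describe one gene mask,
# so the vectors must have equal length.
stopifnot(
  length(chr_seq) == length(gene_name_seq),
  length(gene_name_seq) == length(category_seq)
)

# Hypothetical helper view (not part of the pipeline): one row per queried mask.
mask_table <- data.frame(
  chr = chr_seq,
  gene = gene_name_seq,
  category = category_seq,
  stringsAsFactors = FALSE
)
print(mask_table)
```

Printing the table makes it easy to confirm by eye that each gene is paired with the intended chromosome and category.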

Hope this is clear.

Best, Xihao

DamienTan commented 1 year ago

Dear Dr. Li, I really appreciate it! So I can select candidate genes that were significant in the previous analysis results. By the way, should I select the top genes in the Manhattan plot as the input genes?

xihaoli commented 1 year ago

Hi Damien,

Yes, the STAARpipelineSummary_Gene_Centric_Coding_Annotation.r script is general in the sense that you may use it to query any gene mask. In practice, however, you may select as input the candidate gene masks that were significant or suggestively significant in your unconditional analysis results; these are essentially the top gene masks in the Manhattan plot.
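One possible way to turn a results table into the three input vectors is sketched below. Everything in it is an assumption for illustration: the `results` data frame, its column names, the placeholder gene names, the toy p-values, and the `alpha` cutoff are all hypothetical and do not reflect the actual STAARpipeline output format or any recommended threshold.

```r
# Toy stand-in for unconditional gene-centric coding results.
# Column names, gene names, and p-values are hypothetical, for illustration only.
results <- data.frame(
  chr      = c(1, 7, 19, 2, 9),
  gene     = c("GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"),
  category = c("plof", "missense", "missense", "synonymous", "plof"),
  pvalue   = c(1e-12, 3e-8, 0.04, 0.6, 2e-7),
  stringsAsFactors = FALSE
)

# Assumed placeholder cutoff; substitute the significance or suggestive
# threshold that applies to your own analysis.
alpha <- 5e-7

# Keep the masks below the cutoff, ordered from most to least significant.
top <- results[results$pvalue < alpha, ]
top <- top[order(top$pvalue), ]

# Build the three position-matched input vectors from the selected rows.
chr_seq <- top$chr
gene_name_seq <- top$gene
category_seq <- top$category
```

Because all three vectors are taken from the same filtered rows, they have the same length and stay aligned by position.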

Best, Xihao

DamienTan commented 1 year ago

Thanks again for your generous help. I got your point!

xihaoli commented 1 year ago

You are very welcome, Damien!

Best, Xihao