rgcgithub / regenie

regenie is a C++ program for whole genome regression modelling of large genome-wide association studies.
https://rgcgithub.github.io/regenie

Gene set analysis with custom mask definitions #452

Closed by JaehyunParkBiostat 11 months ago

JaehyunParkBiostat commented 11 months ago

Hello,

I am getting an error when running gene set analysis with custom mask definitions.

I have custom annotation definitions and want to use them for gene set analysis (burden test, SKAT, ...). However, when I ran REGENIE, it stopped with an error after the warning "WARNING: Detected 6 masks with only unknown annotations (these are ignored)."

Below is the content of the mask file:

Mask3 coding3
Mask4 coding4
Mask5 coding5
Mask7 coding7
Mask8 coding8
Mask9 coding9

Below is an example of the annotation file:

chr22:15528810:A:T OR11H1 coding3
chr22:15528812:C:A OR11H1 coding3
chr22:15528853:C:G OR11H1 coding3
chr22:15528888:C:A OR11H1 coding3
chr22:16962747:C:A GAB4 coding3
chr22:16962749:C:A GAB4 coding3

Below is the log file:

  • annotations : [chr1_coding3_mod]
    +number of annotations categories = 2
  • masks : [mask3.txt]
    n_masks = 0
    WARNING: Detected 6 masks with only unknown annotations (these are ignored).
    ERROR: no masks are left to be included in the analysis.

I suspect this happens because I used custom annotations. Is that the cause, and if so, how can I fix it? I would appreciate any guidance.

Thank you.

JaehyunParkBiostat commented 11 months ago

Sorry, I forgot to mention that I am using version 3.2.4.

joellembatchou commented 11 months ago

Hi,

Try using --check-burden-files to get more detailed information on the consistency between the annotation and mask files. Also, can you check whether there are any special characters in the files (e.g. cat -A mask_file)?

Cheers, Joelle
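
For anyone without a cat that supports -A, the same check for special characters can be approximated in Python. This is only a minimal sketch: the function name find_suspicious_lines is hypothetical and not part of regenie, and it assumes the file should hold printable ASCII text with tabs or spaces as separators.

```python
from pathlib import Path


def find_suspicious_lines(path):
    """Return (line_number, raw_bytes) for lines holding control or non-ASCII bytes."""
    hits = []
    data = Path(path).read_bytes()  # read bytes so nothing is silently normalised
    for number, line in enumerate(data.split(b"\n"), start=1):
        # Tabs (0x09) are legitimate separators; any other byte outside
        # printable ASCII (0x20-0x7E) is flagged, including b"\r" (shown as ^M).
        if any((b < 0x20 and b != 0x09) or b > 0x7E for b in line):
            hits.append((number, line))
    return hits
```

Calling find_suspicious_lines on the mask file and printing each hit shows the raw bytes, so a trailing carriage return appears as b'...\r' instead of staying invisible.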

JaehyunParkBiostat commented 11 months ago

Hello,

I found that the mask_file contained ^M (Ctrl+M, a carriage return character) at the end of each line, which is hard to spot in vi or vim. After removing those characters, the error disappeared.

I appreciate your help, and I am sorry for the mistake.

Thank you.
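
For reference, the fix described in this thread (removing the trailing ^M, i.e. the carriage return, from each line) can be sketched in Python as follows. The function name strip_carriage_returns and the file names in the usage note are hypothetical, and the sketch assumes the only problem is CRLF line endings.

```python
from pathlib import Path


def strip_carriage_returns(src, dst):
    """Copy src to dst with CRLF line endings converted to LF.

    Returns the number of line endings that were changed.
    """
    data = Path(src).read_bytes()
    changed = data.count(b"\r\n")
    # Without this step a mask line such as b"Mask3 coding3\r" would carry
    # the annotation name "coding3\r", which does not equal "coding3".
    Path(dst).write_bytes(data.replace(b"\r\n", b"\n"))
    return changed
```

For example, strip_carriage_returns("mask3.txt", "mask3_clean.txt") would write a cleaned copy and leave the original untouched; the cleaned file is then the one to pass to regenie.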