leelabsg / SKAT

Sequence kernel association test (SKAT)

Suggest: use example data with named covariate columns #11

Closed: richelbilderbeek closed this issue 2 years ago

richelbilderbeek commented 3 years ago

Dear SKAT authors,

From the SKAT::SKAT documentation I could easily do a SKAT analysis using the provided example data. Below is the example code from the package, rewritten so that it does not use attach (#10):

library(SKAT)
data(SKAT.example)

# Linear null model based on continuous traits
linear_null_model_continuous <- SKAT_Null_Model(
  data = SKAT.example,
  formula = y.c ~ 1,
  out_type = "C" # continuous
)
# p-value: 0.01874576
SKAT(Z = SKAT.example$Z, linear_null_model_continuous)$p.value

However, the example data is needlessly vague about what the covariates in SKAT.example$X are. I say needlessly because the covariate columns can be given names and the code still works:

library(SKAT)
data(SKAT.example)

# Convert the covariates to a tibble with named columns
t_covariates <- tibble::as_tibble(SKAT.example$X)
names(t_covariates) <- c("is_rocket", "speed")
SKAT.example$X <- t_covariates

# Linear null model based on continuous traits
linear_null_model_continuous <- SKAT_Null_Model(
  data = SKAT.example,
  formula = y.c ~ 1,
  out_type = "C" # continuous
)
# p-value: 0.01874576
SKAT(Z = SKAT.example$Z, linear_null_model_continuous)$p.value
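Both examples above fit the null model with `y.c ~ 1`, so the covariates never enter the analysis. Below is a minimal sketch of how named covariates could be referred to in the formula. It assumes that SKAT.example$X has two columns and that SKAT_Null_Model accepts an ordinary data frame as `data`. The names `is_rocket` and `speed` are the same placeholders as above, and `phenotypes` and `null_model_with_covariates` are hypothetical names chosen for illustration.

```r
library(SKAT)
data(SKAT.example)

# Collect the continuous trait and the two covariates in one data frame.
# The column names are placeholders: the real meaning of the covariates
# is not documented in the example data.
phenotypes <- data.frame(
  y.c = as.numeric(SKAT.example$y.c),
  is_rocket = SKAT.example$X[, 1],
  speed = SKAT.example$X[, 2]
)

# Linear null model for the continuous trait, adjusted for both covariates
null_model_with_covariates <- SKAT_Null_Model(
  formula = y.c ~ is_rocket + speed,
  data = phenotypes,
  out_type = "C" # continuous
)

# SKAT p-value for the example genotype matrix under this null model
p_value <- SKAT(Z = SKAT.example$Z, obj = null_model_with_covariates)$p.value
print(p_value)
```

With named columns the formula states which covariates are adjusted for, which is the readability gain this Issue is asking for.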

I suggest providing the example data with meaningfully named covariate columns. The column names I used above are only placeholders. I would be happy to fix this Issue myself via a Pull Request :+1:

richelbilderbeek commented 3 years ago

Contacted the SKAT maintainer by email:

Dear SKAT maintainer,

I am contacting you because I have posted two Issues at the SKAT GitHub repo and am unsure whether you have read them.

If you have and just haven't had the time yet, I would understand :-)

Looking forward to a response one day and cheers, Richel Bilderbeek

richelbilderbeek commented 3 years ago

From here I quote:

BTW, adding column name isn't done yet. It will be great if you can do this.

I will do so once I know what the column names for the covariates are.

@leeshawn: what are the column names for the covariates? I see that one is a binary trait and the other a normally distributed trait (i.e. mean zero), such as is_male and relative_something (?height in centimeters?).
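An observation like this can be checked by summarising each covariate column. The following is a minimal sketch in base R. The helper `describe_covariates` is hypothetical (it is not part of SKAT), and the matrix `stand_in` is simulated data that only mimics the described shape (one binary column, one standard normal column). With SKAT installed, one would pass SKAT.example$X instead.

```r
# Summarise each column of a numeric covariate matrix:
# number of distinct values, whether all values are 0 or 1, mean and sd.
describe_covariates <- function(x) {
  x <- as.matrix(x)
  data.frame(
    column = seq_len(ncol(x)),
    n_distinct = apply(x, 2, function(v) length(unique(v))),
    is_binary = apply(x, 2, function(v) all(v %in% c(0, 1))),
    mean = colMeans(x),
    sd = apply(x, 2, sd)
  )
}

# Simulated stand-in, NOT the SKAT example data
set.seed(42)
stand_in <- cbind(rbinom(100, size = 1, prob = 0.5), rnorm(100))

summary_table <- describe_covariates(stand_in)
print(summary_table)
```

A column flagged as binary suggests an indicator such as sex, while a column with mean near zero and many distinct values suggests a standardised continuous measurement. Neither summary reveals what the covariate actually represents, which is why the question to the maintainer remains.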

richelbilderbeek commented 3 years ago

Will be fixed by Pull Request #14. See the code there for how I did it.

richelbilderbeek commented 2 years ago

@leeshawn the Pull Request has not been accepted yet.

Will it be?