dougspeed / LDAK

Error reading {covariates_file_path}; Element 3 of Row 6 is unrecognisable #8

Open sahwa opened 2 weeks ago

sahwa commented 2 weeks ago

Hi Doug,

I am running into an error when LDAK reads in the covariates file. I've tried changing the file from tab-separated to space-separated and removing any non-alphanumeric characters from the variable names, but the error persists.

Any idea how I can fix this?

Best,

Sam

[09:56:40] aey472@compa028$ ./ldak6.beta --linear {model_path} --pheno {pheno_path} --bfile {bfile_path} --covar {covariates_path} --sandwich YES --max-threads 8

-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
LDAK - Software for obtaining Linkage Disequilibrium Adjusted Kinships and Loads More
Version 6 - Help pages at http://www.ldak.org
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --

There are 6 pairs of arguments:
--linear {model_path}
--pheno {pheno_path}
--bfile {bfile_path}
--covar {covariates_path}
--sandwich YES
--max-threads 8

-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --

Performing linear regression

To additionally compute test statistics via a saddlepoint approximation, add "--spa-test YES"

To perform weighted linear regression, use "--sample-weights"

Will use the sandwich estimator of effect size variance; to switch to the standard estimator use "--sandwich NO"

Consider using "--top-preds" to include (strongly-associated) predictors as extra covariates

Will compute standard test statistics; use "--spa-test YES" to switch to a saddlepoint approximation,

-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --

Reading IDs for 11523 samples from {fam_file_path}

Checking responses for 100706 samples from {pheno_file_path}

Reading details for 7797880 predictors from {bim_file_path}

Data contain 11523 samples and 7797880 predictors

Reading phenotypes for 100706 samples from {pheno_file_path}

Reading 3 covariates for 100706 samples from {covariates_file_path}
Error reading {covariates_file_path}; Element 3 of Row 6 is unrecognisable (AxiomCKB1)

Here is the covariate file:

FID     IID     gwas_array_type sex     yob
ID000001        ID000001        AxiomCKB1       1       1941
ID000002        ID000002        AxiomCKB2       1       1964
ID000003        ID000003        AxiomCKB1       1       1935
ID000004        ID000004        AxiomCKB2       1       1972
ID000005        ID000005        AxiomCKB1       1       1958
ID000006        ID000006        AxiomCKB1       0       1947
ID000007        ID000007        AxiomCKB2       1       1936
ID000008        ID000008        AxiomCKB1       0       1961
ID000009        ID000009        AxiomCKB2       1       1953

Best,

Sam

sahwa commented 2 weeks ago

Ah so it seems like the covariates have to be numeric, is that right?

https://github.com/dougspeed/LDAK/blob/a687eab862732b806c45cd1088f55fd224fe1278/source_code/parsefiles.c#L1465

Converting the factors to numeric values solved the problem. But would it be possible to use categorical covariates directly?
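For anyone hitting the same error, here is a minimal sketch of a pre-check that lists which covariate columns contain values that cannot be parsed as numbers. It is not part of LDAK: the function name `non_numeric_columns`, the assumption that the first two columns are FID and IID, and the choice to treat `NA` as a missing value are all illustrative assumptions.

```python
def non_numeric_columns(lines, id_columns=2, missing=("NA",)):
    """Return the names of covariate columns with at least one non-numeric value.

    Assumes (hypothetically) a whitespace-separated file with a header row and
    `id_columns` leading ID columns (FID and IID here).
    """
    rows = [line.split() for line in lines if line.strip()]
    header, body = rows[0], rows[1:]
    bad = []
    for j in range(id_columns, len(header)):
        for row in body:
            if row[j] in missing:
                continue  # assumption: these tokens denote missing values
            try:
                float(row[j])
            except ValueError:
                bad.append(header[j])
                break  # one bad value is enough to flag the column
    return bad


# A few rows taken from the covariate file above.
sample = """\
FID IID gwas_array_type sex yob
ID000001 ID000001 AxiomCKB1 1 1941
ID000002 ID000002 AxiomCKB2 1 1964
ID000006 ID000006 AxiomCKB1 0 1947
"""

print(non_numeric_columns(sample.splitlines()))  # ['gwas_array_type']
```

For a real file, pass the open file handle instead, e.g. `non_numeric_columns(open(path))`.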

dougspeed commented 2 weeks ago

Hi, thanks for the question. Yes, that is correct: the covariates provided by --covar must be numeric. I have now added a feature to accept categorical covariates (using the argument --factors), which will be available in the next release. Until then, you must convert factors to indicator variables.

Thanks
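Until that release is out, the conversion to indicator variables could be sketched roughly as follows. This is an illustrative helper, not LDAK code: the name `add_indicators` is hypothetical, it assumes a whitespace-separated file with a header row, and it drops the first level (in sorted order) as the reference category on the assumption that the regression already includes an intercept.

```python
def add_indicators(lines, column):
    """Replace one categorical column with 0/1 indicator columns.

    The first level in sorted order is dropped as the reference category,
    so k levels become k - 1 indicator columns.
    """
    rows = [line.split() for line in lines if line.strip()]
    header, body = rows[0], rows[1:]
    j = header.index(column)
    levels = sorted({row[j] for row in body})
    kept = levels[1:]  # reference level omitted (assumes an intercept is fitted)
    new_header = header[:j] + [f"{column}_{lvl}" for lvl in kept] + header[j + 1:]
    out = [new_header]
    for row in body:
        flags = ["1" if row[j] == lvl else "0" for lvl in kept]
        out.append(row[:j] + flags + row[j + 1:])
    return ["\t".join(r) for r in out]


# A few rows taken from the covariate file in the original post.
sample = """\
FID IID gwas_array_type sex yob
ID000001 ID000001 AxiomCKB1 1 1941
ID000002 ID000002 AxiomCKB2 1 1964
ID000006 ID000006 AxiomCKB1 0 1947
"""

converted = add_indicators(sample.splitlines(), "gwas_array_type")
print("\n".join(converted))
```

With the two array types in this example, the `gwas_array_type` column becomes a single `gwas_array_type_AxiomCKB2` column holding 0 or 1, and the remaining columns are left unchanged.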