rvinas / GTEx-imputation

Gene Expression Imputation with Generative Adversarial Imputation Nets
MIT License

Covariates including age, sex (and cohort): intuitively, do these covariates affect the result a lot? #3

Closed yezhengli-Mr9 closed 4 years ago

yezhengli-Mr9 commented 4 years ago

The paper mentions "donor’s age as numerical covariate in r and the tissue type, sex and cohort". Just intuitively (I mean I am not asking for exact numbers), do they affect the result much? On my side, the covariates are mainly 5362 SNPs, plus "SEX", "AGE" and "DTHHRDY" from GTEx_Analysis_v8_Annotations_SubjectPhenotypesDS.txt.

On my side, even in regression algorithms (let alone in a neural network), the effect is small or unexplainable (anyway, I only focus on PROTEIN.CODING.GENE on chrome).

rvinas commented 4 years ago

It depends on the covariates. Some covariates such as sex or age are known to be associated with gene expression levels, so I expect that including them in the model will improve the imputation quality.

It would be really interesting to perform an ablation study to see which covariates are actually helping.
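A leave-one-group-out ablation of this kind could be sketched as follows. This is only an illustration on synthetic data, not the repository's model: it swaps in a closed-form ridge regression for the GAN, and the names `ridge_fit_predict`, `covariate_ablation` and the toy `groups` are assumptions made for the example.

```python
import numpy as np

def ridge_fit_predict(X_train, y_train, X_test, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X_train.mean(axis=0), y_train.mean()
    Xc = X_train - x_mean
    # Solve (Xc^T Xc + alpha * I) w = Xc^T (y - y_mean)
    gram = Xc.T @ Xc + alpha * np.eye(Xc.shape[1])
    w = np.linalg.solve(gram, Xc.T @ (y_train - y_mean))
    return (X_test - x_mean) @ w + y_mean

def r2_score(y_true, y_pred):
    """Coefficient of determination on held-out samples."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def covariate_ablation(groups, y, n_train, alpha=1.0):
    """Held-out R^2 with all feature groups, then with each group left out."""
    def score(names):
        X = np.hstack([groups[name] for name in names])
        pred = ridge_fit_predict(X[:n_train], y[:n_train], X[n_train:], alpha)
        return r2_score(y[n_train:], pred)

    names = list(groups)
    results = {"all": score(names)}
    for left_out in names:
        kept = [name for name in names if name != left_out]
        results[f"without_{left_out}"] = score(kept)
    return results

# Synthetic stand-in data (NOT real GTEx data).
rng = np.random.default_rng(0)
n = 400
groups = {
    "snps": rng.integers(0, 3, size=(n, 20)).astype(float),  # 0/1/2 dosages
    "age": rng.uniform(20, 70, size=(n, 1)),
    "sex": rng.integers(0, 2, size=(n, 1)).astype(float),
}
# By construction, expression depends on one SNP and on sex, but not on age.
y = 0.5 * groups["snps"][:, 0] + 1.5 * groups["sex"][:, 0] + rng.normal(0, 0.5, n)

results = covariate_ablation(groups, y, n_train=300)
for name, r2 in results.items():
    print(f"{name:>14}: R^2 = {r2:.3f}")
```

In this toy setup, dropping `sex` should lower the held-out R² noticeably while dropping `age` should change it very little, which is the kind of signal an ablation on the real model would look for.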

yezhengli-Mr9 commented 4 years ago

Oh, no need to run an ablation study for me (it would take you 9 hours or even days to run). It is just that in my situation (focusing on regression on SNPs) the covariates do not help a lot. I only wanted to make sure that you also handle these covariates simply by adding them to the features, and nothing else.

Thanks a lot, your response is already helpful enough.
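For reference, "simply adding them into features" in the regression setting described above might look like the sketch below. It assumes age is available as a number and that SEX and DTHHRDY are treated as categorical codes. The helper names `one_hot` and `build_features` and all values are hypothetical toy data, not taken from the GTEx files or from this repository.

```python
import numpy as np

def one_hot(values):
    """One-hot encode a 1-D array of categorical codes (columns sorted by code)."""
    categories, codes = np.unique(values, return_inverse=True)
    return np.eye(len(categories))[codes], categories

def build_features(snps, age, sex, dthhrdy):
    """Concatenate SNP dosages with standardized age and one-hot covariates."""
    age_z = (age - age.mean()) / age.std()  # numerical covariate, standardized
    sex_oh, _ = one_hot(sex)                # categorical covariate
    dth_oh, _ = one_hot(dthhrdy)            # categorical covariate
    return np.hstack([snps, age_z[:, None], sex_oh, dth_oh])

# Toy inputs for 4 donors (assumed values, for illustration only).
snps = np.array([[0, 1, 2],
                 [1, 1, 0],
                 [2, 0, 0],
                 [0, 2, 1]], dtype=float)
age = np.array([25.0, 45.0, 55.0, 65.0])
sex = np.array([1, 2, 2, 1])
dthhrdy = np.array([0, 2, 2, 4])

X = build_features(snps, age, sex, dthhrdy)
print(X.shape)  # 3 SNPs + 1 age + 2 sex levels + 3 DTHHRDY levels
```

The resulting matrix can be passed to any regression model in place of the SNP-only matrix, which makes it easy to compare the two settings.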