EpistasisLab / tpot

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
http://epistasislab.github.io/tpot/
GNU Lesser General Public License v3.0

covariate adjustment for metabolomics data #1311

Open arpita-007 opened 1 year ago

arpita-007 commented 1 year ago

Hi,

I am completely new to Python. Could you please help me with covariate adjustment for my features? I have 70 targets and ~500 features, and I want to adjust the features for 3 covariates.

I tried following the resAdj examples but was not able to replicate them on my data. I am unsure how to tell the code which features should be adjusted for which covariates.

I would be very grateful for any help provided.

Thank you very much!!

arpita-007 commented 1 year ago

For feature covariate adjustments, we have added a transformer (resAdjTransformer) to TPOT. This needs to be either the first step of any pipeline or the second step after an FSS (TPOT Template can be used to specify these). The initial input to TPOT adds the covariate columns to X. One hyperparameter of this transformer is a file specifying which columns of X should be adjusted by which covariate columns. The transformer applies the no leakage residual adjustments to these columns and removes the covariate columns before passing its output on to the other steps. If no covariate adjustment on the target is needed, classic TPOT can then be run as usual (Figure 1 path A).

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7528347/

How do I specify the hyperparameters described here? Could you please provide example code?
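
To make the quoted description concrete, here is a minimal sketch of the idea in plain NumPy. It is not TPOT's resAdjTransformer and does not reproduce its hyperparameters or its mapping-file format. The class name `ResidualAdjuster` and the arguments `adjust_map` and `covariate_cols` are made up for illustration, and ordinary least squares is assumed as the adjustment model. The sketch fits one regression per feature on the training rows only, replaces each feature with its residual, and drops the covariate columns, so no information from the test rows leaks into the fit.

```python
import numpy as np


class ResidualAdjuster:
    """Hypothetical illustration only; not TPOT's resAdjTransformer."""

    def __init__(self, adjust_map, covariate_cols):
        # adjust_map: {feature column index: [covariate column indices]}
        # covariate_cols: columns to drop from the output after adjustment
        self.adjust_map = adjust_map
        self.covariate_cols = list(covariate_cols)

    def _design(self, X, covs):
        # Intercept column followed by the covariates used for one feature.
        return np.column_stack([np.ones(len(X)), X[:, covs]])

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        self.coefs_ = {}
        for feat, covs in self.adjust_map.items():
            # Least-squares fit of the feature on its covariates (assumed model).
            A = self._design(X, covs)
            self.coefs_[feat] = np.linalg.lstsq(A, X[:, feat], rcond=None)[0]
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        out = X.copy()
        for feat, covs in self.adjust_map.items():
            # Residual = observed feature minus the part predicted by covariates,
            # using coefficients learned in fit() only.
            out[:, feat] = X[:, feat] - self._design(X, covs) @ self.coefs_[feat]
        # Remove covariate columns before handing the data to later steps.
        return np.delete(out, self.covariate_cols, axis=1)


# Synthetic data: 2 features (columns 0-1) followed by 3 covariates (columns 2-4).
rng = np.random.default_rng(0)
covs = rng.normal(size=(100, 3))
feats = rng.normal(size=(100, 2)) + covs @ rng.normal(size=(3, 2))
X = np.hstack([feats, covs])
X_train, X_test = X[:80], X[80:]

adjuster = ResidualAdjuster({0: [2, 3, 4], 1: [2, 3, 4]}, covariate_cols=[2, 3, 4])
adjuster.fit(X_train)                        # coefficients come from training rows only
X_train_adj = adjuster.transform(X_train)
X_test_adj = adjuster.transform(X_test)      # test rows reuse the training coefficients
```

With ~500 features and 3 covariates, `adjust_map` would simply have one entry per feature column. To use a class like this inside a scikit-learn pipeline, it would also need to subclass `BaseEstimator` and `TransformerMixin`. For the actual resAdjTransformer settings, the resAdj examples shipped with TPOT remain the reference.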