Closed by anlijuncn 4 years ago
ComBat was not designed for this purpose, as there are a number of scenarios in which it could be problematic.
Extending ComBat and making it robust to such cases would require substantial work. Currently, ComBat is limited to harmonizing observational studies and does not attempt to predict scanner effects on unseen scans.
I've tried this before and found that ComBat essentially overfits to the training set and, when applied to the test set, actually induces scanner differences that were not there. I trained a multinomial logistic regression on scanner in the training set, then ran ComBat on the training set, applied the weights to the test set, and evaluated the classifier's performance in identifying scanner after having theoretically removed the scanner-related variance with ComBat. My classifier ended up with performance significantly below chance. My interpretation was that ComBat was creating features in the test set that were anticorrelated with scanner in the training set, resulting in below-chance performance. @Jfortin1 this was a while ago, so my explanation is hand-wavy, but does that sound like something that could happen?
The solution I ended up with was to just apply ComBat separately to the training and test sets, which is what I did here: https://www.biorxiv.org/content/10.1101/309260v1
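For anyone who wants to run the same kind of check, here is a rough sketch of its shape on synthetic data. Everything in it is an assumption made for illustration: the harmonizer is a location-only per-scanner mean removal rather than ComBat, the classifier is a nearest-centroid rule rather than multinomial logistic regression, and the function names (`fit_site_offsets`, `remove_site_offsets`, `fit_centroids`, `predict_site`, `simulate`) are hypothetical. It also reads the description above as "classifier trained on the unharmonized training set, evaluated on the harmonized test set", which may not match the original analysis exactly.

```python
import numpy as np

def fit_site_offsets(X, site):
    """Per-scanner deviation from the grand mean (location-only stand-in, NOT ComBat)."""
    grand_mean = X.mean(axis=0)
    return {s: X[site == s].mean(axis=0) - grand_mean for s in np.unique(site)}

def remove_site_offsets(X, site, offsets):
    """Subtract offsets estimated elsewhere (here: on the training set)."""
    X_adj = X.astype(float).copy()
    for s, offset in offsets.items():
        X_adj[site == s] -= offset
    return X_adj

def fit_centroids(X, site):
    """Nearest-centroid scanner classifier (stand-in for logistic regression)."""
    labels = np.unique(site)
    return labels, np.stack([X[site == s].mean(axis=0) for s in labels])

def predict_site(X, labels, centroids):
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return labels[dists.argmin(axis=1)]

def simulate(rng, n_per_site, shifts, n_features=5):
    """Synthetic features: unit-variance noise plus a constant shift per scanner."""
    site = np.repeat(np.arange(len(shifts)), n_per_site)
    X = rng.normal(size=(site.size, n_features)) + np.asarray(shifts)[site][:, None]
    return X, site

rng = np.random.default_rng(0)
shifts = [3.0, 0.0, -3.0]  # made-up scanner effects
X_train, site_train = simulate(rng, 100, shifts)
X_test, site_test = simulate(rng, 100, shifts)

# Classifier sees the raw training data; harmonization parameters also come from train.
labels, centroids = fit_centroids(X_train, site_train)
offsets = fit_site_offsets(X_train, site_train)
X_test_adj = remove_site_offsets(X_test, site_test, offsets)

acc_raw = float((predict_site(X_test, labels, centroids) == site_test).mean())
acc_harmonized = float((predict_site(X_test_adj, labels, centroids) == site_test).mean())
print(f"scanner accuracy, raw test set: {acc_raw:.2f}")
print(f"scanner accuracy, harmonized test set: {acc_harmonized:.2f} (chance = {1 / len(shifts):.2f})")
```

On clean synthetic data like this, accuracy should simply fall toward chance after harmonization. The sketch is not expected to reproduce a below-chance result; it only shows where train-derived parameters enter the pipeline, so that a real harmonizer and classifier can be swapped in.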
If you want to try it for yourself, here's the monkey-patched R code I used to output weights from a training set: https://github.com/nih-fmrif/nielson_abcd_2018/blob/9ad719bcdcacdd3b4580d6f1a12398138b6a3c0c/swarm_dir/run_abcd_perm_new_draws.py#L35-L305
It would be great to find out if someone else got this result as well.
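The general pattern of estimating parameters on a training set, saving them, and re-applying them to held-out scans can be sketched as below. This is a simplified per-scanner location/scale adjustment written for illustration, not ComBat itself: it has no empirical Bayes shrinkage and no covariate preservation, and the names `fit_location_scale` and `apply_location_scale` and the JSON layout are hypothetical. It deliberately raises an error for a scanner that was not in the training set, since there is nothing to apply in that case.

```python
import json
import os
import tempfile

import numpy as np

def fit_location_scale(X, site):
    """Estimate per-scanner mean/SD and pooled targets (simplified, NOT ComBat)."""
    sites = np.unique(site)
    site_vars = [X[site == s].var(axis=0, ddof=1) for s in sites]
    params = {
        "grand_mean": X.mean(axis=0).tolist(),
        # Pooled within-scanner SD (unweighted average of variances).
        "pooled_sd": np.sqrt(np.mean(site_vars, axis=0)).tolist(),
        "sites": {},
    }
    for s in sites:
        Xs = X[site == s]
        params["sites"][str(s)] = {
            "mean": Xs.mean(axis=0).tolist(),
            "sd": Xs.std(axis=0, ddof=1).tolist(),
        }
    return params

def apply_location_scale(X, site, params):
    """Standardize each scanner with stored estimates, then rescale to pooled targets."""
    grand = np.asarray(params["grand_mean"])
    pooled = np.asarray(params["pooled_sd"])
    X_adj = np.empty_like(X, dtype=float)
    for s in np.unique(site):
        key = str(s)
        if key not in params["sites"]:
            raise KeyError(f"scanner {key!r} was not seen during fitting")
        p = params["sites"][key]
        mask = site == s
        X_adj[mask] = (X[mask] - np.asarray(p["mean"])) / np.asarray(p["sd"]) * pooled + grand
    return X_adj

# Synthetic example: two scanners with opposite additive shifts.
rng = np.random.default_rng(0)
site_train = np.repeat(["A", "B"], 50)
site_test = np.repeat(["A", "B"], 20)
shift = lambda site: np.where(site[:, None] == "A", 2.0, -2.0)
X_train = rng.normal(size=(100, 4)) + shift(site_train)
X_test = rng.normal(size=(40, 4)) + shift(site_test)

params = fit_location_scale(X_train, site_train)

# Save the fitted parameters, then reload them as a later session would.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "harmonization_params.json")
    with open(path, "w") as f:
        json.dump(params, f)
    with open(path) as f:
        loaded = json.load(f)

X_train_adj = apply_location_scale(X_train, site_train, loaded)
X_test_adj = apply_location_scale(X_test, site_test, loaded)
```

Storing the estimates as plain JSON keeps them inspectable and independent of the fitting code. Whether re-applying training-set estimates is appropriate at all is exactly the concern raised above, so a check for residual scanner signal in the held-out set is worth running alongside it.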
Thanks for your explanation, @Jfortin1 and @Shotgunosine. I am trying to develop a new harmonization model and, for comparison purposes, I would like to split the data into training and test sets. @Shotgunosine, your solution seems helpful!
Hi @anlijuncn, you may find my confounds library and this example useful in achieving your objective: https://raamana.github.io/confounds/usage.html
Happy to chat with you if you think that would help.
Hi, I am wondering whether we could fit ComBat on a training set and then apply the fitted ComBat to a test set. If the code supports this, how could I save the fitted ComBat parameters?
Thanks for your help!