qidi-yang closed 2 years ago
Hi!
https://sidhomj.github.io/DeepTCR/api/#DeepTCR.DeepTCR.DeepTCR_WF.Monte_Carlo_CrossVal
Thanks a lot for the detailed reply! It is very helpful! A quick additional question: for hyperparameter tuning, did you do a grid search, or is there an automated hyperparameter tuning package you would recommend?
I haven't done extensive grid searches in the past. In general, these models tend to fit pretty well with minimal adjustment. To optimize performance, though, you usually have to think about the biological problem and let that guide the parameters. Running these models "out of the box" should usually tell you whether there is a signal or not. I designed DeepTCR not as a predictive tool but as a way to use predictive power to reveal biological insight. I don't think raising the AUC by a few points here or there is usually a worthwhile endeavor, because it rarely changes the biological conclusion or insight. That said, feel free to look at the code in this repository as well as in the DeepTCR COVID repository (https://github.com/sidhomj/DeepTCR_COVID19) to get a sense of how models can be trained.
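For anyone who still wants a small sweep after an out-of-the-box run, here is a minimal sketch of an exhaustive grid search in plain Python. It is not taken from DeepTCR: `grid_search`, `toy_score`, and the parameter names `kernel_width` and `network_size` are all hypothetical stand-ins. In practice `toy_score` would be replaced by a function that trains a model with the given settings and returns a cross-validated metric such as AUC.

```python
from itertools import product


def grid_search(score_fn, grid):
    """Score every combination in `grid` and return the best one.

    `grid` maps a parameter name to a list of candidate values.
    `score_fn` takes those parameters as keyword arguments and
    returns a number where higher is better.
    """
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((params, score_fn(**params)))
    best_params, best_score = max(results, key=lambda item: item[1])
    return best_params, best_score, results


# Hypothetical stand-in for "train a model and return validation AUC".
# Both parameter names are illustrative, not a real library signature.
def toy_score(kernel_width, network_size):
    size_penalty = {"small": 0.02, "medium": 0.0, "large": 0.01}
    return 1.0 - 0.01 * abs(kernel_width - 5) - size_penalty[network_size]


grid = {
    "kernel_width": [3, 5, 7],
    "network_size": ["small", "medium", "large"],
}

best_params, best_score, results = grid_search(toy_score, grid)
print(best_params, round(best_score, 3))
```

Keeping the grid this small fits the point above: if the best and worst combinations differ by only a few points of AUC, the biological conclusion is unlikely to change.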
Hi,
I have a few questions about using a GPU.
Also, I was wondering if you could provide some insight on training this algorithm on an imbalanced dataset (binary classification). Would you suggest using a balanced training dataset or including as much data as possible?
Thanks for your time and help!