tariks / peakachu

Genome-wide contact analysis using sklearn
MIT License

if k != chromname #6

Closed YvesXu closed 4 years ago

YvesXu commented 4 years ago

Hi, I found that Xtrain and Xfake do not include features from the current chromosome itself. Is this meant to show that a chromosome's loops can be predicted from the other chromosomes, or is there another reason?

"train_models.py", line 75: (v for k, v in positive_class.items() if k != chromname))

Thank you!

tariks commented 4 years ago

This is to prevent overfitting. When training and prediction are run on the same dataset, a separate model is trained for each chromosome, and each model excludes that chromosome's data during training. That way, no model ever predicts on the chromosome it was trained on.
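
For readers following along, here is a minimal sketch of that leave-one-chromosome-out idea in plain Python. It is not the actual code from train_models.py. The names positive_class and chromname come from the line quoted above, while negative_class, held_out_training_sets, and the toy feature rows are assumptions made up for illustration.

```python
from itertools import chain


def held_out_training_sets(positive_class, negative_class):
    """Yield (chromname, X, y) where X excludes rows from chromname itself.

    Both arguments are assumed to map chromosome name -> list of feature rows.
    """
    for chromname in positive_class:
        # Same filter as the quoted line: keep every chromosome except this one.
        pos = list(chain.from_iterable(
            v for k, v in positive_class.items() if k != chromname))
        neg = list(chain.from_iterable(
            v for k, v in negative_class.items() if k != chromname))
        X = pos + neg
        y = [1] * len(pos) + [0] * len(neg)
        yield chromname, X, y


# Toy data (hypothetical): one feature row per chromosome and class.
positive_class = {"chr1": [[1.0, 2.0]], "chr2": [[3.0, 4.0]], "chr3": [[5.0, 6.0]]}
negative_class = {"chr1": [[0.1, 0.2]], "chr2": [[0.3, 0.4]], "chr3": [[0.5, 0.6]]}

splits = {
    name: (X, y)
    for name, X, y in held_out_training_sets(positive_class, negative_class)
}
```

Each entry in splits would then be used to fit the model that predicts on the held-out chromosome, so the chr1 model, for example, only ever sees rows from chr2 and chr3.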