msultan / SML_CV

Using supervised machine learning to build collective variables for accelerated sampling
MIT License

Training the tICs instead of the features #6

Open sbhakat opened 5 years ago

sbhakat commented 5 years ago

I generated 4 tICs from the 4 dihedral features presented in the attached notebook. Now I want to train on the tICs to see which tIC dominates the others. To that end, I want to generate a plot similar to the SVM coefficients vs. feature index plot in this notebook: https://github.com/msultan/SML_CV/blob/master/alanine_example/01-svm_example.ipynb

First question:

I tried the following, modifying the original script slightly to train on the tICA features:


X=np.vstack(plot_feat)
train_X=np.vstack(tica_features)

y=np.concatenate([np.zeros(len(plot_feat[0])),
            np.ones(len(plot_feat[0]))])

if train:
    clf.fit(train_X,y)

train_X.sum(axis=1)[300:].std()

What is the meaning of the line train_X.sum(axis=1)[300:].std(), and of its output, 2.2081068901447987?

Then I plotted the SVM coefficients against the tIC index using the following script and got the output shown below:

plot([0,1,2,3],clf.coef_[0],color=sns.color_palette("colorblind")[0],marker='o',label=r'$\alpha_L$ vs $\beta$')
xticks([0,1,2,3],[r'tic0',r'tic1',r'tic2',r'tic3'],size=20)
xlabel("tICs Index")
ylabel(r'$SVM_{cv}$ Coefficients')
legend()

[Image: plot of SVM_cv coefficients against tICs index]

Simple question: How do I get the exact value of the SVM coefficient for each tIC? It is a bit hard to read from the plot. Is there a script for this?

msultan commented 5 years ago

Simple question: How do I get the exact value of the SVM coefficient for each tIC? It is a bit hard to read from the plot. Is there a script for this?

I am confused here. Have you tried printing out the coefficients instead of plotting them?
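For reference, a minimal sketch of printing the values rather than plotting them. It assumes clf is the fitted linear SVM from the script above, so that clf.coef_[0] holds one coefficient per tIC. The helper name coefficient_table and the numbers in example_coef are hypothetical placeholders, not results from the notebook.

```python
def coefficient_table(coef, labels):
    """Pair each label with its coefficient and return (label, value) rows."""
    coef = [float(c) for c in coef]  # also works for a NumPy array such as clf.coef_[0]
    if len(coef) != len(labels):
        raise ValueError("need exactly one label per coefficient")
    return list(zip(labels, coef))


# Placeholder numbers standing in for clf.coef_[0]; NOT values from the notebook.
example_coef = [0.5, -1.25, 0.0, 2.0]
tic_labels = ["tic0", "tic1", "tic2", "tic3"]

rows = coefficient_table(example_coef, tic_labels)
for label, value in rows:
    # Signed, fixed-precision output makes small coefficients easy to compare.
    print(f"{label}: {value:+.6f}")

# With the fitted model from the issue this would presumably be:
# rows = coefficient_table(clf.coef_[0], tic_labels)
```

Since the tICs may have different variances, comparing raw coefficient magnitudes is probably only meaningful if the inputs were scaled consistently before fitting.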

msultan commented 5 years ago

What is the meaning of the line train_X.sum(axis=1)[300:].std(), and of its output, 2.2081068901447987?

That line sums the features within each frame (axis=1), keeps the frames from index 300 onward, and takes the standard deviation of those per-frame sums. I am not sure I remember why exactly tbh.
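To make the three steps in train_X.sum(axis=1)[300:].std() concrete, here is a small sketch that spells out the same computation in plain Python on a toy array. The function name std_of_row_sums and the toy data are made up for illustration. NumPy's .std() uses the population formula by default (ddof=0), which corresponds to statistics.pstdev here.

```python
import statistics


def std_of_row_sums(rows, start):
    """Sum each row, keep rows from index `start` on, return the population std."""
    row_sums = [sum(row) for row in rows]  # like train_X.sum(axis=1): one value per frame
    kept = row_sums[start:]                # like [...][300:]: skip the first `start` frames
    return statistics.pstdev(kept)         # like .std() with NumPy's default ddof=0


# Toy data: 5 "frames" with 2 "features" each (made-up values).
toy = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]

result = std_of_row_sums(toy, start=2)
print(result)
```

The result is a single number describing how much the summed features vary from frame to frame over the kept frames. It says nothing about any individual feature.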