Actis92 / lit-saint

MIT License

Feature importance using SAINT #6

Open isamgul opened 2 years ago

isamgul commented 2 years ago

Hi, thank you for the implementation of the SAINT paper. I wanted to understand how I can generate a plot of the important features for the model using the self-attention in SAINT.

Actis92 commented 2 years ago

Hi @isamgul, I have released a new version that adds the ability to get feature importance. It works only with attention of type col or rowcol, because it uses the self-attention computed between different features. With row, instead, self-attention is computed between different samples. The example shows how to obtain feature importance using the predict method of the trainer.
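
To make the col versus row distinction concrete, here is a minimal NumPy sketch. It is not the lit-saint code: it assumes a single head, skips the learned query/key/value projections, and uses made-up sizes, just to show which axis the attention matrix spans in each case.

```python
import numpy as np

def softmax(scores, axis=-1):
    # Numerically stable softmax along the given axis.
    scores = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(scores)
    return exp / exp.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n_samples, n_features, dim = 4, 3, 8  # illustrative sizes, not from the repo
x = rng.normal(size=(n_samples, n_features, dim))  # embedded batch

# "col"-style attention: within each sample, features attend to features.
col_scores = x @ x.transpose(0, 2, 1) / np.sqrt(dim)
col_attn = softmax(col_scores)  # shape (n_samples, n_features, n_features)

# "row"-style attention: each sample is flattened and attends to other samples.
flat = x.reshape(n_samples, n_features * dim)
row_scores = flat @ flat.T / np.sqrt(n_features * dim)
row_attn = softmax(row_scores)  # shape (n_samples, n_samples)

print(col_attn.shape, row_attn.shape)
```

Only the first matrix has one axis per feature, which is why a per-feature importance can be read from col (or rowcol) attention but not from row attention alone.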

isamgul commented 2 years ago

Hi Actis92, thank you for adding the feature importance code to the repository. Could you please explain the logic you have implemented for getting feature importance from the self-attention layer in the SAINT algorithm?

Actis92 commented 2 years ago

I have noticed that the current implementation of feature importance is wrong; I will try to fix it in the next few days. The idea is to have an additional column that acts like a CLS token, meaning that at the end the embedding of this column is used as input to the final MLP network for prediction. In this way it's possible to get the attention weights between all the features and the CLS token that is used for the prediction.
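
A rough sketch of that idea in NumPy, under stated assumptions: one attention layer, one head, random projection matrices, and the CLS token at position 0. The function name `cls_feature_importance` and the weights `w_q`, `w_k` are hypothetical and only illustrate the mechanism; they are not the names or the exact computation used in the repository.

```python
import numpy as np

def softmax(scores, axis=-1):
    # Numerically stable softmax along the given axis.
    scores = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(scores)
    return exp / exp.sum(axis=axis, keepdims=True)

def cls_feature_importance(embeddings, w_q, w_k):
    """Average attention from the CLS token to each feature.

    embeddings: (n_samples, 1 + n_features, dim), position 0 is the CLS token.
    w_q, w_k:   (dim, dim_k) query and key projections (hypothetical names).
    """
    dim_k = w_k.shape[1]
    q_cls = embeddings[:, :1, :] @ w_q   # (n_samples, 1, dim_k)
    keys = embeddings @ w_k              # (n_samples, 1 + n_features, dim_k)
    scores = q_cls @ keys.transpose(0, 2, 1) / np.sqrt(dim_k)
    attn = softmax(scores, axis=-1)[:, 0, :]  # (n_samples, 1 + n_features)
    feat = attn[:, 1:]                         # drop the CLS-to-CLS weight
    feat = feat / feat.sum(axis=1, keepdims=True)  # renormalise per sample
    return feat.mean(axis=0)                   # one score per feature

rng = np.random.default_rng(0)
n_samples, n_features, dim = 16, 3, 8  # illustrative sizes
embeddings = rng.normal(size=(n_samples, 1 + n_features, dim))
w_q = rng.normal(size=(dim, dim))
w_k = rng.normal(size=(dim, dim))

importance = cls_feature_importance(embeddings, w_q, w_k)
print(importance)
```

With a trained model the weights would come from the network rather than a random generator, and a multi-head, multi-layer model would need some choice of how to combine heads and layers, which this sketch leaves out.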

isamgul commented 2 years ago

Thanks @Actis92 for the heads up. I will wait for your update to try it on my use-case. Thanks again for your help on this.

isamgul commented 2 years ago

Hi @Actis92, I tried running the latest example you shared in the repository, but I am running into this error: ValueError: Shape of passed values is (XX, YY), indices imply (XX, YY - 1). I also tried reducing the size of the test dataset, but I get the same error. I am aware you are working on fixing this. Any idea when you might have a solution? Thanks again for your help on this issue.

Actis92 commented 2 years ago

Hi @isamgul, I can work on it only during the weekends, so I hope to have it ready by Monday. In any case, I don't understand where you get the error, because I have tried the example locally and it seemed to work. Maybe you can give me more details.

isamgul commented 2 years ago

Hi @Actis92,

The issue got fixed. It was happening because my dataset did not have any categorical features, which was causing the issue in the feature importance calculation function. I will look forward to your feature importance code fix by Monday next week! Thanks again for your help and prompt responses!
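
For anyone else who hits this, an error of the same form can be reproduced in plain pandas when a 2-D array has one more column than the list of column names. This is only a generic illustration; that the same mismatch happened inside the feature importance function is my assumption, and the array and names below are made up.

```python
import numpy as np
import pandas as pd

values = np.zeros((5, 4))    # 4 columns of values
names = ["f1", "f2", "f3"]   # but only 3 column names

try:
    pd.DataFrame(values, columns=names)
    message = None
except ValueError as err:
    # pandas reports the shape of the values and the shape implied by the labels
    message = str(err)

print(message)
```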

Actis92 commented 2 years ago

Hi @isamgul, I have pushed a new version in which I have added the additional column, which is used to get the attention of each feature with respect to this column.

isamgul commented 2 years ago

Hi @Actis92,

Thank you for this update, I'll try it out on my dataset. Could you please explain how the latest feature importance code works and the math behind it?