Closed: CesarLeblanc closed this issue 1 year ago
@CesarLeblanc Yes certainly, please feel free to open a PR
@Optimox Is it fine if I open a PR for both this Issue (#493) and the one I made two days ago about working with sparse data (#492)?
@CesarLeblanc please open two separate PRs otherwise things are going to get messy
@Optimox you are right. I'm fixing this issue first (see PR #494), and once it is merged I will open a new PR to allow working with sparse data.
closed by #494
Feature request
Feature request

After talking with colleagues who also use TabNet, and after reading previous issues on this topic, it seems that the computation of feature importance is not always needed. For example, when fine-tuning a model, hundreds of hyper-parameter combinations may be tried; feature importance is not needed for all of those models, only for the best model once it has been identified. Moreover, computing feature importance is very time-consuming (sometimes longer than training the neural network itself, especially when the input data is high-dimensional).

Therefore, I propose adding a compute_importance parameter to the fit method of the TabModel class. This boolean parameter would let users enable or disable the computation of feature importance during training. Since the interpretability aspect of this model is very interesting, I believe this parameter should default to True.
What is the expected behavior?
When the compute_importance parameter is set to True, the fit method will compute feature importances as part of the call, which matches the current behavior of the fit method.

When the compute_importance parameter is set to False, the fit method will skip the computation of feature importance entirely. This will significantly reduce the total fitting time, especially in scenarios where the feature importance is not immediately required.
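To illustrate the intended usage pattern, here is a hedged sketch of a hyper-parameter search that skips the importance computation for every candidate and computes it once for the winner. DummyModel is a stand-in written for this example, not the real TabNet API; only the effect of the proposed compute_importance flag is shown.

```python
# Hypothetical sketch: DummyModel mimics the proposed fit(compute_importance=...)
# contract; its training logic and score are toy placeholders.
class DummyModel:
    def __init__(self, **params):
        self.params = params
        self.feature_importances_ = None

    def fit(self, X, y, compute_importance=True):
        # Actual training is elided; only the flag's effect is shown.
        if compute_importance:
            self.feature_importances_ = [1.0] * len(X[0])
        return self

    def score(self, X, y):
        # Toy score so the search loop below is runnable.
        return -abs(self.params["lr"] - 0.02)

X, y = [[0.0, 1.0], [1.0, 0.0]], [0, 1]
# Search phase: skip the expensive importance computation for every candidate...
candidates = [DummyModel(lr=lr).fit(X, y, compute_importance=False)
              for lr in (0.01, 0.02, 0.05)]
best = max(candidates, key=lambda m: m.score(X, y))
# ...then compute importances once, for the best model only.
best.fit(X, y, compute_importance=True)
```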
What is the motivation or use case for adding/changing the behavior?
The motivation behind adding the compute_importance parameter is to give users more flexibility and control over the training process of TabNet. Currently, the computation of feature importance is performed at the end of every call to fit, which can be unnecessary and time-consuming in certain scenarios.
By allowing users to disable the computation of feature importance during training, they can save significant computation time when performing hyper-parameter search or training multiple models. This is particularly valuable in situations where the feature importance is only needed once the best model has been determined.
How should this be implemented in your opinion?
To implement this feature, add a boolean compute_importance parameter (defaulting to True) to the signature of the fit method, and wrap the existing feature-importance computation in a conditional so that it only runs when the flag is set.
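The change described above can be sketched as follows. This is a minimal, hypothetical stand-in, not the actual pytorch-tabnet code: the real TabModel.fit has many more parameters, and the names _train_epochs and _compute_feature_importances are placeholders here for the existing training loop and importance computation.

```python
class TabModelSketch:
    """Minimal stand-in for TabModel, showing only where the proposed
    compute_importance flag would plug into fit."""

    def __init__(self):
        self.feature_importances_ = None

    def _train_epochs(self, X_train, y_train):
        # Placeholder for the existing (unchanged) training loop.
        pass

    def _compute_feature_importances(self, X):
        # Placeholder for the existing, expensive importance computation;
        # here it just returns uniform importances over the features.
        n_features = len(X[0])
        return [1.0 / n_features] * n_features

    def fit(self, X_train, y_train, compute_importance=True):
        # ...existing training logic is unchanged...
        self._train_epochs(X_train, y_train)
        # New: only compute importances when the caller asks for them.
        if compute_importance:
            self.feature_importances_ = self._compute_feature_importances(X_train)
        return self
```

With compute_importance=False the model trains exactly as before but leaves feature_importances_ unset, which is what saves the extra time during hyper-parameter search.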
Are you willing to work on this yourself? Yes, I am willing to work on implementing this feature.