dreamquark-ai / tabnet

PyTorch implementation of the TabNet paper: https://arxiv.org/pdf/1908.07442.pdf
https://dreamquark-ai.github.io/tabnet/
MIT License

Need Help Regarding TabNetEncoder output and TabNetPretraining #446

Closed Hazqeel09 closed 1 year ago

Hazqeel09 commented 1 year ago

Hi Optimox. As you answered previously, I can use TabNetEncoder to produce custom-sized embeddings. Upon testing the layer, I realised it produces three outputs, and I don't know which one is more important. [screenshot]

The paper that I'm trying to replicate also used unsupervised pretraining. The TabNetPretraining layer also produces three different outputs. I checked the outputs' meanings on GitHub, but I still don't know which one is important and how to pass it on to TabNetEncoder.

The paper that I'm trying to duplicate is here: https://ieeexplore.ieee.org/document/9658729. Basically, these are the methods I need to reproduce from the paper: [screenshot]

I already managed the transformer part, but I'm still stuck on the TabNet part.

Lastly, what is the difference between forward and forward_masks?

Thank you.

Optimox commented 1 year ago

Hello,

The encoder outputs two things: the outputs of the different steps, and an auxiliary value used for computing the loss: https://github.com/dreamquark-ai/tabnet/blob/bcae5f43b89fb2c53a0fe8be7c218a7b91afac96/pytorch_tabnet/tab_network.py#L188
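As a minimal sketch of calling the encoder directly (the dimensions are arbitrary, and summing the step outputs mirrors how the library aggregates them internally; check the constructor signature of the version you have installed):

```python
import torch
from pytorch_tabnet.tab_network import TabNetEncoder

# Arbitrary toy dimensions for illustration
encoder = TabNetEncoder(input_dim=16, output_dim=16, n_d=8, n_a=8, n_steps=3)
x = torch.randn(32, 16)

# forward returns (steps_output, M_loss):
#   steps_output: list of n_steps tensors of shape (batch_size, n_d)
#   M_loss: scalar sparsity term used as a loss regularizer
steps_output, M_loss = encoder(x)

# Summing the per-step outputs gives the usual TabNet representation
embedding = torch.sum(torch.stack(steps_output, dim=0), dim=0)  # (32, 8)
```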

You can see here how the auxiliary value is used in the loss: https://github.com/dreamquark-ai/tabnet/blob/bcae5f43b89fb2c53a0fe8be7c218a7b91afac96/pytorch_tabnet/abstract_model.py#L509
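In sketch form (training_step and its arguments are hypothetical names; the subtraction scaled by lambda_sparse mirrors the linked lines in abstract_model.py):

```python
def training_step(model, loss_fn, optimizer, X, y, lambda_sparse=1e-3):
    # The model returns the prediction plus the auxiliary M_loss;
    # the sparsity term is subtracted, scaled by lambda_sparse.
    output, M_loss = model(X)
    loss = loss_fn(output, y) - lambda_sparse * M_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```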

forward_masks simply outputs the masks instead of the loss auxiliary, so it can be used at inference time to get feature importances; see the explain method here: https://github.com/dreamquark-ai/tabnet/blob/bcae5f43b89fb2c53a0fe8be7c218a7b91afac96/pytorch_tabnet/abstract_model.py#L303
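Continuing the encoder sketch above, a hedged example of calling it directly:

```python
# forward_masks returns the aggregated explanation matrix plus the
# per-step attention masks (a dict keyed by step index).
encoder.eval()
with torch.no_grad():
    M_explain, masks = encoder.forward_masks(x)
# M_explain: (batch_size, input_dim); masks[i]: (batch_size, input_dim)
```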

What you can do:

Hazqeel09 commented 1 year ago

[screenshot of my model code]

I think I managed to get the model to learn and produce output, but can you help check whether I'm doing it correctly? Thank you.

Optimox commented 1 year ago

It does not make sense to load a state dict at each forward step; the first two lines should probably be inside the init method.
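Something like this (TabNetBranch and the weights path are hypothetical names; the point is that load_state_dict runs once, at construction time):

```python
import torch
import torch.nn as nn
from pytorch_tabnet.tab_network import TabNetEncoder

class TabNetBranch(nn.Module):
    def __init__(self, input_dim, weights_path):
        super().__init__()
        self.encoder = TabNetEncoder(input_dim=input_dim, output_dim=input_dim)
        # Load the pretrained weights once here, not in forward()
        self.encoder.load_state_dict(torch.load(weights_path))

    def forward(self, x):
        steps_output, _ = self.encoder(x)
        return torch.sum(torch.stack(steps_output, dim=0), dim=0)
```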

I don't think it's a good idea to reshape the output of TabNet's encoder into an 8x8 image. What kind of model is your NetRelu?

You should probably use flattened representations of both models and concatenate them before passing them to a simple Linear head.
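For instance (FusionHead is a hypothetical name; the dimensions depend on your two branches):

```python
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    def __init__(self, tabnet_dim, bert_dim, n_classes):
        super().__init__()
        self.head = nn.Linear(tabnet_dim + bert_dim, n_classes)

    def forward(self, tabnet_emb, bert_emb):
        # Flatten both representations and concatenate along the
        # feature dimension before the single Linear classifier
        fused = torch.cat([tabnet_emb.flatten(1), bert_emb.flatten(1)], dim=1)
        return self.head(fused)
```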

Hazqeel09 commented 1 year ago

I see your point, I will move that code into the init method.

I reshape the output because the architecture I'm trying to copy reshapes the TabNet output to 1x8x8 and the BERT sentence embedding to 12x8x8.

This is my NetRelu, which also imitates the paper: [screenshot]