dreamquark-ai / tabnet

PyTorch implementation of the TabNet paper: https://arxiv.org/pdf/1908.07442.pdf
https://dreamquark-ai.github.io/tabnet/
MIT License

Need Help Regarding TabNetEncoder output and TabNetPretraining #446

Closed Hazqeel09 closed 1 year ago

Hazqeel09 commented 1 year ago

Hi Optimox. As you answered previously, I can use TabNetEncoder to produce custom-sized embeddings. Upon testing the layer, I realised it produces three outputs, and I don't know which one is more important. [screenshot]

The paper that I'm trying to replicate also used unsupervised pretraining. The TabNetPretraining layer also produces three different outputs. I checked the outputs' meanings on GitHub, but I still don't know which one is important and how to pass it on to TabNetEncoder.

The paper that I'm trying to duplicate is here: https://ieeexplore.ieee.org/document/9658729. Basically, these are the methods I need to reproduce from the paper: [screenshot]

I already managed the transformer part, but I'm still stuck on the TabNet part.

Lastly, what is the difference between forward and forward_masks?

Thank you.

Optimox commented 1 year ago

Hello,

The encoder outputs two things: the outputs of the different steps, and an auxiliary value used for computing the loss: https://github.com/dreamquark-ai/tabnet/blob/bcae5f43b89fb2c53a0fe8be7c218a7b91afac96/pytorch_tabnet/tab_network.py#L188
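As a minimal sketch of calling the encoder directly (the dimensions are arbitrary, and summing the step outputs mirrors how the library aggregates them internally; check the constructor signature of the version you have installed):

```python
import torch
from pytorch_tabnet.tab_network import TabNetEncoder

# Arbitrary toy dimensions for illustration
encoder = TabNetEncoder(input_dim=16, output_dim=16, n_d=8, n_a=8, n_steps=3)
x = torch.randn(32, 16)

# forward returns (steps_output, M_loss):
#   steps_output: list of n_steps tensors of shape (batch_size, n_d)
#   M_loss: scalar sparsity term used as a loss regularizer
steps_output, M_loss = encoder(x)

# Summing the per-step outputs gives the usual TabNet representation
embedding = torch.sum(torch.stack(steps_output, dim=0), dim=0)  # (32, 8)
```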

You can see here how the auxiliary value is used in the loss: https://github.com/dreamquark-ai/tabnet/blob/bcae5f43b89fb2c53a0fe8be7c218a7b91afac96/pytorch_tabnet/abstract_model.py#L509
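In sketch form (training_step and its arguments are hypothetical names; the subtraction scaled by lambda_sparse mirrors the linked lines in abstract_model.py):

```python
def training_step(model, loss_fn, optimizer, X, y, lambda_sparse=1e-3):
    # The model returns the prediction plus the auxiliary M_loss;
    # the sparsity term is subtracted, scaled by lambda_sparse.
    output, M_loss = model(X)
    loss = loss_fn(output, y) - lambda_sparse * M_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```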

forward_masks simply outputs the masks instead of the loss auxiliary, so it can be used at inference time to get feature importances; see the explain method here: https://github.com/dreamquark-ai/tabnet/blob/bcae5f43b89fb2c53a0fe8be7c218a7b91afac96/pytorch_tabnet/abstract_model.py#L303
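Continuing the encoder sketch above, a hedged example of calling it directly:

```python
# forward_masks returns the aggregated explanation matrix plus the
# per-step attention masks (a dict keyed by step index).
encoder.eval()
with torch.no_grad():
    M_explain, masks = encoder.forward_masks(x)
# M_explain: (batch_size, input_dim); masks[i]: (batch_size, input_dim)
```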

What you can do:

Hazqeel09 commented 1 year ago

[screenshot of my model code]

I think I managed to get the model to learn and produce output, but can you help check whether I'm doing it correctly? Thank you.

Optimox commented 1 year ago

It does not make sense to load a state dict at each forward step; the first two lines should probably be inside the init method.
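Something like this (TabNetBranch and the weights path are hypothetical names; the point is that load_state_dict runs once, at construction time):

```python
import torch
import torch.nn as nn
from pytorch_tabnet.tab_network import TabNetEncoder

class TabNetBranch(nn.Module):
    def __init__(self, input_dim, weights_path):
        super().__init__()
        self.encoder = TabNetEncoder(input_dim=input_dim, output_dim=input_dim)
        # Load the pretrained weights once here, not in forward()
        self.encoder.load_state_dict(torch.load(weights_path))

    def forward(self, x):
        steps_output, _ = self.encoder(x)
        return torch.sum(torch.stack(steps_output, dim=0), dim=0)
```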

I don't think it's a good idea to reshape the output of TabNet's encoder into an 8x8 image. What kind of model is your NetRelu?

You should probably use flattened representations of both models and concatenate them before passing them to a simple Linear head.
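For instance (FusionHead is a hypothetical name; the dimensions depend on your two branches):

```python
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    def __init__(self, tabnet_dim, bert_dim, n_classes):
        super().__init__()
        self.head = nn.Linear(tabnet_dim + bert_dim, n_classes)

    def forward(self, tabnet_emb, bert_emb):
        # Flatten both representations and concatenate along the
        # feature dimension before the single Linear classifier
        fused = torch.cat([tabnet_emb.flatten(1), bert_emb.flatten(1)], dim=1)
        return self.head(fused)
```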

Hazqeel09 commented 1 year ago

I see your point, I will move that code into the init method.

I reshape the output because the architecture I'm trying to copy reshapes the TabNet output to 1x8x8 and the BERT sentence embedding to 12x8x8.

This is my NetRelu, which also imitates the paper: [screenshot]