tatp22 / linformer-pytorch

My take on a practical implementation of Linformer for Pytorch.
https://arxiv.org/pdf/2006.04768.pdf
MIT License
400 stars, 36 forks

Would you like to release the pretrain tutorial? #4

Closed: RyanHuangNLP closed this 3 years ago

RyanHuangNLP commented 4 years ago

Do you have any plans to release a pretraining pipeline for Linformer?

tatp22 commented 4 years ago

Sure. I can show a dummy example of how to pretrain the linformer. However, I don't know what data to pretrain it on, so I will just add something with dummy data in the examples folder. Take a look later this week, and you can simply change the data with your own in order to pretrain to whatever you'd like.

This shouldn't be different than training any other pytorch model, I think.
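As a rough illustration of that point, here is a minimal sketch of an ordinary PyTorch training loop on dummy data. The model is a stand-in `nn.Sequential`, not the Linformer class from this repo, and the sizes, optimizer, and loss are assumptions chosen only to keep the example small. Any `nn.Module` that maps a `(batch, seq_len, channels)` tensor to the same shape could be dropped in instead.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
seq_len, channels, batch_size = 32, 16, 8

# Stand-in model (assumption): replace with any nn.Module that keeps
# the (batch, seq_len, channels) shape, such as a Linformer instance.
model = nn.Sequential(nn.Linear(channels, 64), nn.ReLU(), nn.Linear(64, channels))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

# Dummy data: one fixed batch of random inputs and random targets.
x = torch.randn(batch_size, seq_len, channels)
y = torch.randn(batch_size, seq_len, channels)

losses = []
for step in range(50):
    optimizer.zero_grad()        # clear gradients from the previous step
    loss = loss_fn(model(x), y)  # forward pass and loss
    loss.backward()              # backpropagate
    optimizer.step()             # update parameters
    losses.append(loss.item())
```

To pretrain on real data, the fixed `x` and `y` tensors would be replaced by batches drawn from a `DataLoader`, and the loss by whatever objective fits the task.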

tatp22 commented 4 years ago

Hey, I added the pretraining tutorial over here: https://github.com/tatp22/linformer-pytorch/blob/master/examples/pretrain_tutorial.py

The only thing is, I'm not sure exactly what data you would like to put in here, so for now I just used dummy data as a proof of concept. It still shows one way to pretrain the linformer once you plug in data of your own.

To run it, install all of the deps and run: python pretrain_tutorial.py

phongnhhn92 commented 4 years ago

Hi, I think you should use the datasets at this link to test your model. They are quite small and easy to train on. https://pytorch.org/tutorials/beginner/transformer_tutorial.html

tatp22 commented 4 years ago

I feel that for these language models, we should have another wrapper module, similar to the one here: https://github.com/lucidrains/sinkhorn-transformer/blob/73da02958965e1a690cb301292c0a3c549687d44/sinkhorn_transformer/sinkhorn_transformer.py#L771

This wrapper module takes in one of these inputs and does some preprocessing before feeding it into a linformer. I say that because I took a look at the inputs, and they are ints (token ids) rather than "nice" values close to 0. A class like this one can deal with inputs like these.
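A minimal sketch of what such a wrapper could look like, assuming the preprocessing is a learned token embedding plus a learned positional embedding. The class name `TokenWrapper`, its arguments, and the `nn.Identity` inner model are hypothetical and are not taken from this repo or from sinkhorn-transformer.

```python
import torch
from torch import nn


class TokenWrapper(nn.Module):
    """Hypothetical wrapper: embeds integer token ids, runs an inner
    sequence model on the embeddings, and projects to vocabulary logits."""

    def __init__(self, num_tokens, dim, max_seq_len, inner):
        super().__init__()
        self.token_emb = nn.Embedding(num_tokens, dim)
        self.pos_emb = nn.Embedding(max_seq_len, dim)
        self.inner = inner  # any module mapping (batch, seq, dim) -> (batch, seq, dim)
        self.to_logits = nn.Linear(dim, num_tokens)

    def forward(self, token_ids):
        # token_ids: (batch, seq) tensor of integer ids
        positions = torch.arange(token_ids.shape[1], device=token_ids.device)
        x = self.token_emb(token_ids) + self.pos_emb(positions)  # small float values
        x = self.inner(x)
        return self.to_logits(x)  # (batch, seq, num_tokens)


# Stand-in inner model (assumption); a linformer would go here instead.
model = TokenWrapper(num_tokens=100, dim=16, max_seq_len=32, inner=nn.Identity())
tokens = torch.randint(0, 100, (4, 32))
logits = model(tokens)
```

The embedding layers are what turn the raw integer ids into float vectors that the inner model can work with.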

tatp22 commented 4 years ago

Hey, I started work on a LinformerLM class. Right now, only the class is implemented, but I will have a tutorial on how to run the validation task on it in a bit.

https://github.com/tatp22/linformer-pytorch/blob/05a6d0fecabbef8160d9165b187db0253fa48d90/linformer_pytorch/linformer_pytorch.py#L235

tatp22 commented 4 years ago

Here's a pretraining tutorial: https://github.com/tatp22/linformer-pytorch/blob/master/examples/pretrain_tutorial_lm.py @RyanHuangNLP @phongnhhn92

I didn't finish training it, but it should be the complete training pipeline for an lm. Let me know what you think!
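For readers unfamiliar with the objective, the core of a language-model training step can be sketched as next-token prediction with cross-entropy. This is a generic illustration under assumed sizes, with a stand-in model in place of LinformerLM, and it does not reproduce the linked tutorial.

```python
import torch
from torch import nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
num_tokens, dim, seq_len = 50, 16, 12

# Stand-in language model (assumption): embedding followed by a linear head.
# It works position by position, so it cannot peek at future tokens.
lm = nn.Sequential(nn.Embedding(num_tokens, dim), nn.Linear(dim, num_tokens))

# Dummy corpus: random token ids, with one extra position so that
# inputs and targets can be offset by one token.
batch = torch.randint(0, num_tokens, (4, seq_len + 1))
inputs, targets = batch[:, :-1], batch[:, 1:]

logits = lm(inputs)  # (batch, seq_len, num_tokens)

# cross_entropy expects (N, classes) logits and (N,) integer targets.
loss = F.cross_entropy(logits.reshape(-1, num_tokens), targets.reshape(-1))
loss.backward()
```

A real attention-based model would also need to prevent each position from attending to later tokens during training.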

RyanHuangNLP commented 4 years ago

@tatp22 Great! I will give it a try~

phongnhhn92 commented 4 years ago

Hey @tatp22, I saw that you just fixed a bug in your implementation. Did you try training it with the new version? I was about to train but hesitated when I saw there was a bug.

tatp22 commented 4 years ago

Yes, I trained it with the new version, and it seems fixed to the best of my knowledge :+1:

twangnh commented 4 years ago

Hi @tatp22, just a quick question: have you verified Linformer's performance on any benchmark?

tatp22 commented 4 years ago

See #13