codertimo / BERT-pytorch

Google AI 2018 BERT pytorch implementation
Apache License 2.0

Pretrained model transfer to pytorch #34

Open codertimo opened 5 years ago

codertimo commented 5 years ago

As you all know, it's nearly impossible to train BERT from scratch because of the computational power required. So I'm going to implement transfer code so that Google's pretrained model can be used with this PyTorch implementation too.

This work will start once Google releases their official BERT code and pretrained models. If anyone is interested in joining this effort, please leave a comment below.

Thank you to everyone who is following this project closely 👍 By Junseong Kim

codertimo commented 5 years ago

This issue was split out from #3

threefoldo commented 5 years ago

I would like to join, even though I'm not sure how much I can do.

The training procedure of the current implementation runs smoothly. I finished training on 10K sentence pairs within 30 minutes; the final loss was 7.73.

briandw commented 5 years ago

Google has released the source code and pre-trained models: https://github.com/google-research/bert

Although they note that a TPU is needed for the larger model: "Includes scripts to reproduce results. BERT-Base can be fine-tuned on a standard GPU; for BERT-Large, a Cloud TPU is required (as max batch size for 12-16 GB is too small)."

ZhaoyueCheng commented 5 years ago

I believe fine-tuning can be done on a multi-GPU system by accumulating gradients in PyTorch.
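Gradient accumulation works in PyTorch because `.grad` buffers sum across `backward()` calls until they are cleared, so several small micro-batches can stand in for one large batch that would not fit in 12-16 GB of GPU memory. A minimal sketch of the idea (the toy linear model and names like `accum_steps` are illustrative stand-ins, not part of this repo):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for a BERT fine-tuning head; the accumulation pattern is
# identical for the real model.
model = nn.Linear(16, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
w0 = model.weight.detach().clone()  # saved only to verify training updated weights

accum_steps = 4  # 4 micro-batches of 8 -> effective batch size 32
micro_batches = [(torch.randn(8, 16), torch.randint(0, 2, (8,)))
                 for _ in range(8)]

optimizer.zero_grad()
for step, (x, y) in enumerate(micro_batches, start=1):
    # Scale the loss so the accumulated gradient equals the gradient of
    # the mean loss over the full effective batch.
    loss = loss_fn(model(x), y) / accum_steps
    loss.backward()  # gradients accumulate in p.grad across iterations
    if step % accum_steps == 0:
        optimizer.step()      # one update per accum_steps micro-batches
        optimizer.zero_grad()
```

The same loop applies per-GPU under `DataParallel`/`DistributedDataParallel`; only the micro-batch size changes.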

briandw commented 5 years ago

Apparently I didn't get this done quickly enough. Here is the HuggingFace team's PyTorch port of the pre-trained model: https://github.com/huggingface/pytorch-pretrained-BERT

ChawDoe commented 4 years ago

Has this issue been solved? Please let me know. I want to use your implementation together with the pretrained model to try out my ideas.

briandw commented 4 years ago

@ChawDoe At this point you should probably look at DistilBERT, a faster version of BERT from HuggingFace: https://medium.com/huggingface/distilbert-8cf3380435b5

ChawDoe commented 4 years ago

@briandw Thank you.