kamalkraj / ALBERT-TF2.0

ALBERT model Pretraining and Fine Tuning using TF2.0
Apache License 2.0

Weights #2

Closed peregilk closed 4 years ago

peregilk commented 4 years ago

Could you give some more info about the weights linked here? Are they trained on an English corpus only, as in the original paper?

You write that the last layers are not available. That would probably mean the weights cannot be used for additional domain-specific pre-training, right? What would be required to do this?

kamalkraj commented 4 years ago

Hi @peregilk, the weights in this repo are converted from the original checkpoints Google released, trained on the English corpus only.

Currently, this repo only supports fine-tuning on downstream tasks. It reproduces the same results as reported in Google's original repo; this has been tested.
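
For context, "fine-tuning a downstream task" here just means putting a small task-specific head on top of the pretrained encoder and training everything end-to-end on labeled data. A minimal TF2 sketch of the idea (the encoder below is a stand-in and all names are placeholders, not this repo's actual code):

```python
import tensorflow as tf

def build_stub_encoder(vocab_size: int = 30000, hidden_size: int = 128) -> tf.keras.Model:
    """Stand-in for the pretrained ALBERT encoder (pooled output only)."""
    token_ids = tf.keras.Input(shape=(128,), dtype=tf.int32, name="token_ids")
    embeddings = tf.keras.layers.Embedding(vocab_size, hidden_size)(token_ids)
    pooled = tf.keras.layers.GlobalAveragePooling1D()(embeddings)
    return tf.keras.Model(token_ids, pooled, name="stub_encoder")

encoder = build_stub_encoder()

# Downstream fine-tuning: attach a task-specific head (e.g. binary sentence
# classification) and train encoder + head end-to-end on labeled examples.
logits = tf.keras.layers.Dense(2, name="classifier")(encoder.output)
classifier = tf.keras.Model(encoder.input, logits)
classifier.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
```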

Fine-tuning the model on domain-specific data is not supported yet; it will be added soon.

Contributions are welcome.

kamalkraj commented 4 years ago

@peregilk Pretraining and fine-tuning on domain-specific data is now supported: https://github.com/kamalkraj/ALBERT-TF2.0/blob/master/pretraining.md

peregilk commented 4 years ago

Awesome. Really looking forward to testing this. Thanks a lot.

If anyone else is reading the post, I think the correct link should be: https://github.com/kamalkraj/ALBERT-TF2.0/blob/master/pretraining.md

I am a bit confused about the terminology here, and I might be completely wrong about this. However, in the original BERT paper it seems they call only the supervised part "fine-tuning", and refer to this as "additional domain-specific pre-training".

kamalkraj commented 4 years ago

You can fine-tune the pre-trained MLM and SOP model on a domain such as medical text, or you can pre-train from scratch on domain-specific data. An example of a BERT pre-trained model fine-tuned this way is https://github.com/dmis-lab/biobert
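
In code terms, the only difference between the two options is whether you restore the released checkpoint before continuing to optimize the MLM/SOP losses on the domain corpus. A rough TF2 sketch (the model and paths below are placeholders, not this repo's actual ALBERT implementation or checkpoint layout):

```python
import tensorflow as tf

def build_stub_albert(vocab_size: int = 30000, hidden_size: int = 128) -> tf.keras.Model:
    """Stand-in for the ALBERT encoder with its MLM/SOP pretraining heads."""
    token_ids = tf.keras.Input(shape=(None,), dtype=tf.int32, name="token_ids")
    hidden = tf.keras.layers.Embedding(vocab_size, hidden_size)(token_ids)
    hidden = tf.keras.layers.Dense(hidden_size, activation="relu")(hidden)
    return tf.keras.Model(token_ids, hidden, name="stub_albert")

model = build_stub_albert()
checkpoint = tf.train.Checkpoint(model=model)

# Option 1: continued (domain-specific) pretraining -- restore the released
# English weights first, then keep optimizing the MLM/SOP losses on the domain
# corpus (e.g. medical text, as BioBERT does for BERT).
latest = tf.train.latest_checkpoint("albert_base")  # placeholder directory
if latest:
    checkpoint.restore(latest).expect_partial()

# Option 2: pretraining from scratch -- skip the restore and train the randomly
# initialized model on the domain corpus alone; this needs far more data and
# compute, but lets the vocabulary/sentencepiece model be domain-specific too.
```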

peregilk commented 4 years ago

It was just a question regarding the terminology.

I just noticed that the BERT page says "If your task has a large domain-specific corpus available (e.g., "movie reviews" or "scientific papers"), it will likely be beneficial to run additional steps of pre-training on your corpus, starting from the BERT checkpoint."

I know the point of doing additional MLM/SOP training on a domain-specific corpus really is to "fine-tune" the weights trained on the general corpus. I guess the reason they are not calling it "fine-tuning" is that they reserve that term for the task-specific supervised training.