JuliaText / TextAnalysis.jl

Julia package for text analysis

[WIP] Albert #203

Closed tejasvaidhyadev closed 11 months ago

tejasvaidhyadev commented 4 years ago

Hi everyone, I am adding ALBERT [WIP]. Currently, only raw code is included in this PR. Dependencies: Transformers.jl, WordTokenizers.jl

I am not exporting any functions yet; I am still deciding on the best way to use it. However, I am adding some important code used for converting pretrained checkpoints, along with the demo file below.

Roadmap

Important links

Pretrained weights: link

For details, refer to this link

Demo: link

PS: All suggestions are welcome.

tejasvaidhyadev commented 4 years ago

Sorry for closing the PR earlier. The git commit history is now updated.

News

Updated Demo

tejasvaidhyadev commented 4 years ago

Pretrained weights

Version 2 of the converted ALBERT BSON weights is released. It does not contain the 30k-clean.model file (produced by SentencePiece).
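For anyone wanting to poke at the released BSON checkpoint, here is a minimal sketch using BSON.jl's round-trip API. The key names and file name below are illustrative assumptions, not the actual layout of the release artifact:

```julia
using BSON

# Round-trip sketch: save a toy "checkpoint" and load it back.
# Real ALBERT weights would be larger arrays under similar keys;
# :embedding and :layer_norm_bias are hypothetical key names.
toy = Dict(:embedding => rand(4, 3), :layer_norm_bias => zeros(3))
BSON.bson("toy_albert.bson", toy)

weights = BSON.load("toy_albert.bson")
size(weights[:embedding])  # -> (4, 3)
```

`BSON.load` returns a `Dict` keyed by symbols, so iterating `keys(weights)` is an easy way to inspect which tensors a converted checkpoint actually contains.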

tejasvaidhyadev commented 4 years ago

@aviks, any suggestions on the roadmap mentioned above? I am also thinking of adding a Tutorials folder (containing .ipynb tutorials).

tejasvaidhyadev commented 4 years ago

Added SentencePiece unigram support.
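At its core, SentencePiece's unigram model segments a word into the highest-scoring sequence of vocabulary pieces via dynamic programming. The sketch below shows that idea in plain Julia; the vocabulary and log-probabilities are toy values for illustration, not the real 30k ALBERT vocabulary, and this is not the implementation in the PR:

```julia
# Unigram segmentation sketch: given a vocabulary of pieces with
# log-probabilities, find the best-scoring tokenization of a word.
function unigram_segment(word::String, logprobs::Dict{String,Float64})
    n = length(word)
    best = fill(-Inf, n + 1)   # best[i+1] = best score for prefix word[1:i]
    best[1] = 0.0
    back = zeros(Int, n + 1)   # backpointer: start index of the last piece
    for i in 1:n, j in 1:i
        piece = word[j:i]
        if haskey(logprobs, piece) && best[j] + logprobs[piece] > best[i+1]
            best[i+1] = best[j] + logprobs[piece]
            back[i+1] = j
        end
    end
    # Recover the pieces by walking the backpointers from the end.
    pieces = String[]
    i = n + 1
    while i > 1
        j = back[i]
        pushfirst!(pieces, word[j:i-1])
        i = j
    end
    return pieces
end

vocab = Dict("un" => -2.0, "believ" => -3.0, "able" => -2.5,
             "u" => -6.0, "n" => -6.0, "believable" => -7.0)
unigram_segment("unbelievable", vocab)  # -> ["un", "believ", "able"]
```

The Viterbi-style pass keeps segmentation linear in the number of (position, piece) pairs, which is why unigram tokenizers stay fast even with large vocabularies.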

tejasvaidhyadev commented 4 years ago

Completed the trainable ALBERT structure.

tejasvaidhyadev commented 4 years ago

Fine-tuning training tutorial (GPU is not supported so far): here

tejasvaidhyadev commented 4 years ago

The above code is pretty messy and not yet refactored (it was written for the experiment). We can drop SentencePiece as soon as the ALBERT PR is merged. Apart from that, pretrain.jl is ready, and tfck2bsonforalbert.jl can be dropped in the next push. I will refactor the code within the next week.

aviks commented 3 years ago

Hi @tejasvaidhyadev can you move this PR to TextModels now please?

tejasvaidhyadev commented 3 years ago

> Hi @tejasvaidhyadev can you move this PR to TextModels now please?

Hi @aviks,

Is it okay if I do it this coming weekend? I have exams this week.

aviks commented 3 years ago

> I will do it the coming weekend?

Yes, of course, whenever you have time.