studio-ousia / luke

LUKE -- Language Understanding with Knowledge-based Embeddings
Apache License 2.0
705 stars 102 forks

Distilled Luke models and Pretraining of Lite Models #154

Closed luffycodes closed 2 years ago

luffycodes commented 2 years ago

Hello,

Just curious to know if there are any distilled LUKE models as well?

Also, what is the procedure to train the lite models? Does one just skip the stage 1 training and go directly to stage 2?

Thanks in advance,

ryokan0123 commented 2 years ago

Unfortunately, we don't have distilled LUKE models.

Also, what is the procedure to train the lite models? Does one just skip the stage 1 training and go directly to stage 2?

The lite models come from the original models, so they were not newly pretrained. We just took the word encoder and the special entity embeddings from the original models and made the lite models from them. They are provided just for convenience, since they have a smaller memory footprint when you don't need so many entity embeddings.
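The construction described above can be sketched with plain PyTorch. This is a hypothetical illustration, not the actual conversion script: the table sizes and the number of special entities are made-up placeholders, and the real LUKE checkpoints store the entity table alongside the word encoder.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the full entity embedding table of a
# pretrained LUKE model (sizes here are illustrative, not LUKE's real ones).
full_entity_emb = nn.Embedding(num_embeddings=10000, embedding_dim=256)

# A "lite" model keeps only the special entity rows (e.g. [MASK], [UNK]),
# which by convention occupy the first few indices of the table.
NUM_SPECIAL_ENTITIES = 4  # assumed count, for illustration only

lite_entity_emb = nn.Embedding(NUM_SPECIAL_ENTITIES, 256)
with torch.no_grad():
    lite_entity_emb.weight.copy_(full_entity_emb.weight[:NUM_SPECIAL_ENTITIES])

# The word encoder would be copied over unchanged; only the entity table
# shrinks, which is where the memory savings come from.
```

No retraining happens at any point; the lite model is a pure slice-and-copy of the pretrained weights.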