studio-ousia / luke

LUKE -- Language Understanding with Knowledge-based Embeddings
Apache License 2.0

NER Finetuning Duration #145

Closed · taghreed34 closed this 2 years ago

taghreed34 commented 2 years ago

As reported in the paper (Table 11), the training duration for CoNLL-2003 NER was 203 minutes on a single V100 GPU. Was this training time for luke_large or luke_base?

And to what extent does mixed precision training based on the APEX library affect training time?

ikuyamada commented 2 years ago

Hi @taghreed34, we used luke-large in our experiments, so the reported training time corresponds to the large model. It depends on the computational environment, but mixed precision training based on the APEX library (O2 mode) generally reduces training time significantly.
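
For reference, here is a minimal sketch of what enabling APEX O2 mixed precision looks like in a standard PyTorch training loop. This is not the actual LUKE training code; the toy model, optimizer settings, and loop are placeholders for illustration, and it assumes NVIDIA APEX is installed and a CUDA GPU is available.

```python
import torch
from torch import nn
from apex import amp  # NVIDIA APEX: https://github.com/NVIDIA/apex

# Toy stand-in model; in practice this would be the fine-tuned LUKE model.
model = nn.Linear(128, 2).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# opt_level="O2" casts most ops to FP16 while keeping FP32 master weights,
# which is where the training-time speedup comes from on V100-class GPUs.
model, optimizer = amp.initialize(model, optimizer, opt_level="O2")

criterion = nn.CrossEntropyLoss()
for step in range(10):
    inputs = torch.randn(32, 128).cuda()          # dummy batch
    labels = torch.randint(0, 2, (32,)).cuda()    # dummy labels
    optimizer.zero_grad()
    loss = criterion(model(inputs), labels)
    # Loss scaling prevents FP16 gradient underflow during backprop.
    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()
    optimizer.step()
```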