studio-ousia / luke

LUKE -- Language Understanding with Knowledge-based Embeddings
Apache License 2.0

Results reported in the paper? #135

Closed · lshowway closed this issue 2 years ago

lshowway commented 2 years ago

Thanks for your solid work.

Are the results reported in the paper based on the allennlp code or the huggingface (legacy) code?

I fine-tuned LUKE on the OpenEntity and TACRED datasets using the commands and code under legacy, but reduced max_seq_length from 512 to 256 to avoid OOM. I repeated each experiment more than five times with different seeds and got average results of 77.6 and 71.7, respectively, while the reported results are 78.2 and 72.7.

Is this normal?
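
For reference, this is roughly how I aggregate the per-seed scores (a minimal sketch; the seed values and F1 numbers below are placeholders, not my actual runs):

```python
import statistics

# Placeholder F1 scores from five fine-tuning runs with different seeds
# (illustrative values only, not the actual OpenEntity/TACRED results).
f1_per_seed = {12: 77.3, 21: 77.9, 42: 77.5, 87: 77.8, 100: 77.5}

scores = list(f1_per_seed.values())
mean_f1 = statistics.mean(scores)
std_f1 = statistics.stdev(scores)
best_f1 = max(scores)

print(f"mean F1 = {mean_f1:.1f} +/- {std_f1:.1f}, best F1 = {best_f1:.1f}")
```

Reporting the standard deviation alongside the mean also makes it easier to judge whether a gap like 77.6 vs. 78.2 falls within normal run-to-run fluctuation.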

ryokan0123 commented 2 years ago

I think your results are within the normal range of run-to-run performance fluctuation.

Although you changed max_seq_length from 512 to 256, that should not make a large difference on these datasets, since typical input sequences are not that long. You can check this empirically by tokenizing the inputs and counting how many exceed 256 tokens, as in the sketch below.
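
A minimal sketch of that check, assuming the sentences have already been loaded as plain strings (it uses the LukeTokenizer from Hugging Face transformers rather than the repo's legacy tokenization code):

```python
from transformers import LukeTokenizer

tokenizer = LukeTokenizer.from_pretrained("studio-ousia/luke-base")

# Placeholder inputs; in practice, load the OpenEntity or TACRED
# sentences here instead.
sentences = [
    "Beyoncé lives in Los Angeles.",
    "The acquisition was announced by the company on Tuesday.",
]

# Token count per sentence after subword tokenization,
# including special tokens.
lengths = [len(tokenizer(s)["input_ids"]) for s in sentences]
truncated = sum(1 for n in lengths if n > 256)
print(f"max length = {max(lengths)}, over 256 tokens: {truncated}/{len(lengths)}")
```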

Note that the paper reports the best performance of the model, so if you take the average over runs, it will necessarily be lower than the highest score.