studio-ousia / luke

LUKE -- Language Understanding with Knowledge-based Embeddings
Apache License 2.0

LUKE on custom datasets for NER #144

Closed Dimiftb closed 2 years ago

Dimiftb commented 2 years ago

Hi there

I'm wondering whether you plan to release instructions on how to fine-tune LUKE on custom datasets, such as SciERC, that use different entity types.

Thanks.

ryokan0123 commented 2 years ago

Hi,

Although we cannot prepare instructions for each dataset, we have some recommendations for fine-tuning with custom datasets.

Since our code is built on more generic NLP libraries such as allennlp and transformers, the easiest approach is probably to modify their example code to suit your goal.

For example, if you want to experiment with the SciERC dataset and decide to use our allennlp code, all you have to do is modify the _read() function here. You can also refer to the allennlp tutorial or guide as necessary.
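
As a rough illustration of the parsing a custom _read() might need, here is a minimal sketch that uses only the standard library. It is not the reader from this repository. The function name iter_scierc_sentences and the output keys "words" and "entities" are hypothetical. The input layout is an assumption too: one JSON document per line, with "sentences" as a list of token lists and "ner" as per-sentence lists of [start, end, label] using document-level, inclusive token offsets.

```python
import json
from typing import Dict, Iterable, Iterator, List, Tuple

# (start, end_inclusive, label), with offsets relative to the sentence
Span = Tuple[int, int, str]


def iter_scierc_sentences(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield one dict per sentence from SciERC-style JSON lines.

    Assumed input layout (not taken from the LUKE repository):
      "sentences": list of token lists
      "ner": per-sentence lists of [start, end, label], where start and
             end are document-level, inclusive token offsets
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        doc = json.loads(line)
        offset = 0  # index of the current sentence's first token in the document
        for tokens, spans in zip(doc["sentences"], doc["ner"]):
            # Convert document-level offsets to sentence-level offsets.
            entities: List[Span] = [
                (start - offset, end - offset, label)
                for start, end, label in spans
            ]
            yield {"words": tokens, "entities": entities}
            offset += len(tokens)


if __name__ == "__main__":
    sample = json.dumps({
        "sentences": [["We", "use", "LUKE", "."],
                      ["SciERC", "has", "six", "types", "."]],
        "ner": [[[2, 2, "Method"]], [[4, 4, "Material"]]],
    })
    for sentence in iter_scierc_sentences([sample]):
        print(sentence)
```

In an allennlp DatasetReader subclass, _read(self, file_path) would typically open the file, loop over a generator like this one, and yield an Instance for each sentence. The fields that Instance needs depend on the model, so check them against the existing reader before adapting it.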

Hope this helps : )