princeton-nlp / PURE

[NAACL 2021] A Frustratingly Easy Approach for Entity and Relation Extraction https://arxiv.org/abs/2010.12812
MIT License

Questions about package #24

Closed AlanQuille closed 3 years ago

AlanQuille commented 3 years ago

Hello, I have a few questions about the package:

1.) Is your code integrated into the spaCy pipeline, or is it separate from it?
2.) How large a dataset do we need to train a model that works well?
3.) What system configuration is needed to train the model on the data from question 2?

Thanks again for all the help.

a3616001 commented 3 years ago

Hi Alan,

1) Our code is not integrated into the spaCy pipeline.
2) A larger dataset will typically result in a better model. I would suggest having at least 1000 training sentences or so (the same order of magnitude as the SciERC dataset).
3) (I am not sure I understand your question correctly.) Most of our experiments were run on a single 2080 Ti GPU, with lr=2e-5 and batch_size=32.
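As a rough way to check a dataset against the size suggested in point 2, the sketch below counts training sentences. It assumes the data is a JSON-lines file with one document per line, each holding a `sentences` list of tokenized sentences; that layout, the function names, and the `SUGGESTED_MIN_SENTENCES` constant are illustrative assumptions, so adjust them to your own data.

```python
import json
from pathlib import Path

# Rough guideline taken from the reply above, not a hard requirement.
SUGGESTED_MIN_SENTENCES = 1000


def count_training_sentences(path):
    """Count sentences in a JSON-lines file with one document per line.

    Assumes each document has a "sentences" field holding a list of
    tokenized sentences (hypothetical layout; change the key if needed).
    """
    total = 0
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            document = json.loads(line)
            total += len(document.get("sentences", []))
    return total


def meets_suggested_size(path, minimum=SUGGESTED_MIN_SENTENCES):
    """Return True if the file has at least `minimum` sentences."""
    return count_training_sentences(path) >= minimum
```

For example, `meets_suggested_size("train.json")` would report whether a hypothetical `train.json` reaches the roughly 1000-sentence mark. The threshold is only a rule of thumb, so a smaller dataset may still be worth trying.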

Hope this helps!

Thanks!

AlanQuille commented 3 years ago

That helps tremendously, thank you very much.