JohnGiorgi / seq2rel

The corresponding code for our paper: A sequence-to-sequence approach for document-level relation extraction.
https://share.streamlit.io/johngiorgi/seq2rel/main/demo.py
Apache License 2.0

Training without entity type token #377

Closed yuhangjiang22 closed 1 year ago

yuhangjiang22 commented 1 year ago

I am using seq2rel on a dataset that has only a single entity type, so I am wondering if I can remove the @entity_type@ token and make the output look like entity1 ; entity2 ; entity3 @predicate@

The problem I've found is that after training, the output sometimes contains unknown tokens.

For example, here is what I've got:

amphotericin ; flucytosine ; ketoconazole ; fluconazole @ @ unknown @ @ @ @ unknown @ @ @ @ unknown @ @ @ @ unknown @ @

Could you please let me know if there's a way of solving this?

JohnGiorgi commented 1 year ago

Sorry for the late response! Hmm, I think this should work fine, but any special tokens (e.g. "@predicate@") would need to be added to the vocabulary. You can follow the existing configs to set this up. The steps are:

  1. Empty the ent_tokens list

https://github.com/JohnGiorgi/seq2rel/blob/f757d6cc9da87ac527a9485d54843b6a5739657f/training_config/cdr.jsonnet#L35-L38

  2. Replace the rel_tokens list with your list of special relation type tokens (see the sketch after these steps)

https://github.com/JohnGiorgi/seq2rel/blob/f757d6cc9da87ac527a9485d54843b6a5739657f/training_config/cdr.jsonnet#L39-L41
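For a single relation type, the relevant part of the config would look something like the sketch below. This is not copied verbatim from cdr.jsonnet (the exact surrounding structure may differ), and "@predicate@" is just a placeholder for whatever relation-type token you use in your linearized targets.

```jsonnet
// Hedged sketch, not verbatim from cdr.jsonnet: clear the entity-type tokens and
// keep only your relation-type token(s). "@predicate@" is a placeholder and must
// match the token used in your training data.
local ent_tokens = [];               // step 1: no entity-type tokens
local rel_tokens = ["@predicate@"];  // step 2: one special token per relation type
```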

The @@UNKNOWN@@ token comes from AllenNLP, and it gets used anytime the model tries to generate/copy a token outside the vocabulary.

yuhangjiang22 commented 1 year ago

Thanks for helping me with that! It works now. I also need to save the model from the last epoch to the output directory; is there a way to specify that in the config file?

JohnGiorgi commented 1 year ago

If I remember correctly, this is controlled by the arguments to "trainer" and "checkpointer" in the config. Right now, our configs save only the single best model evaluated during training:

https://github.com/JohnGiorgi/seq2rel/blob/f757d6cc9da87ac527a9485d54843b6a5739657f/training_config/cdr.jsonnet#L214-L216

and evaluation frequency is controlled by the "should_validate_callback":

https://github.com/JohnGiorgi/seq2rel/blob/f757d6cc9da87ac527a9485d54843b6a5739657f/training_config/cdr.jsonnet#L207-L213

I think by modifying the arguments to "trainer", "checkpointer", and "should_validate_callback" you should be able to achieve what you want. The AllenNLP documentation might be helpful here :)
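As a rough illustration (assuming an AllenNLP 2.x-style "checkpointer"; please verify the key names against the AllenNLP version pinned by seq2rel), keeping recent epoch checkpoints alongside the best one might look like:

```jsonnet
// Hedged sketch, assuming AllenNLP 2.x Checkpointer arguments; the exact keys
// depend on the AllenNLP version seq2rel pins. keep_most_recent_by_count keeps
// the N most recent epoch checkpoints in the serialization directory, so the
// final epoch's weights remain available even if they are not the best-scoring.
trainer: {
  // ...other trainer arguments unchanged...
  checkpointer: {
    keep_most_recent_by_count: 1,
  },
},
```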

JohnGiorgi commented 1 year ago

Closing! Please feel free to re-open if you are still having trouble.