verbose message to increase size_embeddings_count

lavis-nlp / jerex

PyTorch code for JEREX: Joint Entity-Level Relation Extractor

MIT License


verbose message to increase size_embeddings_count #11

Open e3oroush opened 2 years ago

e3oroush commented 2 years ago

Hi! First, thank you for the great effort; I learned a lot from you.
I tried to use your model on my own dataset, but some of my entity spans are longer than the default size_embeddings_count in the config, so it didn't work. The only error message I got was the following, which wasn't clear enough:

IndexError: index out of range in self

It took me a whole day to track the bug down, so I tried to change your code to give a more verbose message about this issue.
I hope it can help others with a similar problem.
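For readers hitting the same error: on CPU, PyTorch's `nn.Embedding` raises this bare `IndexError` when an index is not below `num_embeddings`. Below is a minimal sketch of the kind of guard this pull request aims for. It assumes the size embedding is indexed directly by span length in tokens, and the function name `check_span_sizes` is hypothetical, not part of JEREX:

```python
from typing import Iterable


def check_span_sizes(span_sizes: Iterable[int], size_embeddings_count: int) -> None:
    """Raise a descriptive error before the embedding lookup would fail.

    Assumption: span sizes are used directly as indices into an embedding
    table with `size_embeddings_count` rows, so valid indices are
    0 .. size_embeddings_count - 1.
    """
    largest = max(span_sizes, default=0)
    if largest >= size_embeddings_count:
        raise ValueError(
            f"Entity span of size {largest} exceeds the size embedding table "
            f"(size_embeddings_count={size_embeddings_count}). "
            f"Set size_embeddings_count to at least {largest + 1}."
        )
```

Calling `check_span_sizes([3, 12, 40], size_embeddings_count=30)` would raise a `ValueError` that names the offending span size and the config value to change, instead of the generic index error.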

markus-eberts commented 2 years ago

Hi @e3oroush,

thank you very much for your pull request! This would indeed be a nice addition. However, I think it is better to check the largest entity span in the input datasets (train/dev/test) and raise an exception if it exceeds size_embeddings_count. This could be done right after the datasets are loaded, so the exception (or assertion) is raised up front rather than at some point during training. What do you think?
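The check suggested above could look roughly like the following sketch. It assumes each loaded dataset split can be reduced to a list of entity mention spans given as `(start, end)` token offsets with an exclusive end. That layout and the names `max_span_size` and `validate_datasets` are hypothetical, not JEREX's actual data classes:

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive (assumed layout)


def max_span_size(spans: List[Span]) -> int:
    """Return the length in tokens of the longest span, or 0 if there are none."""
    return max((end - start for start, end in spans), default=0)


def validate_datasets(datasets: Dict[str, List[Span]], size_embeddings_count: int) -> None:
    """Fail fast, right after loading, if any split contains a span that is too long."""
    for split, spans in datasets.items():
        largest = max_span_size(spans)
        # Assumption: span size is used as an embedding index, so it must be
        # strictly smaller than size_embeddings_count.
        if largest >= size_embeddings_count:
            raise ValueError(
                f"Largest entity span in '{split}' has {largest} tokens, but "
                f"size_embeddings_count is {size_embeddings_count}. "
                f"Increase it to at least {largest + 1}."
            )
```

Run once over the train/dev/test splits before training starts, this reports the problem together with the split it came from, rather than failing partway through an epoch.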

e3oroush commented 2 years ago

Thanks for answering.
I'm not completely sure where exactly in the code you mean. But since it's your code base, I think your idea definitely makes more sense than my PR. :smile: