Closed evangeliazve closed 2 years ago
Sorry, I think GitHub's ipynb rendering might have a bug with HTML-like tags. The append call on span_idxs
should be as follows:
span_idxs.append([
    [idx for idx, token in enumerate(tokens) if token.startswith("<S:")][0],
    [idx for idx, token in enumerate(tokens) if token.startswith("</S:")][0],
    [idx for idx, token in enumerate(tokens) if token.startswith("<O:")][0],
    [idx for idx, token in enumerate(tokens) if token.startswith("</O:")][0]
])
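As a quick check of what that call computes, here is a minimal, self-contained sketch. The token list and the `PER`/`LOC` type suffixes are made up for illustration, and `marker_positions` is a hypothetical helper, not something from the notebook; only the four `startswith` prefixes come from the snippet above.

```python
# Hypothetical tokenized sentence with subject/object marker tokens.
# The type suffixes (PER, LOC) are assumptions for illustration only.
tokens = ["[CLS]", "<S:PER>", "Marie", "Curie", "</S:PER>",
          "worked", "in", "<O:LOC>", "Paris", "</O:LOC>", "[SEP]"]


def marker_positions(tokens):
    """Return the index of the first token matching each of the four marker prefixes."""
    def first_index(prefix):
        # Same logic as the list comprehension with [0], but stops at the first match.
        return next(idx for idx, token in enumerate(tokens) if token.startswith(prefix))

    return [first_index("<S:"), first_index("</S:"),
            first_index("<O:"), first_index("</O:")]


span_idxs = []
span_idxs.append(marker_positions(tokens))
print(span_idxs)  # [[1, 4, 7, 9]]
```

Note that `"</S:PER>".startswith("<S:")` is false, so the opening and closing markers do not get confused with each other.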
Hello,
Thank you very much. I think the problem is solved. I've seen you made some changes to Case #5.
I would also like to know how I can jointly train NER and RE. Should I do this separately, or do you suggest a particular practice? Does your method allow saving the model to Hugging Face for open source use?
If you prefer I can create new threads on these subjects.
Best Regards,
Evangelia ZVE
I haven't tried joint NER+RE training, i.e. situations where a single network is trained on both NER and RE annotations and predicts both simultaneously for a given sentence. I did read a paper saying that this is a preferred approach since there is less chance of upstream NER errors affecting RE, but in the experiments the authors tried, joint networks did not perform as well as pipelined networks.
Some recent blogs I have been looking at refer to pipelined networks as joint networks (https://towardsdatascience.com/how-to-train-a-joint-entities-and-relation-extraction-classifier-using-bert-transformer-with-spacy-49eb08d91b5c, for example). My preference is for this approach -- first train an NER model to predict entities, then take entity pairs in some unit of text (sentence, paragraph) and train an RE model to predict relations.
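To make the second stage of that pipeline concrete, here is a rough sketch of turning NER output into RE inputs by wrapping each ordered entity pair in marker tokens. Everything here is an assumption for illustration: the `(start, end, label)` span format, the helper names `mark_pair` and `candidate_pairs`, and the `<S:TYPE>` / `<O:TYPE>` marker layout (modeled on the prefixes used in the `span_idxs` snippet). It is not the notebook's actual preprocessing.

```python
from itertools import permutations


def mark_pair(tokens, subj, obj):
    """Wrap a subject and an object span in marker tokens.

    subj and obj are (start, end, label) tuples with end exclusive.
    Assumes the two spans do not overlap.
    """
    opens = {subj[0]: f"<S:{subj[2]}>", obj[0]: f"<O:{obj[2]}>"}
    closes = {subj[1]: f"</S:{subj[2]}>", obj[1]: f"</O:{obj[2]}>"}
    marked = []
    for i in range(len(tokens) + 1):
        if i in closes:            # close a span before opening an adjacent one
            marked.append(closes[i])
        if i in opens:
            marked.append(opens[i])
        if i < len(tokens):
            marked.append(tokens[i])
    return marked


def candidate_pairs(tokens, entities):
    """Yield one marked token list per ordered (subject, object) entity pair."""
    for subj, obj in permutations(entities, 2):
        yield mark_pair(tokens, subj, obj)


# Hypothetical sentence and hypothetical NER predictions for it.
tokens = ["Marie", "Curie", "worked", "in", "Paris"]
entities = [(0, 2, "PER"), (4, 5, "LOC")]

examples = list(candidate_pairs(tokens, entities))
for example in examples:
    print(" ".join(example))
```

Each marked sequence would then be fed to the RE classifier, which predicts a relation label (or "no relation") for that pair.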
To save to HF, you will need to have a HF account, log in to the account from your code, and then push the trained model to HF (check this notebook for an example -- https://colab.research.google.com/drive/1U7SX7jNYsNQG5BY1xEQQHu48Pn6Vgnyt?usp=sharing). I haven't done that here.
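A minimal sketch of those steps, assuming the model and tokenizer are Hugging Face `transformers` objects (which expose `push_to_hub`) and that you have created an access token in your HF account. The function name `push_model_to_hub`, the repo id, and the token value are placeholders, not anything from this repository.

```python
def push_model_to_hub(model, tokenizer, repo_id, token):
    """Log in to the Hugging Face Hub and upload a trained model and its tokenizer.

    model and tokenizer are assumed to be transformers objects with push_to_hub;
    repo_id looks like "your-username/your-model-name" (placeholder).
    """
    # Imported lazily so this sketch can be loaded without the library installed.
    from huggingface_hub import login

    login(token=token)              # authenticate with your HF access token
    model.push_to_hub(repo_id)      # upload the weights and config
    tokenizer.push_to_hub(repo_id)  # upload the vocabulary, including any added marker tokens


# Example call (placeholders, not run here):
# push_model_to_hub(model, tokenizer, "your-username/your-re-model", token="hf_...")
```

Pushing the tokenizer alongside the model matters if you added marker tokens to the vocabulary, since the model's embedding matrix is sized to match it.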
Hello,
Thanks a lot for your detailed answer.
Best Regards, Evangelia Zve
You are very welcome. Closing this issue.
Hello Sujit Pal,
Thank you for this amazing work.
I have searched across the web for how to tackle joint NER and RE fine-tuning, and your solution is the most straightforward one with Python code included.
I am stuck at CMD 18, as token.startswith(") isn't a valid operation.
Could you please explain this command? Do you mean token.startswith(' ')?
Best Regards,
Evangelia