Open jgroschwitz opened 3 years ago
Whether or not an OOV token is added is controlled by the vocabulary class: https://docs.allennlp.org/v0.9.0/api/allennlp.data.vocabulary.html#allennlp.data.vocabulary.Vocabulary. You can adjust this in the config file; there is already an entry for "vocabulary" in jsonnets/emnlp20/glove/AMR-2015.jsonnet, for example. Of course, the OOV token embedding will be untrained.
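As a rough illustration of what the vocabulary setting decides: AllenNLP skips the padding and OOV entries for any namespace that matches one of its "non-padded" patterns, which default to "*tags" and "*labels". The sketch below re-implements that matching rule in plain Python for clarity; it is not the AllenNLP source, and the namespace names "ner_labels" and "words" are hypothetical, not taken from am-parser.

```python
# Illustrative re-implementation of the namespace matching rule; not the AllenNLP source.
DEFAULT_NON_PADDED_NAMESPACES = ("*tags", "*labels")  # AllenNLP's default patterns


def is_non_padded(namespace, patterns=DEFAULT_NON_PADDED_NAMESPACES):
    """Return True if the namespace matches a pattern, i.e. it gets no padding/OOV entry."""
    for pattern in patterns:
        # A leading "*" means "any namespace ending with the rest of the pattern".
        if pattern.startswith("*") and namespace.endswith(pattern[1:]):
            return True
        # Otherwise the pattern has to equal the namespace exactly.
        if pattern == namespace:
            return True
    return False


# "ner_labels" and "words" are hypothetical namespace names.
print(is_non_padded("ner_labels"))               # True: matches "*labels", so no OOV entry
print(is_non_padded("words"))                    # False: gets padding and OOV entries
print(is_non_padded("ner_labels", patterns=()))  # False: with no patterns, an OOV entry is added
```

So, assuming the label namespace ends in "labels" or "tags", overriding the non-padded patterns in the "vocabulary" entry of the config is what would give it an OOV token.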
Thanks! That's what I was looking for
(In reply to Matthias Lindemann's comment of Fri, 19 Nov 2021, 18:00.)
A toy model crashes when encountering an unknown NER label.
To reproduce, run
python3 -u train.py jsonnets/toyAMRAutomata.jsonnet -s example/toyAMRAutomataOutput/ -f --file-friendly-logging
on commit 1282115 on the unsupervised2020 branch.
According to https://github.com/allenai/allennlp/issues/2147, crashing on an unseen label is the intended behaviour as long as no OOV token (i.e. a token that says "I'm the OOV token") is in the vocabulary. My guess is that such an OOV token usually gets added automatically, but not in this toy example.
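To make the suspected failure mode concrete, here is a minimal toy sketch of a label lookup with and without an OOV entry. It is a stand-in written for illustration, not AllenNLP or am-parser code; the class name ToyNamespace and the labels "PER", "LOC" and "ORG" are made up, and "@@UNKNOWN@@" is used because it is AllenNLP's default OOV token string.

```python
class ToyNamespace:
    """Toy stand-in for one vocabulary namespace; not AllenNLP code."""

    OOV = "@@UNKNOWN@@"  # AllenNLP's default OOV token string

    def __init__(self, labels, add_oov):
        self.token_to_index = {}
        if add_oov:
            # Reserve index 0 for the OOV entry.
            self.token_to_index[self.OOV] = 0
        for label in labels:
            self.token_to_index.setdefault(label, len(self.token_to_index))

    def index_of(self, label):
        if label in self.token_to_index:
            return self.token_to_index[label]
        # Fall back to the OOV entry; this raises KeyError if there is none.
        return self.token_to_index[self.OOV]


# Hypothetical labels seen during training.
with_oov = ToyNamespace(["PER", "LOC"], add_oov=True)
without_oov = ToyNamespace(["PER", "LOC"], add_oov=False)

print(with_oov.index_of("ORG"))  # 0: the unseen label maps to the OOV index

try:
    without_oov.index_of("ORG")
except KeyError as err:
    print("lookup failed:", err)
```

If the guess above is right, the toy example behaves like without_oov: the unseen label has nothing to fall back to, so the lookup raises instead of returning an OOV index.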