yzhangcs / parser

:rocket: State-of-the-art parsers for natural language.
https://parser.yzhang.site/
MIT License
827 stars, 139 forks

Fix SubwordField UNK encoding error #41

Closed by matejklemen 4 years ago

matejklemen commented 4 years ago

First of all, thanks for this very clean implementation!

While running the biaffine dependency parser on the Russian GSD treebank (using bert-base-multilingual-uncased), I ran into a bug when preparing text for training: ValueError: too many dimensions 'str'.

The problem seems to be that the string form of the UNK token was being used as its encoding instead of its integer index.
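To illustrate the kind of mix-up described above, here is a minimal sketch in plain Python. It is not the library's code: the toy vocabulary, `toy_tokenize`, `encode_buggy`, and `encode_fixed` are all hypothetical names made up for this example, and it assumes the fallback is triggered when a word yields no subword pieces.

```python
# Hypothetical toy vocabulary; the names here are illustrative only
# and are not taken from the parser's actual API.
UNK = "[UNK]"
vocab = {"[PAD]": 0, UNK: 1, "при": 2, "##вет": 3}
unk_index = vocab[UNK]


def toy_tokenize(word):
    """Stand-in for a subword tokenizer; returns [] for words it cannot split."""
    pieces = {"привет": ["при", "##вет"]}
    return pieces.get(word, [])


def encode_buggy(words):
    # Bug: falls back to the UNK *string*, so the result mixes int and str.
    return [[vocab[p] for p in toy_tokenize(w)] or [UNK] for w in words]


def encode_fixed(words):
    # Fix: falls back to the UNK *index*, so every id is an int.
    return [[vocab[p] for p in toy_tokenize(w)] or [unk_index] for w in words]


# The second "word" is a zero-width space, which the toy tokenizer cannot split.
words = ["привет", "\u200b"]
print(encode_buggy(words))
print(encode_fixed(words))
```

In the buggy variant, the id list for the unsplittable word contains a string, which a numeric tensor constructor cannot convert; that would be consistent with the ValueError reported above. The fixed variant yields only integers.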

yzhangcs commented 4 years ago

Great! Thanks for your correction.