kristopherkyle closed this 1 year ago
Hi Dan, we didn't know that it was possible to be granted an exception for the split. As our NArabizi treebank has already been used for evaluation with a canonical split, would it be possible to benefit from the same exemption?
Thanks, Djamé
Hello all,
Our team is working on getting the initial release of UD_English-ESLSpok validated. The treebank is currently rather small (20k tokens), but we are annotating more data (another 50k tokens already have manual XPOS tags and will be supplemented with UD annotations in the near future). So far, we have used the treebank in concert with other English UD treebanks (e.g., UD_English-ESL, UD_English-EWT, UD_English-GUM). For our purposes, it has been helpful to have predetermined train/dev/test sections of the data, which we add to the corresponding sections of the other corpora when training and testing models. While we can certainly resample the data to pass the validation checks, it would be nice to keep the distributions consistent across UD and our project homepage (which includes other data).
Is an exemption reasonable in this case?
Best,
Kris