Open jwijffels opened 5 years ago
I'm just starting this; any suggestions are welcome! I'm first just getting things up and running. Next steps will be parameter tuning and experimenting with more training data. I'll have a look at the settings you linked to. However, I still have to decide on a good test / evaluation dataset. I'm planning to use Lassy (all of it) for training, not just the parts in the Universal Dependencies. That also means that both the alpino and lassy test sets will be a bit small in comparison to the training set.
By the way, do you know of any other tools I should look at? What are you using?
Mainly using udpipe, as it is a good balance between ease of use and quality. But maybe you could also add https://stanfordnlp.github.io/stanfordnlp/ (haven't tried this myself though), as it also gave good results in the last shared task.
Nice to see better numbers appearing for UDPipe after changing the hyperparameters to the advised defaults.
I was also comparing udpipe models with alpino results a few months ago. I wonder why you haven't trained the udpipe models with the training settings used by the UDPipe author for the Alpino dataset (https://github.com/ufal/udpipe/tree/master/training/models-ud-2.0) and then evaluated them on Lassysmall. When I trained with these parameters on Alpino, that UDPipe model gave me better results than the Alpino parser when both were evaluated on the external dataset Lassysmall.
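To make comparisons like this one concrete, here is a minimal sketch of how unlabeled and labeled attachment scores (UAS / LAS) could be computed from a gold CoNLL-U file and a parser's output. It is not the script anyone in this thread used. The function names `read_conllu` and `attachment_scores` and the toy sentences are made up for illustration, and it assumes both files share the same (gold) tokenization. For numbers you want to report, the CoNLL shared task evaluation script is the usual choice.

```python
def read_conllu(text):
    """Parse CoNLL-U text into sentences; each sentence is a list of (head, deprel)."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        # Skip multiword token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        current.append((cols[6], cols[7]))  # HEAD and DEPREL columns
    if current:
        sentences.append(current)
    return sentences


def attachment_scores(gold_text, pred_text):
    """Return (UAS, LAS) as fractions, assuming identical tokenization."""
    gold, pred = read_conllu(gold_text), read_conllu(pred_text)
    if len(gold) != len(pred):
        raise ValueError("gold and predicted files differ in sentence count")
    total = uas_hits = las_hits = 0
    for g_sent, p_sent in zip(gold, pred):
        if len(g_sent) != len(p_sent):
            raise ValueError("sentence lengths differ; tokenization must match")
        for (g_head, g_rel), (p_head, p_rel) in zip(g_sent, p_sent):
            total += 1
            if g_head == p_head:
                uas_hits += 1
                if g_rel == p_rel:
                    las_hits += 1
    if total == 0:
        raise ValueError("no tokens to evaluate")
    return uas_hits / total, las_hits / total


def _row(idx, form, head, deprel):
    # Helper for the toy data: fill unused CoNLL-U columns with "_".
    return "\t".join([str(idx), form, "_", "_", "_", "_", str(head), deprel, "_", "_"])


# Toy data (hypothetical): the prediction mislabels token 1 and misattaches token 3.
GOLD = "\n".join([_row(1, "Ik", 2, "nsubj"), _row(2, "slaap", 0, "root"), _row(3, "goed", 2, "advmod")]) + "\n\n"
PRED = "\n".join([_row(1, "Ik", 2, "obj"), _row(2, "slaap", 0, "root"), _row(3, "goed", 1, "advmod")]) + "\n\n"

uas, las = attachment_scores(GOLD, PRED)
print(f"UAS={uas:.3f} LAS={las:.3f}")
```

In practice you would read the Lassysmall gold test file and the output of each parser from disk and pass their contents to `attachment_scores`, so that every parser is scored on exactly the same sentences.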