Filter-Bubble / filterbubbel-nlp

Retrain state of the art NLP models for Dutch

Interesting to see these comparisons #1

Open jwijffels opened 5 years ago

jwijffels commented 5 years ago

I was also comparing UDPipe models with Alpino results a few months ago. I wonder why you haven't trained the UDPipe models with the training settings used by the UDPipe author for the Alpino dataset (https://github.com/ufal/udpipe/tree/master/training/models-ud-2.0) and then evaluated them on LassySmall. When I trained a UDPipe model with those parameters on Alpino, it gave me better results than the Alpino parser when evaluated on the external LassySmall dataset.
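For reference, a minimal sketch of how such a comparison can be run with the official UDPipe Python bindings (the `ufal.udpipe` pip package); the model filename here is hypothetical, and the sketch assumes a model trained on the UD Dutch-Alpino treebank:

```python
# Minimal sketch: parse raw Dutch text with a trained UDPipe model.
# Assumes `pip install ufal.udpipe`; "nl-alpino.udpipe" is a
# hypothetical filename for a model trained on UD Dutch-Alpino.
from ufal.udpipe import Model, Pipeline, ProcessingError

model = Model.load("nl-alpino.udpipe")
if model is None:
    raise RuntimeError("could not load model")

# Tokenize, tag, and parse; emit CoNLL-U so the output can be scored
# against a gold treebank such as LassySmall.
pipeline = Pipeline(model, "tokenize", Pipeline.DEFAULT, Pipeline.DEFAULT, "conllu")
error = ProcessingError()
conllu = pipeline.process("De kat zit op de mat.", error)
if error.occurred():
    raise RuntimeError(error.message)
print(conllu)
```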

jiskattema commented 5 years ago

I'm just starting this; any suggestions are welcome! Right now I'm just getting things up and running. Next steps will be parameter tuning and experimenting with more training data, so I'll have a look at the settings you linked to. However, I still have to decide on a good test/evaluation dataset. I'm planning to use Lassy (all of it) for training, not just the parts that are in Universal Dependencies. That also means both the Alpino and Lassy test sets will be rather small compared to the training set.
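As an illustration of what evaluating on such a held-out set boils down to, here is a small self-contained sketch that computes UAS/LAS from a gold and a system CoNLL-U file. It assumes identical tokenization in both files (the official shared-task scorer, `conll18_ud_eval.py`, also handles tokenization mismatches); the file names are hypothetical:

```python
# Sketch: UAS/LAS for two CoNLL-U files with identical tokenization.
# The official CoNLL shared-task scorer (conll18_ud_eval.py) is more
# robust; this only shows what the two scores measure.
def read_tokens(path):
    """Yield (head, deprel) for every regular token in a CoNLL-U file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            cols = line.split("\t")
            # Skip multi-word token ranges (e.g. 1-2) and empty nodes (1.1).
            if "-" in cols[0] or "." in cols[0]:
                continue
            yield cols[6], cols[7]  # HEAD, DEPREL

gold = list(read_tokens("lassysmall-test-gold.conllu"))      # hypothetical path
system = list(read_tokens("lassysmall-test-udpipe.conllu"))  # hypothetical path
assert len(gold) == len(system), "tokenization must match for this sketch"

uas = sum(g[0] == s[0] for g, s in zip(gold, system)) / len(gold)
las = sum(g == s for g, s in zip(gold, system)) / len(gold)
print(f"UAS {uas:.4f}  LAS {las:.4f}")
```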

jiskattema commented 5 years ago

By the way, do you know of any other tools I should look at? What are you using?

jwijffels commented 5 years ago

Mainly using UDPipe, as it is a good balance between ease of use and quality. But maybe you could also add https://stanfordnlp.github.io/stanfordnlp/ (haven't tried this myself though), as it also gave good results in the last shared task.
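For what it's worth, getting a Dutch baseline out of stanfordnlp looks roughly like the snippet below, based on the project's documented API at the time; the models are downloaded on first use:

```python
# Sketch: dependency parsing Dutch text with stanfordnlp
# (pip install stanfordnlp), following the project's documented API.
import stanfordnlp

stanfordnlp.download("nl")             # fetch the Dutch UD models once
nlp = stanfordnlp.Pipeline(lang="nl")  # tokenize, tag, lemmatize, parse
doc = nlp("De kat zit op de mat.")
doc.sentences[0].print_dependencies()
```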

jwijffels commented 5 years ago

Nice to see better numbers appearing for UDPipe after changing the hyperparameters to the recommended defaults.