hipster-philology / pandora

A Tagger-Lemmatizer for Natural Languages
MIT License

Share a Latin model ? #93

Open PonteIneptique opened 6 years ago

PonteIneptique commented 6 years ago

Hey there :) Would one of you be so kind as to share a Latin model? It seems my training (12c_pytorch) hit a ceiling because of my dataset (the Perseus/Perseids Latin treebank data is very messy...). I am trying to evaluate some post-correction techniques, but I'd like a model that performs a little better in generate mode.

89250/89250 [==============================] - 53s - loss: 1.0329 - pos_out_loss: 0.7977 - lemma_out_loss: 0.2351     
::: Train Scores (lemma) :::
+   all acc: 0.437821482602118
+   kno acc: 0.437821482602118
+   unk acc: 0.0
::: Dev Scores (lemma) :::
+   all acc: 0.4366640440597954
+   kno acc: 0.5041632461592588
+   unk acc: 0.08592321755027423
::: Train scores (pos) :::
+   all acc: 0.7638930912758447
+   kno acc: 0.7638930912758447
+   unk acc: 0.0
::: Dev scores (pos) :::
+   all acc: 0.7565892997639654
+   kno acc: 0.7689691567960596
+   unk acc: 0.692260816575259
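
For readers unfamiliar with the all/kno/unk breakdown in the scores above, here is a minimal sketch of how such a split can be computed. It assumes that "kno" means tokens seen in the training data and "unk" means tokens that were not; this is not Pandora's actual evaluation code, and `split_accuracy` along with its parameters are hypothetical names chosen for illustration.

```python
def split_accuracy(tokens, gold, pred, train_vocab):
    """Accuracy over all tokens, known tokens, and unknown tokens.

    ASSUMPTION: a token is "known" if it appears in train_vocab,
    otherwise "unknown". This is an illustrative helper, not Pandora's API.
    """
    kno_hits = kno_total = unk_hits = unk_total = 0
    for tok, g, p in zip(tokens, gold, pred):
        hit = int(g == p)
        if tok in train_vocab:
            kno_total += 1
            kno_hits += hit
        else:
            unk_total += 1
            unk_hits += hit

    def ratio(hits, total):
        # Report 0.0 when a bucket is empty instead of dividing by zero.
        return hits / total if total else 0.0

    return {
        "all": ratio(kno_hits + unk_hits, kno_total + unk_total),
        "kno": ratio(kno_hits, kno_total),
        "unk": ratio(unk_hits, unk_total),
    }


# Toy data: three tokens seen in training, one unseen.
tokens = ["arma", "virumque", "cano", "troiae"]
gold = ["arma", "vir", "cano", "troia"]
pred = ["arma", "virumque", "cano", "troiae"]
train_vocab = {"arma", "virumque", "cano"}

dev_scores = split_accuracy(tokens, gold, pred, train_vocab)

# Scoring the training tokens themselves: every token is known.
train_scores = split_accuracy(tokens[:3], gold[:3], pred[:3], train_vocab)
```

Under this assumption, scoring the training set puts every token in the known bucket, so "all" equals "kno" and the empty unknown bucket is reported as 0.0. That is the same pattern as the train scores in the log.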

Jean-Baptiste-Camps commented 6 years ago

By the way, I will work this afternoon on the presentation of Pandora for tomorrow (COSME workshop). I will mention results with Old French, but I'll gladly take anything you can send me in the way of results with Latin, etc., in both label and generate modes (I don't use the latter, so I don't have any feedback on it).

PonteIneptique commented 6 years ago

Regarding #99, maybe you could use a previous release?