inspirehep / beard

Bibliographic Entity Automatic Recognition and Disambiguation

examples: new name features #71

Closed MSusik closed 9 years ago

MSusik commented 9 years ago

Signed-off-by: Mateusz Susik <mateusz.susik@cern.ch>

MSusik commented 9 years ago

I can imagine that these features can be improved, but how?

MSusik commented 9 years ago

Before:

Number of blocks = 13114
True number of clusters 15575
Number of computed clusters 15811
B^3 F-score (overall) = 0.9800464510382015
B^3 F-score (train) = 0.9867310903532298
B^3 F-score (test) = 0.9796475584448197

MSusik commented 9 years ago

After:

Number of blocks = 13114
True number of clusters 15575
Number of computed clusters 15601
B^3 F-score (overall) = 0.9815810532568245
B^3 F-score (train) = 0.9876010112437174
B^3 F-score (test) = 0.9812041547435574

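Note that I used an old clusters file and not the best sampling strategy for both runs, so the two sets of numbers are comparable with each other but are not the best achievable scores.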
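
For readers unfamiliar with the metric reported above, here is a minimal sketch of how a B^3 F-score can be computed from a true and a predicted clustering. It follows the standard B^3 definition (per-item precision and recall, averaged over all items, then combined as a harmonic mean). The function name `b3_scores` and the dict-based input format are assumptions made for illustration; this is not necessarily how the project's own metric code is written.

```python
from collections import Counter


def b3_scores(true_labels, pred_labels):
    """Return (precision, recall, f_score) under the standard B^3 definition.

    Both arguments are dicts mapping an item id to a cluster id.
    Hypothetical helper for illustration only.
    """
    items = list(true_labels)
    if set(items) != set(pred_labels):
        raise ValueError("both clusterings must cover the same items")
    if not items:
        raise ValueError("at least one item is required")

    true_sizes = Counter(true_labels.values())
    pred_sizes = Counter(pred_labels.values())
    # Number of items shared by each (true cluster, predicted cluster) pair.
    overlap = Counter((true_labels[i], pred_labels[i]) for i in items)

    precision = 0.0
    recall = 0.0
    for i in items:
        shared = overlap[(true_labels[i], pred_labels[i])]
        # Fraction of the item's predicted cluster that truly belongs with it.
        precision += shared / pred_sizes[pred_labels[i]]
        # Fraction of the item's true cluster that was recovered with it.
        recall += shared / true_sizes[true_labels[i]]

    precision /= len(items)
    recall /= len(items)
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


if __name__ == "__main__":
    truth = {"a": 0, "b": 0, "c": 1}
    predicted = {"a": 0, "b": 1, "c": 1}
    print(b3_scores(truth, predicted))
```

In the toy example, item "b" is placed with "c" instead of "a", so both precision and recall drop below 1. A computed clustering that matches the truth exactly scores 1.0 on all three values.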

glouppe commented 9 years ago

Other than my comments + what we discussed offline, this is good to go once you have done the changes. Thanks!

glouppe commented 9 years ago

Can I merge this?

MSusik commented 9 years ago

Yes, ready.