wellcometrust / deep_reference_parser

A deep learning model for extracting references from text
MIT License

Preparing for new release #32

Closed lizgzil closed 4 years ago

lizgzil commented 4 years ago

Just updating version.py and the Makefile for a release with the 18th March data and the 5th April multitask model.

I've set the version number to 5th April to make it clear that this is the latest model training date, but perhaps it should match the creation date of the release instead?

ivyleavedtoadflax commented 4 years ago

I think that because model iterations have moved at a much faster rate than releases, if you rename the model to match the release, there will be two model runs with the same version, which might be a bit confusing :man_shrugging:

lizgzil commented 4 years ago

@ivyleavedtoadflax in this case I'm renaming the version to match the model.

lizgzil commented 4 years ago

@aCampello good point - I forgot to add the config file! (which says where the data is from)

ivyleavedtoadflax commented 4 years ago

Hey @lizgzil, sorry I got to this a bit late. It would be good to update the README with the latest scores for that model run, which were:

              precision    recall  f1-score   support
      author     0.9432    0.9485    0.9458      5902
           o     0.9297    0.9519    0.9407     20664
       title     0.9227    0.8788    0.9002     11016
        year     0.8747    0.8662    0.8704       725
   micro avg     0.9289    0.9288    0.9288     38307
   macro avg     0.9176    0.9114    0.9143     38307
weighted avg     0.9287    0.9288    0.9285     38307
2020-04-13 13:19:10 deep_reference_parser.logger INFO: Metrics not taking padding into account:
2020-04-13 13:19:10 deep_reference_parser.logger INFO: 
              precision    recall  f1-score   support
         b-r     0.9134    0.9087    0.9111       789
         e-r     0.8989    0.8597    0.8788       734
         i-r     0.9611    0.9844    0.9726     26645
           o     0.9632    0.9050    0.9332     10139
   micro avg     0.9595    0.9594    0.9595     38307
   macro avg     0.9341    0.9145    0.9239     38307
weighted avg     0.9595    0.9594    0.9591     38307

Also, it looks like continuous integration and code coverage are no longer reporting to PRs. I'm not sure whether this is because I was the one who set them up. You could try making me a collaborator to re-enable them, but in theory they should be set up through the Wellcome account.
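
When copying the scores into the README, it may help to sanity-check the averaged rows against the per-class rows. The sketch below is not part of deep_reference_parser: it re-derives the `macro avg` and `weighted avg` rows of the first table (author / o / title / year) from the per-class figures quoted above. The helper names `macro_avg` and `weighted_avg` are made up for illustration, and because the inputs are already rounded to four decimals, the results may differ from the reported values in the last digit.

```python
# Per-class rows copied from the first table above:
# label -> (precision, recall, f1-score, support)
rows = {
    "author": (0.9432, 0.9485, 0.9458, 5902),
    "o": (0.9297, 0.9519, 0.9407, 20664),
    "title": (0.9227, 0.8788, 0.9002, 11016),
    "year": (0.8747, 0.8662, 0.8704, 725),
}


def macro_avg(rows, col):
    """Unweighted mean of one metric column across classes (hypothetical helper)."""
    values = [r[col] for r in rows.values()]
    return sum(values) / len(values)


def weighted_avg(rows, col):
    """Mean of one metric column weighted by each class's support (hypothetical helper)."""
    total = sum(r[3] for r in rows.values())
    return sum(r[col] * r[3] for r in rows.values()) / total


# The support column of the averaged rows is the sum of the class supports.
total_support = sum(r[3] for r in rows.values())
print("total support:", total_support)

for col, name in enumerate(["precision", "recall", "f1-score"]):
    print(
        f"{name:>10}  macro={macro_avg(rows, col):.4f}  "
        f"weighted={weighted_avg(rows, col):.4f}"
    )
```

The macro average treats the small `year` class (725 tokens) the same as `o` (20664 tokens), which is why it sits below the weighted average here.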