Closed kylebgorman closed 2 months ago
This is now ready for review.
I ran a series of evaluations using the same hyperparameters I give in the example configs. These runs are completely untuned, but the results are very promising, and I think they exceed those reported in the thesis.
en, EWT:
BERT:
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric ┃ DataLoader 0 ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ test_feats_accuracy │ 0.9625540375709534 │
│ test_lemma_accuracy │ 0.9711591601371765 │
│ test_upos_accuracy │ 0.9596463441848755 │
│ test_xpos_accuracy │ 0.9554420709609985 │
└───────────────────────────┴───────────────────────────┘
MWEs removed, BERT:
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric ┃ DataLoader 0 ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ test_feats_accuracy │ 0.9624212980270386 │
│ test_lemma_accuracy │ 0.974416196346283 │
│ test_upos_accuracy │ 0.9590739011764526 │
│ test_xpos_accuracy │ 0.9553279876708984 │
└───────────────────────────┴───────────────────────────┘
ru, SynTagRus:
UD features, mBERT:
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric ┃ DataLoader 0 ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ test_feats_accuracy │ 0.9279443621635437 │
│ test_lemma_accuracy │ 0.9742450714111328 │
│ test_upos_accuracy │ 0.98211270570755 │
└───────────────────────────┴───────────────────────────┘
UD features, XLM-RoBERTa:
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric ┃ DataLoader 0 ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ test_feats_accuracy │ 0.936280369758606 │
│ test_lemma_accuracy │ 0.9777073264122009 │
│ test_upos_accuracy │ 0.9837583899497986 │
└───────────────────────────┴───────────────────────────┘
UD features, RuBERT:
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric ┃ DataLoader 0 ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ test_feats_accuracy │ 0.9382235407829285 │
│ test_lemma_accuracy │ 0.9699346423149109 │
│ test_upos_accuracy │ 0.985562264919281 │
└───────────────────────────┴───────────────────────────┘
Here goes nothing.
This monster PR makes UDTube a properly installable Python package and fixes a bunch of other nuisance issues.
It does not exactly close #3, but the indexes/label encoders are now all stored in the same file, which should make that easier to implement.
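To illustrate the single-file idea, here is a minimal sketch of keeping every task's label vocabulary in one file and rebuilding the label-to-index encoders from it. This is not UDTube's actual code or file format: the function names `save_indexes` and `load_encoders`, the JSON layout, the `indexes.json` filename, and the toy labels are all assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path


def save_indexes(path, indexes):
    """Writes every task's label vocabulary to one JSON file (hypothetical format)."""
    with open(path, "w", encoding="utf-8") as sink:
        json.dump(indexes, sink, ensure_ascii=False, indent=2)


def load_encoders(path):
    """Reads the file back and builds a label-to-index mapping per task."""
    with open(path, encoding="utf-8") as source:
        indexes = json.load(source)
    return {
        task: {label: i for i, label in enumerate(labels)}
        for task, labels in indexes.items()
    }


# Toy vocabularies; a real run would collect these from the training data.
indexes = {
    "upos": ["NOUN", "VERB", "ADJ"],
    "xpos": ["NN", "VB", "JJ"],
    "feats": ["_", "Number=Sing", "Number=Plur"],
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "indexes.json"
    save_indexes(path, indexes)
    encoders = load_encoders(path)

print(encoders["upos"]["VERB"])
```

With one file per model, anything that needs the encoders only has to track a single path instead of one path per task.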