Open ArneBinder opened 2 months ago
command:
python src/utils/prepare_data.py \
--input_dir=data/dataset \
--output_dir=data/dataset_prepared_all_for_evaluation \
--add_gold_data \
--re_revert_ra_relations \
--re_remove_none_relations
Notes:
This requires #27.
command:
python src/evaluation/eval_official.py \
--mode=illocutions \
--predictions_dir=data/dataset_prepared_all_for_evaluation \
--gold_dir=data/dataset \
--silent
result:
Processing nodesets: 100%|██████████| 1405/1405 [00:04<00:00, 325.11it/s]
general.p: 0.966773671100999
general.r: 0.9501140461143109
general.f1: 0.9570199945849113
focused.p: 0.8514861887815619
focused.r: 0.8342946934622179
focused.f1: 0.8414676836345346
INFO:src.utils.nodeset_utils:Successfully processed 1405 nodesets (0 blacklisted). Failed to process the following nodesets (0): []
command:
python src/evaluation/eval_official.py \
--mode=arguments \
--predictions_dir=data/dataset_prepared_all_for_evaluation \
--gold_dir=data/dataset \
--silent
result:
Processing nodesets: 34%|███▍ | 478/1405 [00:00<00:01, 555.35it/s]WARNING:__main__:nodeset_id=19737: No focused true relations found
Processing nodesets: 55%|█████▌ | 773/1405 [00:01<00:01, 573.24it/s]WARNING:__main__:nodeset_id=21566: No focused true relations found
Processing nodesets: 96%|█████████▋| 1353/1405 [00:02<00:00, 533.15it/s]WARNING:__main__:nodeset_id=18797: No focused true relations found
Processing nodesets: 100%|██████████| 1405/1405 [00:02<00:00, 550.13it/s]
INFO:src.utils.nodeset_utils:Successfully processed 1405 nodesets (0 blacklisted). Failed to process the following nodesets (0): []
general.p: 0.9668834406785285
general.r: 0.894833091154266
general.f1: 0.9171871425281932
focused.p: 0.8104196386115056
focused.r: 0.7353118115109915
focused.f1: 0.7591900472716043
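A quick sanity check on the numbers above: the reported f1 values are not the harmonic mean of the reported p and r. One plausible explanation, which is an assumption and not confirmed by the output, is that eval_official.py averages p, r and f1 per nodeset, in which case the averaged f1 generally differs from the f1 computed from the averaged p and r. The sketch below only redoes the arithmetic on values copied from the results; `harmonic_f1` and `f1_gaps` are hypothetical helper names, not part of the repository.

```python
# Values copied verbatim from the full-dataset results above: (p, r, f1).
reported = {
    "illocutions/general": (0.966773671100999, 0.9501140461143109, 0.9570199945849113),
    "illocutions/focused": (0.8514861887815619, 0.8342946934622179, 0.8414676836345346),
    "arguments/general": (0.9668834406785285, 0.894833091154266, 0.9171871425281932),
    "arguments/focused": (0.8104196386115056, 0.7353118115109915, 0.7591900472716043),
}


def harmonic_f1(p: float, r: float) -> float:
    """F1 as the harmonic mean of one precision/recall pair (hypothetical helper)."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0


def f1_gaps(values: dict) -> dict:
    """Difference between the harmonic mean of (p, r) and the reported f1."""
    return {name: harmonic_f1(p, r) - f1 for name, (p, r, f1) in values.items()}


for name, gap in f1_gaps(reported).items():
    print(f"{name}: harmonic f1 of (p, r) minus reported f1 = {gap:.4f}")
```

If the gaps are non-zero, the three numbers per row should be read as independently averaged quantities rather than as one consistent (p, r, f1) triple.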
TODO:
requires: #39
python src/utils/prepare_data.py \
--input_dir=data/dataset \
--output_dir=data/dataset_prepared_val_for_evaluation \
--add_gold_data \
--re_revert_ra_relations \
--re_remove_none_relations \
--nodeset_whitelist="18275,20529,23789,25500,19345,21334,19214,17966,19084,20878,23837,18799,18889,19161,23810,21604,18265,23823,23892,20533,23604,23954,21566,21048,21285,18474,23728,23730,23797,23154,20740,18308,23537,19099,18756,23624,20519,23818,20318,23120,21406,21584,19763,23744,23834,19103,18788,23706,21585,23559,23767,21598,25518,22748,20338,23753,23494,17960,21408,21480,23904,18760,23802,20313,20513,21316,23830,19186,23718,19310,25556,19325,19153,18318,20830,21049,25424,21409,23484,23923,23808,21075,21283,23587,23600,21317,21607,18761,21709,23447,23497,19762,23953,20534,25384,19748,23917,21033,23768,23502,23606,21300,19078,19089,21568,19350,21660,18804,17941,21438,19210,23842,23615,21047,25722,21589,20492,20980,23280,21648,20500,25903,19340,25553,23918,20840,23584,21308,20865,23582,23602,19220,23599,21454,23508,21028,21277,23137,18482,21027"
python src/evaluation/eval_official.py \
--gold_dir=data/train/ \
--predictions_dir=data/dataset_prepared_val_for_evaluation/ \
--mode=arguments
general.p: 0.9712352572157533
general.r: 0.9117948618841476
general.f1: 0.9303856279239436
focused.p: 0.8285371702637893
focused.r: 0.7660763596914678
focused.f1: 0.7861181973697844
python src/evaluation/eval_official.py \
--gold_dir=data/train/ \
--predictions_dir=data/dataset_prepared_val_for_evaluation/ \
--mode=illocutions
general.p: 0.9671865717294916
general.r: 0.9486609426468183
general.f1: 0.9561468513010287
focused.p: 0.8417857142857142
focused.r: 0.8227085616944377
focused.f1: 0.8304712194442135
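To compare runs without copying numbers by hand, the metric lines can be parsed into a dictionary. This is a minimal sketch that assumes the output keeps the `scope.metric: value` shape shown in the results above; `parse_metrics` is a hypothetical helper, not part of the repository.

```python
import re

# Matches e.g. "general.f1: 0.93", whether metrics are on one line or one per line.
METRIC_RE = re.compile(r"\b(general|focused)\.(p|r|f1):\s*([0-9]*\.?[0-9]+)")


def parse_metrics(text: str) -> dict:
    """Collect all 'scope.metric: value' pairs found in the given output text."""
    return {
        f"{scope}.{name}": float(value)
        for scope, name, value in METRIC_RE.findall(text)
    }


# Values copied from the validation-subset run with --mode=arguments.
val_arguments = parse_metrics(
    "general.p: 0.9712352572157533 general.r: 0.9117948618841476 "
    "general.f1: 0.9303856279239436 focused.p: 0.8285371702637893 "
    "focused.r: 0.7660763596914678 focused.f1: 0.7861181973697844"
)

for key, value in val_arguments.items():
    print(f"{key}: {value:.4f} (gap to a perfect score: {1 - value:.4f})")
```

Since the predictions here are the prepared gold data, the gap to a perfect score is one way to read how much the preprocessing alone costs.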
We do some sophisticated preprocessing that relies on several simplifying assumptions. We should estimate how strongly this impacts our overall setup by evaluating the prepared data against the original data.
approach: run prepare_data.py with add_gold_data=true, but also with re_revert_ra_relations=true and re_remove_none_relations=true
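The approach can be sketched as a small driver that prepares the data once and then runs the official evaluation in both modes. The script paths and flags are the ones used in the commands in this issue; `build_commands` and `run_all` are hypothetical helpers, and the driver defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess
import sys


def build_commands(input_dir: str, prepared_dir: str) -> list:
    """Return the prepare command followed by one eval command per mode."""
    prepare = [
        sys.executable, "src/utils/prepare_data.py",
        f"--input_dir={input_dir}",
        f"--output_dir={prepared_dir}",
        "--add_gold_data",
        "--re_revert_ra_relations",
        "--re_remove_none_relations",
    ]
    evals = [
        [
            sys.executable, "src/evaluation/eval_official.py",
            f"--mode={mode}",
            f"--predictions_dir={prepared_dir}",
            f"--gold_dir={input_dir}",
            "--silent",
        ]
        for mode in ("illocutions", "arguments")
    ]
    return [prepare, *evals]


def run_all(commands: list, dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing step


if __name__ == "__main__":
    run_all(build_commands("data/dataset", "data/dataset_prepared_all_for_evaluation"))
```

Calling `run_all(..., dry_run=False)` from the repository root would execute the three steps in order.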