My bad, I forgot to set the F1 score to macro F1:

```python
f1_score(out_label_ids, preds, average='macro')
```

and the results are:
```
I/E F1: 0.6680585160428386
S/N F1: 0.6327190984052741
T/F F1: 0.7980612647061289
P/J F1: 0.6506210984344566
Average F1: 0.6873649943971746
```
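For reference, a minimal self-contained sketch of this per-dimension macro F1 computation (the array names and shapes are assumptions, with random data standing in for real predictions):

```python
import numpy as np
from sklearn.metrics import f1_score

# Assumed layout: one binary column per MBTI dimension (I/E, S/N, T/F, P/J).
rng = np.random.default_rng(0)
out_label_ids = rng.integers(0, 2, size=(100, 4))
preds = rng.integers(0, 2, size=(100, 4))

dims = ['I/E', 'S/N', 'T/F', 'P/J']
f1s = [f1_score(out_label_ids[:, i], preds[:, i], average='macro')
       for i in range(len(dims))]
for dim, f1 in zip(dims, f1s):
    print(f'{dim} F1: {f1}')
print(f'Average F1: {np.mean(f1s)}')
```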
For the test step, you can modify main.py: add

```python
from trainer import train, get_labels, test
```

and after line 203 add:

```python
from collections import OrderedDict  # if not already imported in main.py

new_state_dict = OrderedDict()
for k, v in model_state_dict.items():
    name = k[7:]  # strip the leading 'module.' prefix (7 characters)
    new_state_dict[name] = v
model.load_state_dict(new_state_dict)
_, _, ave_f1, _ = test(args, test_dataset, model)
```
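(The `module.` prefix appears when a checkpoint is saved from a model wrapped in `torch.nn.DataParallel`, which prepends `module.` to every parameter name, so it has to be stripped before loading the weights into an unwrapped model.)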
The test results are recorded in `output1/test_result.txt`. We get the following results:
| Dimension | acc | f1 |
| --- | --- | --- |
| 1 | 0.8005763688760807 | 0.7022590833521805 |
| 2 | 0.8564841498559078 | 0.6688497296849814 |
| 3 | 0.813256484149856 | 0.8112473910275524 |
| 4 | 0.7095100864553314 | 0.6812029154161818 |

test_ave_f1 = 0.715889779870224
Here are the hyperparameter settings:

```
07/07/2023 07:32:57 - INFO - __main__ - hyperparameter = Namespace(adam_epsilon=1e-06, all_gpu_eval_batch_size=32, all_gpu_train_batch_size=8, alpha_learning_rate=0.01, d_model=768, device=device(type='cuda'), dropout=0.2, final_hidden_size=128, gcn_dropout=0.2, gcn_hidden_size=768, gcn_mem_dim=64, gcn_num_layers=2, gm_learning_rate=1e-05, gradient_accumulation_steps=1, l0=False, learning_rate=1e-05, logging_steps=25, max_alpha=100, max_grad_norm=1.0, max_len=70, max_post=50, max_steps=-1, model_dir='../scr4/bert-base-cased', no_dart=False, no_special_node=False, num_classes=2, num_mlps=2, num_train_epochs=30.0, option='test', other_learning_rate=0.001, output_dir='output1', pretrain_type='bert', seed=321, single_hop=False, task='kaggle')
```
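For anyone reproducing this, a `Namespace` like the one above is typically built with `argparse`; a minimal sketch covering a few of the logged options (defaults copied from the log, the actual parser in `main.py` may differ):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--learning_rate', type=float, default=1e-05)
parser.add_argument('--num_train_epochs', type=float, default=30.0)
parser.add_argument('--seed', type=int, default=321)
parser.add_argument('--output_dir', type=str, default='output1')
parser.add_argument('--option', type=str, default='test')

args = parser.parse_args([])  # empty list: use the defaults above
print(args)
```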
When I use your checkpoint `best_f1_dggcn_kaggle_321.pth`, there is a missing key `pretrain_models.embeddings.position_ids`, and there seems to be no code that computes F1. So I changed line 200 in scr/main.py to

```python
model.load_state_dict(new_state_dict, strict=False)
```
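With `strict=False`, `load_state_dict` returns the incompatible keys, which makes it easy to confirm that only the `position_ids` buffer is skipped (this kind of missing key usually comes from a `transformers` version mismatch, since `position_ids` is a registered buffer in some versions). A small check, reusing the names from the snippet above:

```python
# load_state_dict(..., strict=False) returns a NamedTuple with
# missing_keys and unexpected_keys fields.
result = model.load_state_dict(new_state_dict, strict=False)
print('missing keys:', result.missing_keys)       # expect only position_ids
print('unexpected keys:', result.unexpected_keys)
```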
And I added the following F1 computation at the end of the test loop.
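A minimal sketch of what such a computation could look like (every name and shape here is an assumption, with random tensors standing in for real batches, not the repository's actual code):

```python
import numpy as np
import torch
from sklearn.metrics import f1_score

# Accumulate predictions and labels over the test loop, then compute
# macro F1 once at the end.
preds_list, labels_list = [], []
for _ in range(3):                      # stands in for the test dataloader
    logits = torch.randn(8, 2)          # assumed model outputs for one batch
    labels = torch.randint(0, 2, (8,))  # assumed gold labels for one batch
    preds_list.append(logits.argmax(dim=-1).cpu().numpy())
    labels_list.append(labels.cpu().numpy())

preds = np.concatenate(preds_list)
out_label_ids = np.concatenate(labels_list)
print('f1 =', f1_score(out_label_ids, preds, average='macro'))
```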
The output I get is