yiwang454 / prompt_ad_code


Models always predict positive #3

Open yaoxiao1999 opened 1 month ago

yaoxiao1999 commented 1 month ago

Hello,

The outputs of both run_prompt_finetune.py and run_prompt_finetune_test.py showed that the models always predicted positive labels. I tried both BERT and RoBERTa as the PLM.

There's this warning, but I doubt it's the cause of the problem:

UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use zero_division parameter to control this behavior.
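For reference, that warning is just what scikit-learn emits when one class receives no predicted samples, which is consistent with the all-positive predictions below. A minimal reproduction (my own sketch, not code from this repo):

from sklearn.metrics import classification_report

# All-positive predictions: class 0 gets no predicted samples, so its
# precision is undefined and scikit-learn sets it to 0.0 with this warning.
y_true = [0, 1, 0, 1]
y_pred = [1, 1, 1, 1]
print(classification_report(y_true, y_pred))                   # emits UndefinedMetricWarning
print(classification_report(y_true, y_pred, zero_division=0))  # same numbers, warning suppressed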

Here are some of the outputs:

roberta-base_tempmanual1_verbmanual_full_100/version_2/checkpoints/epoch7/test_class_report.csv

precision,recall,f1-score,support
0.0,0.0,0.0,24.0
0.5,1.0,0.6666666666666666,24.0
0.5,0.5,0.5,0.5
0.25,0.5,0.3333333333333333,48.0
0.25,0.5,0.3333333333333333,48.0

roberta-base_tempmanual1_verbmanual_full_100/version_2/checkpoints/epoch7/test_results.csv

id,labels,pred_labels,logits,probas
S191,1,1,"[-17.013824462890625, 0.0]",1.0
S185,1,1,"[-17.208816528320312, 0.0]",1.0
S184,0,1,"[-17.171653747558594, 0.0]",1.0
S190,1,1,"[-16.760562896728516, 0.0]",1.0
S186,0,1,"[-17.0897216796875, 0.0]",1.0
S192,1,1,"[-17.786958694458008, 0.0]",1.0
S179,1,1,"[-17.382776260375977, 0.0]",1.0
S178,0,1,"[-17.05199432373047, 0.0]",1.0
S193,0,1,"[-17.250530242919922, 0.0]",1.0
S187,1,1,"[-17.341604232788086, 0.0]",1.0
S183,0,1,"[-16.963104248046875, 0.0]",1.0
S197,0,1,"[-17.426101684570312, 0.0]",1.0
S168,1,1,"[-17.467975616455078, 0.0]",1.0
S169,1,1,"[-17.246917724609375, 0.0]",1.0
S196,0,1,"[-17.10784339904785, 0.0]",1.0
S182,1,1,"[-17.37942123413086, 0.0]",1.0
S194,1,1,"[-17.32719612121582, 0.0]",1.0
S180,0,1,"[-17.15471649169922, 0.0]",1.0
S181,1,1,"[-17.489299774169922, 0.0]",1.0
S195,0,1,"[-17.271026611328125, 0.0]",1.0
S198,1,1,"[-17.088821411132812, 0.0]",1.0
S173,1,1,"[-17.162525177001953, 0.0]",1.0
S167,1,1,"[-16.777942657470703, 0.0]",1.0
S205,1,1,"[-17.280344009399414, 0.0]",1.0
S204,0,1,"[-17.570472717285156, 0.0]",1.0
S166,0,1,"[-16.379688262939453, -1.1920930376163597e-07]",0.9999998807907104
S172,0,1,"[-17.208892822265625, 0.0]",1.0
S199,0,1,"[-17.22469139099121, 0.0]",1.0
S164,1,1,"[-17.543907165527344, 0.0]",1.0
S170,0,1,"[-18.19196319580078, 0.0]",1.0
S206,0,1,"[-16.8443546295166, 0.0]",1.0
S207,0,1,"[-17.230087280273438, 0.0]",1.0
S171,1,1,"[-17.202341079711914, 0.0]",1.0
S165,1,1,"[-17.40441131591797, 0.0]",1.0
S161,0,1,"[-17.302696228027344, 0.0]",1.0
S175,0,1,"[-17.135976791381836, 0.0]",1.0
S203,1,1,"[-17.166648864746094, 0.0]",1.0
S202,0,1,"[-17.2541446685791, 0.0]",1.0
S174,0,1,"[-17.102859497070312, 0.0]",1.0
S160,0,1,"[-17.246959686279297, 0.0]",1.0
S189,1,1,"[-17.24028968811035, 0.0]",1.0
S176,1,1,"[-17.115135192871094, 0.0]",1.0
S162,1,1,"[-17.42736053466797, 0.0]",1.0
S200,1,1,"[-17.705970764160156, 0.0]",1.0
S201,0,1,"[-17.091449737548828, 0.0]",1.0
S163,0,1,"[-16.920034408569336, 0.0]",1.0
S177,0,1,"[-17.251707077026367, 0.0]",1.0
S188,1,1,"[-17.18240737915039, 0.0]",1.0

bert-base-uncased_tempmanual1_verbmanual_full_100/version_1_val/checkpoints/epoch9/test_class_report_cv1_fold8.csv

precision,recall,f1-score,support
0.0,0.0,0.0,5.0
0.5,1.0,0.6666666666666666,5.0
0.5,0.5,0.5,0.5
0.25,0.5,0.3333333333333333,10.0
0.25,0.5,0.3333333333333333,10.0

bert-base-uncased_tempmanual1_verbmanual_full_100/version_1_val/checkpoints/epoch9/test_results_cv1_fold8.csv

id,labels,pred_labels,logits,probas
S040,0,1,"[-15.645161628723145, -1.1920930376163597e-07]",0.9999998807907104
S064,0,1,"[-15.673526763916016, -1.1920930376163597e-07]",0.9999998807907104
S061,0,1,"[-16.19318389892578, -1.1920930376163597e-07]",0.9999998807907104
S038,0,1,"[-15.856155395507812, -1.1920930376163597e-07]",0.9999998807907104
S015,0,1,"[-16.03573989868164, -1.1920930376163597e-07]",0.9999998807907104
S151,1,1,"[-15.546436309814453, -1.1920930376163597e-07]",0.9999998807907104
S093,1,1,"[-15.4844970703125, -2.3841860752327193e-07]",0.9999997615814209
S096,1,1,"[-15.949499130249023, -1.1920930376163597e-07]",0.9999998807907104
S135,1,1,"[-15.73129940032959, -1.1920930376163597e-07]",0.9999998807907104
S128,1,1,"[-15.96599006652832, -1.1920930376163597e-07]",0.9999998807907104
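As a side note on reading these dumps: probas appears to be exp() of the second entry of logits, i.e., the logits columns look like log-softmax outputs rather than raw logits, which is why nearly every row shows a class-1 probability of ~1.0. A quick check (my own sketch):

import math

# Row S093 from the dump above.
log_probs = [-15.4844970703125, -2.3841860752327193e-07]
print(math.exp(log_probs[1]))  # 0.9999997615814209 -- matches the probas column
print(math.exp(log_probs[0]))  # ~1.9e-07 -- the implied class-0 probability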

Could you please help? Thank you!

yiwang454 commented 1 month ago

Hi Yao, I'm wondering: have you fine-tuned BERT or RoBERTa first? In this project, the prompt-based AD detection paradigm only worked when I used BERT or RoBERTa models that had already been fine-tuned on the ADReSS training data. The fine-tuning code is included in this repo as well. If you are already running it on a fine-tuned BERT / RoBERTa model, would you mind checking the gradients during fine-tuning to see whether they actually propagate into the model?
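Something like this after a backward pass would show whether any gradient reaches the model (a toy sketch, with nn.Linear standing in for the PLM):

import torch
from torch import nn

# Toy stand-in for the PLM; the same loop works on any nn.Module.
model = nn.Linear(4, 2)
loss = model(torch.randn(8, 4)).sum()
loss.backward()
for name, param in model.named_parameters():
    grad_norm = param.grad.norm().item() if param.grad is not None else 0.0
    print(f"{name}: grad L2 norm = {grad_norm:.4g}")  # all-zero norms = no signal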

yaoxiao1999 commented 1 month ago

Hi, thank you for replying. I made sure that do_training is True, and the logs also suggest that fine-tuning happened. Or do you mean that I should fine-tune BERT/RoBERTa before the prompt-based fine-tuning, i.e., provide the fine-tuned model in off_line_model_dir? If so, could you point me to the fine-tuning code? I'm struggling to find it.

Thank you!

yiwang454 commented 1 month ago

I double-checked my code and realised that I removed some hard-coded local model paths to simplify the codebase when putting it on GitHub ... So after fine-tuning, when you run inference with the script run_prompt_finetune_test.py, you may want to change the following two lines in prompt_finetune.py:

model_dict = {'bert-base-uncased': os.path.join(args.off_line_model_dir, 'bert-base-uncased'),
            'roberta-base': os.path.join(args.off_line_model_dir, 'roberta-base'),}

into lines like this:

model_dict = {'bert-base-uncased': os.path.join(args.off_line_model_dir, 'bert-base-uncased'),
            'bert-tuned28': os.path.join(args.off_line_model_dir, 'bert_post_train/epoch28'),
            'bert-tuned29': os.path.join(args.off_line_model_dir, 'bert_post_train/epoch29'),
            'bert-tuned30': os.path.join(args.off_line_model_dir, 'bert_post_train/epoch30'),}

Here I stored the fine-tuned models in the bert_post_train subdirectory of the off_line_model_dir folder, so you'll want to point model_dict at your own fine-tuned model directories and set the --model_name argument to the corresponding key in model_dict. (For example, to use the model fine-tuned for 28 epochs and stored at bert_post_train/epoch28, I would set --model_name bert-tuned28.)
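So a test run would look something like this (the paths are placeholders, and I'm assuming the argument names as they appear in the current scripts):

python run_prompt_finetune_test.py --model_name bert-tuned28 --off_line_model_dir /path/to/your/models/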

But apart from the change above, you probably don't need to fine-tune BERT / RoBERTa separately before the prompt-based fine-tuning. Please see if this solves the problem, and sorry about the inconvenience.