Open XiJinping01 opened 3 years ago
I think you can try adjusting the threshold. Setting it to around 8.3 gave better performance in my experiments.

------------------ Original ------------------
From: @.>; Date: Wed, Jul 14, 2021 10:04 AM To: @.>; Cc: @.***>; Subject: [ChestnutWYN/ACL2021-Novel-Slot-Detection] reproduce your results (#1)
I read your code and tried to reproduce the results you reported in the paper. Here are the changes I made.
In `main.py`, in the last 3 lines, I changed `parse_token` to `parse_line` to run the SpanF1 evaluation.
My script is:

    --mode test \
    --dataset SnipsNSD5% \
    --threshold 8.0 \
    --output_dir ./output_both \
    --batch_size 256 \
    --cuda 0

The result I got:

    {
      "precision-overall": 0.7729279058361942,
      "recall-overall": 0.8814317673378076,
      "f1-overall": 0.8236216357459654,
      "precision-nsd": 0.17073170731707313,
      "recall-nsd": 0.4064516129032255,
      "f1-nsd": 0.2404580152671338,
      "precision-ind": 0.9059880239520958,
      "recall-ind": 0.9265156154317208,
      "f1-ind": 0.9161368452921087
    }

Are `f1-nsd` and `f1-ind` SpanF1? `f1-nsd` seems much lower than reported. Thank you!
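
For anyone else wondering why the distinction matters: span-level F1 only counts a slot as correct when its boundaries and type both match exactly, so it can be much lower than token-level F1 on the same predictions. Below is a minimal sketch of the two metrics on BIO tags. It is not the repository's `parse_line` / `parse_token` code; the helper names and the `ns` / `artist` tag names are assumptions made for illustration.

```python
from typing import List, Optional, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, slot_type)


def prf(tp: int, n_pred: int, n_gold: int) -> Tuple[float, float, float]:
    """Precision, recall and F1 from raw counts (0.0 when undefined)."""
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect (start, end, type) spans from a BIO tag sequence."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.add((start, i, label))
        if tag.startswith(("B-", "I-")):  # a stray I- tag opens a new span
            start, label = i, tag[2:]
        else:
            start, label = None, None
    return spans


def span_f1(gold: List[str], pred: List[str], slot: Optional[str] = None):
    """Span-level P/R/F1; a span counts only on an exact boundary and type match."""
    g, p = bio_to_spans(gold), bio_to_spans(pred)
    if slot is not None:  # restrict scoring to one slot type, e.g. novel slots
        g = {s for s in g if s[2] == slot}
        p = {s for s in p if s[2] == slot}
    return prf(len(g & p), len(p), len(g))


def token_f1(gold: List[str], pred: List[str], slot: str):
    """Token-level P/R/F1 for one slot type, ignoring the B-/I- prefix."""
    def hit(tag: str) -> bool:
        return tag != "O" and tag[2:] == slot
    tp = sum(hit(a) and hit(b) for a, b in zip(gold, pred))
    return prf(tp, sum(hit(b) for b in pred), sum(hit(a) for a in gold))


# Hypothetical example: the predicted novel-slot span is one token too short.
gold = ["O", "B-ns", "I-ns", "I-ns", "O", "B-artist"]
pred = ["O", "B-ns", "I-ns", "O", "O", "B-artist"]
print("token-level ns:", token_f1(gold, pred, "ns"))
print("span-level ns: ", span_f1(gold, pred, "ns"))
```

In this toy case the token-level F1 for `ns` is about 0.8 while the span-level F1 is 0, because the single predicted span misses the gold boundary by one token. A gap of that kind between the two metrics is expected for long novel-slot spans.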
After setting the threshold to 8.3, I got:
"precision-overall": 0.8368421052631579,
"recall-overall": 0.889261744966443,
"f1-overall": 0.8622559652927917,
"precision-nsd": 0.262443438914027,
"recall-nsd": 0.3741935483870965,
"f1-nsd": 0.3085106382978237,
"precision-ind": 0.912447885646218,
"recall-ind": 0.9381506429883649,
"f1-ind": 0.9251207729468099
These are closer to the reported results. Thanks!
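
As a quick sanity check on the numbers posted in this thread, the reported `f1-nsd` values can be recomputed from the reported precision and recall with the usual harmonic-mean formula. This is only a sketch that copies the figures from the two runs above; the `f1_score` helper and the `reported` dictionary are names made up for this check, not part of the repository.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# NSD metrics copied from the two runs posted in this thread, keyed by threshold.
reported = {
    8.0: {
        "precision-nsd": 0.17073170731707313,
        "recall-nsd": 0.4064516129032255,
        "f1-nsd": 0.2404580152671338,
    },
    8.3: {
        "precision-nsd": 0.262443438914027,
        "recall-nsd": 0.3741935483870965,
        "f1-nsd": 0.3085106382978237,
    },
}

for threshold, metrics in reported.items():
    recomputed = f1_score(metrics["precision-nsd"], metrics["recall-nsd"])
    # A tiny difference is expected if the evaluation script adds a smoothing term.
    print(f"threshold={threshold}: reported={metrics['f1-nsd']:.6f} "
          f"recomputed={recomputed:.6f}")
```

The comparison also shows the trade-off in these two runs: going from 8.0 to 8.3 raised NSD precision (about 0.17 to 0.26) and lowered NSD recall (about 0.41 to 0.37), with a net gain in `f1-nsd`.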