@Jaycolas It mostly depends on your data; in my case, recall tends to be lower than precision. Here is an info print from my validation step:
```
2018-05-22 19:53:18,572 - INFO - ☛ Predict by threshold: Recall 0.689941, Precision 0.793666, F 0.738178
2018-05-22 19:53:18,572 - INFO - Top1: Recall 0.592971, Precision 0.815848, F 0.68678
2018-05-22 19:53:18,572 - INFO - Top2: Recall 0.826358, Precision 0.615374, F 0.705428
2018-05-22 19:53:18,572 - INFO - Top3: Recall 0.929606, Precision 0.483677, F 0.636291
2018-05-22 19:53:18,573 - INFO - Top4: Recall 0.96868, Precision 0.381104, F 0.547002
2018-05-22 19:53:18,573 - INFO - Top5: Recall 0.9897, Precision 0.314537, F 0.477363
```
As you can see, with Top-K prediction the recall goes up and, unsurprisingly, the precision goes down as K increases from 1 to 5.
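For reference, here is a minimal sketch (not the repository's code) of how threshold-based and Top-K prediction can be compared on multi-label outputs; the names `scores`, `y_true`, and the synthetic data are assumptions for illustration only. With random data it shows the same trend: recall rises and precision falls as K grows.

```python
# Sketch: compare threshold-based vs Top-K prediction for multi-label outputs.
# Assumes `scores` are per-label sigmoid outputs and `y_true` is a 0/1 matrix.
import numpy as np

def micro_prf(y_true, y_pred):
    """Micro-averaged precision, recall, and F1 over all labels."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    precision = tp / (tp + fp + 1e-12)
    recall = tp / (tp + fn + 1e-12)
    f1 = 2 * precision * recall / (precision + recall + 1e-12)
    return precision, recall, f1

def predict_by_threshold(scores, threshold=0.5):
    """Predict every label whose score reaches the threshold."""
    return (scores >= threshold).astype(int)

def predict_by_topk(scores, k=1):
    """Predict the K highest-scoring labels for every sample."""
    y_pred = np.zeros_like(scores, dtype=int)
    top_idx = np.argsort(-scores, axis=1)[:, :k]
    np.put_along_axis(y_pred, top_idx, 1, axis=1)
    return y_pred

# Synthetic example: scores are loosely correlated with the true labels.
rng = np.random.default_rng(0)
y_true = (rng.random((1000, 20)) < 0.15).astype(int)
scores = np.clip(0.35 * y_true + 0.6 * rng.random((1000, 20)), 0, 1)

print("Threshold 0.5:", micro_prf(y_true, predict_by_threshold(scores, 0.5)))
for k in range(1, 6):
    print(f"Top{k}:", micro_prf(y_true, predict_by_topk(scores, k)))
```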
If you lower the threshold (default: 0.5), you will get higher recall and lower precision, but possibly a higher F1. Here is a sample using different thresholds in my test step:
```
Step 80000 Recall 0.708631 Precision 0.699376 F 0.703972 (Threshold = 0.4)
Step 80000 Recall 0.748062 Precision 0.679636 F 0.712209 (Threshold = 0.3)
```
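If you want the threshold that maximizes F1, one common approach (a sketch using scikit-learn, not the author's code; the function name `best_threshold` is made up) is to sweep candidate thresholds on a held-out validation set and reuse the best one at test time:

```python
# Sketch: pick the decision threshold that maximizes micro-averaged F1
# on validation data, then apply it when evaluating the test set.
import numpy as np
from sklearn.metrics import f1_score

def best_threshold(y_true, scores, candidates=np.arange(0.1, 0.91, 0.05)):
    """Return (threshold, f1) with the highest validation micro-F1."""
    best_t, best_f1 = 0.5, 0.0
    for t in candidates:
        y_pred = (scores >= t).astype(int)
        f1 = f1_score(y_true, y_pred, average="micro", zero_division=0)
        if f1 > best_f1:
            best_t, best_f1 = float(t), f1
    return best_t, best_f1
```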
So if you want to optimize F1, you can tune the prediction threshold (or the Top-K value) on a validation set. Beyond that, there are many other ways to improve the performance.
Hi,
I am doing a similar project to yours. I took a look at your text_cnn, and it looks like the loss function you use is cross-entropy. I wonder what precision and recall look like when your loss starts to converge? Did you optimize F1 specifically? Thanks!