Hi! In CE loss, the prediction is usually obtained by applying softmax. However, I haven't found any softmax applied to pred_u_w_corr in your code, which means pred_u_w_corr holds the model's logits rather than probabilities. Instead, I see that softmax is applied to the correlation map. This seems a little unusual to me, and I wonder why you chose to define the loss function this way. Thanks!