Dear author,

Thank you for this nice code. I have a question about it: `snmn/models_clevr_snmn/model.py` contains a `sharpen_loss` function that I don't see mentioned in the paper. Could you explain briefly what it means? Why do you minimize the cross-entropy between `max(flat_probs, 1e-5)` and `flat_probs`? Is the idea to encourage the probabilities not to fall below `1e-5`, so that it acts as a regularizer?

Thank you for your help,

Best,
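P.S. For reference, here is roughly what I understand the loss to compute, written as a NumPy sketch. This is my paraphrase, not the repo's actual TensorFlow code; the name `flat_probs` follows the repo, everything else is my assumption:

```python
import numpy as np

def sharpen_loss(flat_probs, eps=1e-5):
    # flat_probs: (N, num_modules), each row a probability
    # distribution over modules (rows sum to 1).
    # Clipping at eps keeps log() finite when a probability is 0;
    # apart from that clip, the quantity is just the entropy of
    # each row, so minimizing it rewards near-one-hot ("sharp")
    # module-weight distributions.
    clipped = np.maximum(flat_probs, eps)
    return np.mean(-np.sum(flat_probs * np.log(clipped), axis=-1))
```

Under this reading, the `1e-5` would be there for numerical stability of the logarithm rather than being the regularizer itself: a uniform distribution over 4 modules gives a loss of `log(4)`, while a one-hot distribution gives a loss of 0. Is that the intended interpretation?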