In the head embedding_head.py we output cls_outputs and pred_class_logits.
In the meta_arch baseline.py the pred_class_logits are used for accuracy logging while cls_outputs are used as input for CE loss.
When using e.g. the provided bagtricks config, any_softmax.Linear is used as the cls_layer. This class seems to be unfinished or just wrong: as long as scale=1 (the default), it is simply the identity. The actual linear layer is instead hardcoded into embedding_head via a weight Tensor. What is this mess?
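To make the claim concrete, here is a minimal sketch (not the library code itself) of the behaviour described: a cls_layer that only rescales incoming logits by a scale factor, so that with the default scale=1 it is a no-op.

```python
import torch
import torch.nn as nn

class LinearSketch(nn.Module):
    """Sketch of a cls_layer that only rescales logits (assumed behaviour
    of any_softmax.Linear as described above, not the actual source)."""

    def __init__(self, scale: float = 1.0):
        super().__init__()
        self.s = scale

    def forward(self, logits: torch.Tensor, targets=None) -> torch.Tensor:
        # No weight matrix here -- the real matrix multiply happens
        # elsewhere (in embedding_head, per the report above).
        return logits * self.s

x = torch.randn(4, 10)
assert torch.equal(LinearSketch()(x), x)  # scale=1 -> identity
```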
Anyway, who am I to judge this spaghetti code... The main problem is that, as a result of this mess, pred_class_logits and cls_outputs are both just the plain logits. That is fine for cls_outputs, since it feeds the CE loss where softmax is applied, but not for pred_class_logits, because the logging function applies no softmax.
A larger reworking of embedding_head seems in order, but as a hot-fix one can simply apply softmax to the pred_class_logits output inside embedding_head. (Have not tested yet, but will update once I have, if I don't forget.)
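The hot-fix could look roughly like the sketch below. Names (features, weight, head_outputs) are illustrative placeholders, not the actual embedding_head code; the point is only that the logging path receives softmaxed values while the CE path keeps raw logits.

```python
import torch
import torch.nn.functional as F

def head_outputs(features: torch.Tensor, weight: torch.Tensor):
    """Hypothetical sketch of the proposed hot-fix, not the real head."""
    # Raw logits, as produced by the hardcoded linear layer; these go to
    # cross-entropy, which applies (log-)softmax internally.
    cls_outputs = F.linear(features, weight)
    # Hot-fix: softmax before handing values to the logging path.
    pred_class_logits = F.softmax(cls_outputs, dim=1)
    return cls_outputs, pred_class_logits
```

Note that top-1 accuracy itself is unchanged by this fix, since softmax is monotonic and preserves the argmax; it only makes the logged values proper probabilities.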
Or is there something I am missing?