https://github.com/microsoft/FIBER/blob/29671fa165b0813cc05b75ff56b1fab71bfc920c/coarse_grained/fiber/modules/fiber_utils.py#L115
Hi,
Where is the update/forward function call for the metric in the above line?
For context: while debugging, I found that this metric (and a few others) is being logged as NaN. There doesn't seem to be any update call for them anywhere in the code base, which would explain the NaN values. As a result, "val/the_metric" is also NaN, which seems to break the top-1 model checkpointing.
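To illustrate what I suspect is happening, here is a minimal pure-Python sketch. It is not the FIBER code and not the torchmetrics API: `RunningMean` and `pick_best_epoch` are hypothetical stand-ins I made up for this report. The assumption is that a metric whose `update` is never called ends up computing 0 / 0, and that the checkpoint selection compares scores with a plain `>`.

```python
import math


class RunningMean:
    """Hypothetical stand-in for a scalar metric that averages logged values."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value):
        self.total += float(value)
        self.count += 1

    def compute(self):
        # A tensor-based metric would evaluate 0 / 0 here, which gives NaN.
        # Plain Python raises ZeroDivisionError instead, so NaN is returned explicitly.
        if self.count == 0:
            return math.nan
        return self.total / self.count


def pick_best_epoch(scores):
    """Return the index of the highest score using a naive '>' comparison."""
    best_idx, best = None, -math.inf
    for idx, score in enumerate(scores):
        if score > best:  # always False when score is NaN
            best_idx, best = idx, score
    return best_idx


never_updated = RunningMean()
updated = RunningMean()
for v in (0.5, 0.7):
    updated.update(v)

nan_value = never_updated.compute()   # NaN: update() was never called
mean_value = updated.compute()        # ordinary mean of the two values

# With NaN in every epoch, no epoch is ever selected as "best".
best_with_nan = pick_best_epoch([nan_value, nan_value, nan_value])
best_with_values = pick_best_epoch([0.2, mean_value, 0.4])
```

If something like this is what happens in the real code, every comparison against the monitored value fails, so the "best" checkpoint is never chosen on the basis of the metric.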