Closed: lianxxx closed this issue 5 years ago
First, thank you for your code.
Your ablation studies include a result for "No TA". How was the "MIL loss" calculated during training in this case? (Is there one MIL loss per branch, a single loss computed on the average of all branches, or something else?)
"No TA" means that the temporal attention is replaced with global average/max pooling (see here).
All other configurations are the same.
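To illustrate the difference, here is a minimal sketch in plain Python. It is not the repository's code: the function names (`attention_pool`, `global_avg_pool`, `global_max_pool`, `mil_loss`), the toy scores, and the use of softmax cross-entropy as the MIL loss are all assumptions made for illustration. It covers a single branch only, since the thread does not state how the loss is combined across branches.

```python
import math

# Hypothetical sketch: per-snippet class scores are a list of T rows,
# each row holding one score per class.

def attention_pool(scores, attention):
    """Attention-weighted temporal average (the setting with temporal attention)."""
    total = sum(attention)
    num_classes = len(scores[0])
    return [sum(a * s[c] for a, s in zip(attention, scores)) / total
            for c in range(num_classes)]

def global_avg_pool(scores):
    """'No TA' variant: plain average over the temporal axis."""
    num_snippets = len(scores)
    return [sum(s[c] for s in scores) / num_snippets
            for c in range(len(scores[0]))]

def global_max_pool(scores):
    """'No TA' variant: maximum over the temporal axis."""
    return [max(s[c] for s in scores) for c in range(len(scores[0]))]

def mil_loss(video_scores, label):
    """Assumed MIL loss: softmax cross-entropy on the video-level scores."""
    m = max(video_scores)
    log_z = m + math.log(sum(math.exp(v - m) for v in video_scores))
    return log_z - video_scores[label]

# Toy input: 3 snippets, 2 classes (made-up numbers).
scores = [[2.0, 0.0], [0.0, 1.0], [4.0, 1.0]]
attention = [0.5, 0.0, 0.5]

with_ta = attention_pool(scores, attention)
no_ta_avg = global_avg_pool(scores)
no_ta_max = global_max_pool(scores)

# The loss is computed the same way in every case; only the pooling differs.
losses = [mil_loss(v, label=0) for v in (with_ta, no_ta_avg, no_ta_max)]
```

In this sketch, uniform attention weights reduce `attention_pool` to `global_avg_pool`, so "No TA" with average pooling can be read as attention that treats every snippet equally.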