Hi, I have tried to implement several MIL models, following an approach similar to your implementation. However, I noticed that performance can vary significantly, by up to 10%, just by changing the training seed (e.g., from seed 0 to seed 10). I suspect this is due to the small training set (Camelyon16) and to the fact that the final test performance is taken from the last epoch (50). Moreover, training instabilities seem to occur, especially with the AB-MIL and DS-MIL models.
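One way to make this cross-seed variance explicit, rather than reporting a single run, is to aggregate the test metric over several seeds and report a mean with a standard deviation. A minimal sketch (the function name and the per-seed AUC values below are hypothetical, for illustration only):

```python
import statistics

def summarize_over_seeds(results: dict) -> str:
    """Report mean +/- std of a test metric across training seeds.

    `results` maps a training seed to the metric from that run.
    """
    vals = list(results.values())
    mean = statistics.mean(vals)
    # stdev needs at least two samples; fall back to 0.0 for a single run
    std = statistics.stdev(vals) if len(vals) > 1 else 0.0
    return f"{mean:.3f} +/- {std:.3f} over {len(vals)} seeds"

# Hypothetical per-seed test AUCs illustrating a ~10% spread:
per_seed_auc = {0: 0.86, 1: 0.91, 2: 0.82, 10: 0.92}
print(summarize_over_seeds(per_seed_auc))
```

Reporting results this way (and, ideally, selecting the checkpoint on a validation split instead of always taking epoch 50) would make it much easier to compare the MIL variants fairly.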