clemsgrs opened 1 year ago
Hi, based on the following lines, it seems gradient accumulation is not properly implemented:
https://github.com/mahmoodlab/HIPT/blob/a9b5bb8d159684fc4c2c497d68950ab915caeb7e/2-Weakly-Supervised-Subtyping/utils/core_utils.py#L285-L290
A proper implementation should look like the following:
```python
loss = loss / gc
loss.backward()
if (batch_idx + 1) % gc == 0:
    optimizer.step()
    optimizer.zero_grad()
```
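For context, here is a minimal, self-contained sketch of how that pattern fits into a full training loop. The names (`gc`, `model`, `loader`, `optimizer`) follow the snippet above but the loop itself is illustrative, not taken from the HIPT code:

```python
import torch
import torch.nn as nn

# toy model / data just to make the sketch runnable
model = nn.Linear(16, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()
loader = [(torch.randn(4, 16), torch.randint(0, 2, (4,))) for _ in range(10)]
gc = 4  # number of mini-batches to accumulate before each optimizer step

optimizer.zero_grad()
for batch_idx, (x, y) in enumerate(loader):
    loss = criterion(model(x), y)
    # scale the loss so the accumulated gradient matches a gc-sized batch
    (loss / gc).backward()
    if (batch_idx + 1) % gc == 0:
        optimizer.step()
        optimizer.zero_grad()

# flush any remaining gradients if the number of batches is not a multiple of gc
if len(loader) % gc != 0:
    optimizer.step()
    optimizer.zero_grad()
```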
Hi! I'm also working on reproducing this HIPT paper. Would you be interested in some discussion?
sure, happy to chat. I’ve made my own version of the code here: https://github.com/clemsgrs/hipt
you can contact me at: clement (dot) grisi (at) radboudumc (dot) nl