marrlab / DomainLab

modular domain generalization: https://pypi.org/project/domainlab/
https://marrlab.github.io/DomainLab/
MIT License

Error in backpack when running a benchmark with fbopt, fishr, and erm. #836

Closed MatteoWohlrapp closed 3 months ago

MatteoWohlrapp commented 4 months ago

When running the pacs_fbopt_fishr_erm.yaml benchmark on the erm_hyper_init branch, I get the following error: [image]

smilesun commented 4 months ago

I just reproduced the same error. My gut feeling is that fbopt calls some fishr subroutines and takes the model, but that model has not been extended by BackPACK.

See https://github.com/marrlab/DomainLab/pull/771

MatteoWohlrapp commented 4 months ago

But I get the same error when I remove fbopt from the task and only use fishr, so I am not sure it is connected to fbopt.

smilesun commented 4 months ago

> But I get the same error when removing fbopt from the task and only use fishr. So I am not sure if it is connected to fbopt

It seems that BackPACK does not work with skip connections; see the error message from https://github.com/marrlab/DomainLab/pull/837/files:

E               NotImplementedError: Encountered BatchNorm module in training mode. BackPACK's computation will pass, but results like individual gradients may not be meaningful, as BatchNorm mixes samples. Only proceed if you know what you are doing.

../anaconda3/lib/python3.9/site-packages/backpack/utils/errors.py:27: NotImplementedError                                      
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PDB post_mortem (IO-capturing turned off) >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> /home/sunxd/anaconda3/lib/python3.9/site-packages/backpack/utils/errors.py(27)batch_norm_raise_error_if_train()              
     25         )                                                                                                              
     26         if raise_error:                                                                                                
---> 27             raise NotImplementedError(message)                                                                         
     28         else:                                                                                                          
     29             warn(message)

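The error message's remark that BatchNorm mixes samples can be illustrated without BackPACK or PyTorch. The following is a minimal pure-Python sketch, not DomainLab or BackPACK code: `batch_norm` is a hypothetical helper that normalizes a 1-D batch by its own mean and variance, roughly as BatchNorm does in training mode (the learned scale and shift are omitted). Changing one sample changes the normalized value of every other sample, which is why a per-sample gradient no longer depends on that sample alone.

```python
import math


def batch_norm(batch, eps=1e-5):
    """Normalize a 1-D batch with its own mean and (biased) variance.

    Hypothetical helper for illustration only; not part of any library.
    """
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [(x - mean) / math.sqrt(var + eps) for x in batch]


batch_a = [1.0, 2.0, 3.0, 4.0]
batch_b = [1.0, 2.0, 3.0, 40.0]  # only the last sample differs

out_a = batch_norm(batch_a)
out_b = batch_norm(batch_b)

# The first sample is identical in both batches, yet its normalized
# value differs because the batch statistics changed.
first_sample_changed = not math.isclose(out_a[0], out_b[0])
print(first_sample_changed)
```

In PyTorch, BatchNorm layers use batch statistics like these in training mode and running statistics in eval mode, so the coupling between samples shown here applies to training mode, which is the case the error message names.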
smilesun commented 4 months ago

@MatteoWohlrapp, maybe compare with AlexNet then? That is, use nname=alexnet instead of npath.

smilesun commented 4 months ago

@MatteoWohlrapp, this one seems to run without error on my account:

https://github.com/marrlab/DomainLab/pull/838