ACEsuit / mace

MACE - Fast and accurate machine learning interatomic potentials with higher order equivariant message passing.

repulsion branch DistributedEnvironment exception produces no output #333

Closed bernstei closed 3 months ago

bernstei commented 7 months ago

I think the logger needs to be set up before the calls that create DistributedEnvironment in cli/run_train.py; otherwise there's no output. Also, since it's an error, it should probably go through logger.error rather than .info, and the script should return a non-zero status when it terminates.

bernstei commented 7 months ago

I see now that the logger depends on rank, which isn't known until after the distributed env is created, and I'm not sure what the simplest way of dealing with that is. Maybe just set the logging level very early and set up the full logger later. But I still think the DistributedEnvironment exception message should be an error, no?
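
A rough sketch of the two-stage idea, using only the standard `logging` module. The names `init_distributed_env`, `setup_rank_logger`, and `DistributedSetupError` are hypothetical placeholders for illustration, not the actual code in cli/run_train.py:

```python
import logging


class DistributedSetupError(RuntimeError):
    """Hypothetical stand-in for a failure while creating the distributed env."""


def init_distributed_env(fail: bool = False) -> int:
    """Hypothetical placeholder: returns this process's rank, or raises."""
    if fail:
        raise DistributedSetupError("could not read distributed environment variables")
    return 0


def setup_rank_logger(rank: int, level: int = logging.INFO) -> None:
    """Stage 2: replace the early handler with a rank-aware one."""
    root = logging.getLogger()
    for handler in list(root.handlers):
        root.removeHandler(handler)
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter(f"%(asctime)s rank={rank} %(levelname)s: %(message)s")
    )
    root.addHandler(handler)
    # Assumption for this sketch: only rank 0 logs at the full level.
    root.setLevel(level if rank == 0 else logging.WARNING)


def main(fail: bool = False) -> int:
    # Stage 1: minimal logging, configured before anything that can fail.
    logging.basicConfig(level=logging.INFO, force=True)
    try:
        rank = init_distributed_env(fail=fail)
    except DistributedSetupError as exc:
        # Report as an error and hand back a non-zero status.
        logging.error("Failed to initialize distributed environment: %s", exc)
        return 1
    setup_rank_logger(rank)
    logging.info("Distributed environment ready")
    return 0

# In a script, the return value would be passed on: sys.exit(main())
```

With something like this, a failure during distributed setup is still printed (via the early `basicConfig`) and the process can exit with a non-zero status, while the rank-dependent formatting is applied only once the rank is known.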