microsoft / aurora

Implementation of the Aurora model for atmospheric forecasting
https://microsoft.github.io/aurora

Multiple GPUs Inference #13

Open BigShuiTai opened 2 months ago

BigShuiTai commented 2 months ago

Hello, is there any way to run inference on 2 or more GPUs?

wesselb commented 2 months ago

Hey @BigShuiTai! Thanks for opening an issue.

I'm guessing that you want to split a single copy of the model across multiple GPUs, possibly so that you can run it on GPUs with less memory.

Unfortunately, this is not supported by the version of Aurora in this repository. Aurora, however, is just a plain PyTorch model, so model parallelism (I believe the kind that you're referring to) would be possible to implement.
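To illustrate what this kind of model parallelism looks like in plain PyTorch, here is a minimal sketch on a toy model. It does not use Aurora's code or API: `TwoStageModel`, `pick_devices`, and the layer sizes are hypothetical names and values chosen for the example, and the snippet falls back to CPU when fewer than two GPUs are available.

```python
import torch
import torch.nn as nn


class TwoStageModel(nn.Module):
    """Toy stand-in for a large model (hypothetical; NOT Aurora's architecture)."""

    def __init__(self, dev0, dev1, dim=16):
        super().__init__()
        self.dev0 = torch.device(dev0)
        self.dev1 = torch.device(dev1)
        # The first half of the layers lives on dev0, the second half on dev1,
        # so each device only holds part of the parameters and activations.
        self.stage0 = nn.Sequential(nn.Linear(dim, dim), nn.GELU()).to(self.dev0)
        self.stage1 = nn.Sequential(nn.Linear(dim, dim), nn.GELU()).to(self.dev1)

    def forward(self, x):
        h = self.stage0(x.to(self.dev0))
        # Copy the intermediate activations to the second device.
        return self.stage1(h.to(self.dev1))


def pick_devices():
    """Use two GPUs when present; otherwise run both stages on the CPU."""
    if torch.cuda.device_count() >= 2:
        return "cuda:0", "cuda:1"
    return "cpu", "cpu"


dev0, dev1 = pick_devices()
model = TwoStageModel(dev0, dev1).eval()

# Inference only: no gradients are tracked.
with torch.inference_mode():
    out = model(torch.randn(4, 16))
```

Splitting a model this way lowers the memory needed on each GPU, but the two stages run one after the other, so it does not by itself make inference faster.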

luiservela commented 2 days ago

Yes, let's make this happen!

firatozdemir commented 1 day ago

@wesselb is this something you are looking into in the near future? Was this not implemented for the original training in the paper (i.e., was Aurora simply trained with data parallelism + activation checkpointing for the Swin3D backbone)?