NVIDIA / modulus

Open-source deep-learning framework for building, training, and fine-tuning deep learning models using state-of-the-art Physics-ML methods
https://developer.nvidia.com/modulus
Apache License 2.0

[CorrDiff] Move the training loop into the modulus package #420

Open nbren12 opened 6 months ago

nbren12 commented 6 months ago

Specifically, I am proposing this:

git mv examples/generative/corrdiff/training/training_loop.py modulus/utils/generative/

This will allow other projects to use the training loop by importing it. We have close duplicates of this file in 3-4 internal research projects, and are encountering the same bugs and adding the same features. A modulus implementation of the training loop would be widely usable across EDM-based diffusion projects.

The training loop is also the most frequently modified part of corrdiff, but it has no unit tests to ensure that we aren't breaking it. I feel that adding some interfaces and extension points could make it more testable and general.
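To illustrate what "interfaces and extension points" could look like, here is a minimal sketch, not the existing corrdiff code. The names `LoopState`, `run_training_loop`, `train_step`, and `on_step_end` are hypothetical and chosen only for this example. The idea is that the loop takes the per-batch step and an optional hook as arguments, so a unit test can drive it with plain Python callables instead of a real model and dataset.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional


@dataclass
class LoopState:
    """Hypothetical container for what the loop tracks between steps."""
    step: int = 0
    losses: List[float] = field(default_factory=list)


def run_training_loop(
    batches: Iterable,
    train_step: Callable[[object], float],
    max_steps: int,
    on_step_end: Optional[Callable[[LoopState], None]] = None,
) -> LoopState:
    """Run train_step on each batch until max_steps steps have completed.

    train_step is injected, so a test can pass a cheap stand-in.
    on_step_end is an extension point (logging, checkpointing, etc.).
    """
    state = LoopState()
    for batch in batches:
        if state.step >= max_steps:
            break
        loss = train_step(batch)
        state.step += 1
        state.losses.append(loss)
        if on_step_end is not None:
            on_step_end(state)
    return state


# Usage: exercise the loop with a fake step function and a recording hook.
seen_steps = []
final_state = run_training_loop(
    batches=[1.0, 2.0, 3.0],
    train_step=lambda batch: batch * 0.5,  # stand-in for a real optimizer step
    max_steps=2,
    on_step_end=lambda s: seen_steps.append(s.step),
)
```

With this shape, a test can assert on the number of steps taken and on which hooks fired without touching a GPU or any data files.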

The current workaround is to add modulus as a submodule or otherwise modify the Python path to point to examples/generative/corrdiff/ in an ad-hoc manner. In practice, most users will probably just copy-paste the file, which leads to code duplication.
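For concreteness, the path-modification workaround looks roughly like the sketch below. The helper name `add_corrdiff_to_path` and the checkout location are hypothetical; only the `examples/generative/corrdiff` directory comes from the repository layout mentioned above.

```python
import sys
from pathlib import Path


def add_corrdiff_to_path(modulus_checkout: str) -> str:
    """Prepend the corrdiff example directory of a modulus checkout to sys.path.

    modulus_checkout is wherever the repository was cloned or added as a
    submodule (an assumption of this sketch).
    """
    example_dir = Path(modulus_checkout) / "examples" / "generative" / "corrdiff"
    path = str(example_dir)
    if path not in sys.path:
        sys.path.insert(0, path)
    return path


# Usage: the checkout path here is a placeholder.
corrdiff_path = add_corrdiff_to_path("third_party/modulus")
```

After this call, downstream code would import the loop from the `training` directory of the example, which ties every consumer to the internal layout of the examples folder. Moving the file into the package would replace this with an ordinary import.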

mnabian commented 6 months ago

@ram-cherukuri could you please review this request?