Training on larger datasets carries a high cost per epoch (e.g., ANI2x trained on the ANI2x dataset currently takes ~8h for a single epoch on an RTX 3090). Before committing to long training runs, we should spend a few days profiling the training pipeline.
Todos
[x] Profiling script for model forward/backward pass (using PyTorch Profiler)
[x] Profiling a training scenario (using PyTorch Lightning Profiler)
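
As a rough illustration of the first item, a forward/backward profiling script built on `torch.profiler` might look like the sketch below. The small `nn.Sequential` model, the random tensors, and the `forward`/`backward` range labels are hypothetical stand-ins chosen for this example; the actual script would wrap the ANI2x model and a real batch.

```python
import torch
from torch import nn
from torch.profiler import ProfilerActivity, profile, record_function

# Hypothetical stand-in model and data; replace with the real model and a real batch.
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 1))
inputs = torch.randn(256, 64)
targets = torch.randn(256, 1)
loss_fn = nn.MSELoss()

# Always record CPU activity; add CUDA only when a GPU is present.
activities = [ProfilerActivity.CPU]
if torch.cuda.is_available():
    model, inputs, targets = model.cuda(), inputs.cuda(), targets.cuda()
    activities.append(ProfilerActivity.CUDA)

with profile(activities=activities, record_shapes=True) as prof:
    for _ in range(5):
        # Label the two phases so they appear as separate rows in the report.
        with record_function("forward"):
            loss = loss_fn(model(inputs), targets)
        with record_function("backward"):
            model.zero_grad()
            loss.backward()

# Aggregate events by name and show the most expensive ones.
report = prof.key_averages().table(sort_by="cpu_time_total", row_limit=15)
print(report)
```

For the training scenario, PyTorch Lightning exposes built-in profilers through the `Trainer`, for example `Trainer(profiler="simple")` or `Trainer(profiler="pytorch")`; which one fits best here is left open.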
Status