In the current implementation, the data import runs when the PyTorch Lightning DataModule is initialised.
This couples loading to construction: the code is harder to follow, and a unit test cannot build the DataModule without also triggering the import.
Splitting the data import out of the DataModule will make the code clearer and easier to adapt, and will allow unit tests or even data quality checks to run before training begins.
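
As a rough illustration of the proposed split, here is a minimal sketch using only the standard library. All names in it (`Record`, `import_records`, `check_quality`, `RecordsDataModule`) and the CSV layout are hypothetical and not taken from the existing code. `RecordsDataModule` is a plain class standing in for what would presumably be a `LightningDataModule` subclass in the real project. The point is only that the module receives data that has already been imported, so constructing it does no I/O.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    # Hypothetical row type; the real project's schema will differ.
    feature: float
    label: int


def import_records(handle):
    """Data import step: parse CSV rows into Records, independent of Lightning."""
    reader = csv.DictReader(handle)
    return [Record(float(row["feature"]), int(row["label"])) for row in reader]


def check_quality(records):
    """Example data quality check that can run before training begins."""
    if not records:
        raise ValueError("no records were imported")
    bad = [r for r in records if r.label not in (0, 1)]
    if bad:
        raise ValueError(f"{len(bad)} record(s) have a label outside {{0, 1}}")


class RecordsDataModule:
    """Stand-in for a LightningDataModule subclass (assumption).

    It is handed already-imported records, so __init__ performs no I/O.
    """

    def __init__(self, records, val_fraction=0.25):
        self.records = list(records)
        self.val_fraction = val_fraction
        self.train_records = []
        self.val_records = []

    def setup(self, stage=None):
        # Deterministic split: the last val_fraction of records go to validation.
        n_val = int(len(self.records) * self.val_fraction)
        split = len(self.records) - n_val
        self.train_records = self.records[:split]
        self.val_records = self.records[split:]


# In a unit test, the "file" can simply be an in-memory string.
sample = io.StringIO("feature,label\n0.1,0\n0.7,1\n0.4,0\n0.9,1\n")
records = import_records(sample)
check_quality(records)
datamodule = RecordsDataModule(records)
datamodule.setup()
```

With this shape, the import function and the quality check can each be tested on small in-memory inputs, and the DataModule can be tested with hand-built records, without touching the real data source.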