Closed CorentinSeznec closed 1 month ago
Good job! I just wonder whether we should have one big datamodule that manages all datasets, or one datamodule per dataset. I think that if we want to use the Lightning CLI, that points to the second option. @A669015, what do you think?
Yes, the philosophy behind a DataModule is that it is associated with a Dataset. I guess a "big" generic DataModule could be used if the datasets are transparently interchangeable (which, in the end, is the philosophy of py4cast). Do some datasets require dataset-specific parameters, which would make using the Lightning CLI with a generic datamodule less convenient?
Since the dataset name is a str argument of the datamodule, I think that even as it stands the CLI would introspect the signature and pass the argument from the command line to the datamodule, so dataset selection would work:
data:
  class_path: py4cast.lightning.plDataModule
  init_args:
    dataset: titan
    ...
But it is probably more flexible to have one DataModule subclass per dataset.
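To make the trade-off concrete, here is a minimal sketch of how a signature-introspecting CLI would see the two designs. It uses only the standard library `inspect` module as a stand-in and is not how the Lightning CLI is actually implemented. The class names `GenericDataModule` and `TitanDataModule`, and the parameters `batch_size` and `grid_subsampling`, are hypothetical and do not come from py4cast.

```python
import inspect


class GenericDataModule:
    """Hypothetical stand-in for one datamodule that covers every dataset."""

    def __init__(self, dataset: str, batch_size: int = 4):
        # The dataset is selected by name, as in the YAML config above.
        self.dataset = dataset
        self.batch_size = batch_size


class TitanDataModule(GenericDataModule):
    """Hypothetical per-dataset subclass.

    `grid_subsampling` is an invented example of a dataset-specific option.
    """

    def __init__(self, batch_size: int = 4, grid_subsampling: int = 1):
        super().__init__(dataset="titan", batch_size=batch_size)
        self.grid_subsampling = grid_subsampling


def cli_visible_args(cls) -> list[str]:
    """Return the __init__ parameter names an introspecting CLI could expose."""
    params = inspect.signature(cls.__init__).parameters
    return [name for name in params if name != "self"]


print(cli_visible_args(GenericDataModule))  # ['dataset', 'batch_size']
print(cli_visible_args(TitanDataModule))    # ['batch_size', 'grid_subsampling']
```

With the generic class, every dataset has to share the same `init_args`. With a subclass, dataset-specific options show up as explicit, typed arguments, which is where the extra flexibility would come from.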