This is fine for the "general" cases; however, for the dataset-specific modules it looks cumbersome:
...
dm = WMT16TranslationDataModule(  # I have to choose the WMT16 class
    cfg=TranslationDataConfig(
        dataset_name="wmt16",  # I have to pass the name of the dataset as well? Why is this not the default?
        ...
    ),
    tokenizer=tokenizer,
)
One solution would be to introduce a dataset-specific config class like the one below; however, this adds even more lines.
...
@dataclass
class WMT16TranslationDataConfig:
    dataset_name: str = "wmt16"
I suggest we remove configs entirely and rely solely on the modules.
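A minimal sketch of what relying on the modules alone could look like. The constructor signatures, the `max_length` parameter, and the defaults here are hypothetical illustrations, not the library's actual API. The only point is that the dataset-specific subclass fixes `dataset_name` itself, so the caller no longer has to pass it.

```python
from typing import Any, Optional


class TranslationDataModule:
    """Hypothetical general module: settings are plain keyword arguments, not a cfg object."""

    def __init__(
        self,
        tokenizer: Any,
        dataset_name: Optional[str] = None,
        max_length: int = 128,  # hypothetical parameter, for illustration only
    ) -> None:
        self.tokenizer = tokenizer
        self.dataset_name = dataset_name
        self.max_length = max_length


class WMT16TranslationDataModule(TranslationDataModule):
    """Hypothetical dataset-specific module: the dataset name is a default of the class."""

    def __init__(self, tokenizer: Any, dataset_name: str = "wmt16", **kwargs: Any) -> None:
        super().__init__(tokenizer=tokenizer, dataset_name=dataset_name, **kwargs)


tokenizer = object()  # placeholder standing in for a real tokenizer
dm = WMT16TranslationDataModule(tokenizer=tokenizer)
print(dm.dataset_name)
```

With this shape, choosing the WMT16 class is enough; other settings can still be overridden as ordinary keyword arguments.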
🚀 Feature
Currently, creating a module requires building a config object and passing it to the module class.
From my understanding, this is supported by Hydra, but all `cfg` objects will be converted into parameters.
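To illustrate what converting `cfg` objects into parameters means, here is a rough sketch of target-based instantiation. `instantiate_sketch` is a simplified stand-in written for this example, not Hydra's implementation; Hydra's own entry point is `hydra.utils.instantiate`, which reads a `_target_` key and passes the remaining keys as keyword arguments. The module class and its parameters are hypothetical.

```python
import importlib
from typing import Any, Dict


def instantiate_sketch(conf: Dict[str, Any], **overrides: Any) -> Any:
    """Simplified stand-in for target-based instantiation (not Hydra's code)."""
    kwargs = {**conf, **overrides}  # call-site keyword arguments win over config values
    target = kwargs.pop("_target_")
    if isinstance(target, str):
        # A dotted import path, as it would typically appear in a YAML config.
        module_path, _, attr = target.rpartition(".")
        target = getattr(importlib.import_module(module_path), attr)
    return target(**kwargs)


class WMT16TranslationDataModule:
    """Hypothetical module whose former config fields are now constructor parameters."""

    def __init__(self, tokenizer: Any, dataset_name: str = "wmt16", max_length: int = 128) -> None:
        self.tokenizer = tokenizer
        self.dataset_name = dataset_name
        self.max_length = max_length


# The class is passed directly to keep the sketch in one file; a config file
# would normally name it by its dotted import path instead.
conf = {"_target_": WMT16TranslationDataModule, "max_length": 256}
dm = instantiate_sketch(conf, tokenizer=object())
print(dm.dataset_name, dm.max_length)
```

Every key other than `_target_` ends up as a constructor argument, which is why each former config field would need a matching module parameter.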