Closed · genghisun closed this 1 year ago
A while ago the template had only one task with `run.py` and `config.yaml`, and there were multiple issues raised about how to split it into several tasks:

- https://github.com/ashleve/lightning-hydra-template/issues/151
- https://github.com/ashleve/lightning-hydra-template/issues/200
- https://github.com/ashleve/lightning-hydra-template/issues/103
I don't think unifying everything into a single task is the right approach, since you often need something more sophisticated than a plain train/eval pipeline.
The current setup is just an example showcasing how to manage multiple tasks and configs. You can easily delete the evaluation task if you don't need it (`train_task.py` already includes optional evaluation).
I don't like dumping every task into a single config either - that makes configs hard to read and manage, since each task might need a different config structure or different types of modules.
As to having a single start file for all tasks, e.g. `main.py` - we could probably do something like this:

```bash
python main.py task=train
```
Then you would need extra nesting for each command-line override:

```bash
python main.py task=train task.datamodule.batch_size=64 task.model.optimizer.lr=0.01
```

Not very convenient in my opinion - you have to specify `task=...` and write longer overrides every time you want to launch something.
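To make the extra-nesting point concrete, here is a toy sketch with plain dicts (no Hydra involved; `apply_override` is just an illustrative stand-in for Hydra's dotted-override parsing): once the whole train config lives under a `task` group, every override grows a `task.` prefix.

```python
# Illustrative sketch (plain dicts, no Hydra) of why a `task` config group
# makes every command-line override one level longer.
def apply_override(cfg: dict, override: str) -> None:
    """Apply a Hydra-style dotted override like 'a.b.c=64' in place."""
    path, _, value = override.partition("=")
    *parents, leaf = path.split(".")
    node = cfg
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value

# Today: configs live at the top level, so overrides stay short.
flat = {"datamodule": {"batch_size": 32}}
apply_override(flat, "datamodule.batch_size=64")

# With a single main.py and a `task` config group, the same override
# needs a `task.` prefix to reach the same key.
nested = {"task": {"datamodule": {"batch_size": 32}}}
apply_override(nested, "task.datamodule.batch_size=64")
```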
These would be my arguments for doing it the current way, feel free to share your thoughts.
Thanks for such a detailed answer!

I strongly agree that we should not unify all the tasks: there should be separate tasks and corresponding configs for complicated tasks like the ones you listed.

But I still think there is a lot of redundancy in [entry files, task files and config files] for routine tasks such as simple training and evaluation. For example, all the different entry files do is call the corresponding task, and all the different tasks (train/eval) do is instantiate the datamodule, model, logger, trainer, etc. from very similar config files, and then call the corresponding trainer method.

So I suggest unifying the entry files for these regular training and evaluation tasks into a single `main.py` or `trainer.py`.
As for the extra-nesting problem, I think it can be solved with subcommands, like this:

```bash
python main.py fit datamodule.batch_size=64 model.optimizer.lr=0.01
python main.py test datamodule.batch_size=64 model.optimizer.lr=0.01
python main.py build_dataset path=xxx
```
Thus `fit` (train) and `test` can share the same task and config, without extra longer overrides. As another specific scenario, `build_dataset` can use different configs and arguments than the regular training and evaluation tasks.
In terms of implementation: when the subcommand is one of [fit/validate/test/predict/...], `trainer_task.py` is called; otherwise, the specific task is called by the name of the subcommand, like `build_dataset_task.py`.
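The dispatch described above could be sketched roughly like this - the subcommand is stripped from the command line before Hydra sees the remaining overrides. This is not a Hydra feature; the task names and the `mode=` override are illustrative, following the names used in this thread:

```python
# Hedged sketch of the proposed subcommand dispatch. Strip argv[1] before
# Hydra parses the rest as overrides; pick a task by subcommand name.
LIGHTNING_MODES = {"fit", "validate", "test", "predict"}

def dispatch(argv):
    """Map ['<subcommand>', overrides...] to (task_name, overrides)."""
    if not argv:
        raise SystemExit("usage: main.py <subcommand> [overrides...]")
    subcommand, *overrides = argv
    if subcommand in LIGHTNING_MODES:
        # all Lightning running modes share one task/config; the mode is
        # passed through as an ordinary override
        return "trainer_task", [f"mode={subcommand}", *overrides]
    # any other subcommand selects a task module by name
    return f"{subcommand}_task", overrides
```

The entry file would then pop the subcommand from `sys.argv` and hand only the remaining overrides to the `@hydra.main`-decorated function.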
@genghisun Could you elaborate on how such a subcommand could be implemented? I'm not 100% sure, but I don't think hydra has such a feature, and it's not possible to use normal argparse alongside hydra? Is there a workaround you have in mind?
I agree the entry file and task file could be merged into one. The reasons they're not are:

1. `setup_root()` being executed every time the entry file is imported in tests. We might get away with it, but it seems safer to avoid that.
2. The `@hydra.main` decorator: all imports from `train_task.py` would need to be nested as well, and there's also a `@utils.task_wrapper` decorator which needs to be imported outside of a function.

Separating hydra logic (entry file) from task logic seemed cleaner to me, but that's mostly just my opinion and I'm open to contributions on this part.
AFAIK, in PyTorch Lightning there are 4 running modes, [train/val/test/predict] (or maybe 5, including tune).

But currently in lightning-hydra-template, using these different running modes requires specifying different entry files [train.py/val.py/test.py(eval.py)/predict.py], different tasks (pipelines) [train_task.py/eval_task.py/...], and different config files [train.yaml/eval.yaml/...], which is very redundant. (By default, only training and evaluation are implemented; users need to implement the other modes themselves.)
I wonder whether these running modes could be integrated into a unified `run.py`/`main.py` entry file, or even a unified task (pipeline) and config.
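A unified task could boil down to a small dispatch on a mode string, roughly like the sketch below. `fit`/`validate`/`test`/`predict` are real `Trainer` method names in PyTorch Lightning, but the helper itself is hypothetical:

```python
# Hypothetical helper for a unified task file: dispatch a `mode` string
# from the config to the matching Lightning Trainer method.
def run_mode(trainer, model, datamodule, mode: str):
    methods = {
        "fit": trainer.fit,
        "validate": trainer.validate,
        "test": trainer.test,
        "predict": trainer.predict,
    }
    if mode not in methods:
        raise ValueError(f"unknown running mode: {mode!r}")
    return methods[mode](model=model, datamodule=datamodule)
```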
Please feel free to share ideas and how this could be implemented. If possible, I would be happy to contribute.