replicate / keepsake

Version control for machine learning
https://keepsake.ai
Apache License 2.0

Integration with Hydra #550

Open felixkreuk opened 3 years ago

felixkreuk commented 3 years ago

Hi,

I'm trying to use Keepsake along with Hydra (www.hydra.cc). Hydra changes the working directory of each experiment to a new path (e.g., /runs/exp_lr=0.1_dropout=0.2). So when I call keepsake.init, I get an error saying there's no keepsake.yaml file in the working directory (because Hydra has changed it). According to the docs, it's not possible to call keepsake.init with an absolute path. Has anyone gotten Keepsake to play along with Hydra?

Thanks

Edit: OK, I was able to use the Project class to specify the paths manually, and then use the CLI with the -D flag to specify the path. I think that allowing an absolute path in keepsake.init would be much easier. Am I missing something here?
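For anyone hitting the same thing, here is a rough sketch of one way to recover the project directory from inside a Hydra run, so it can be passed on manually. It only uses the standard library and walks upward from a starting directory until it finds keepsake.yaml. The helper name find_project_root is made up for illustration and is not part of Keepsake or Hydra, and it assumes the run directory sits somewhere below the project root, which may not hold if Hydra is configured to write runs elsewhere.

```python
from pathlib import Path


def find_project_root(start, marker="keepsake.yaml"):
    """Return the nearest directory at or above `start` that contains `marker`.

    Hypothetical helper, not part of the Keepsake or Hydra APIs.
    """
    current = Path(start).resolve()
    # Check the starting directory first, then each parent up to the filesystem root.
    for candidate in (current, *current.parents):
        if (candidate / marker).is_file():
            return candidate
    raise FileNotFoundError(
        f"no {marker} found in {current} or any of its parent directories"
    )
```

Inside a Hydra task function, find_project_root(Path.cwd()) should then give the directory to hand to the Project class or to the CLI's -D flag. If the run directory is not under the project root, Hydra's hydra.utils.get_original_cwd() returns the directory the script was launched from and could be used as the starting point instead.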

lkhphuc commented 3 years ago

In the latest version, I was able to run Keepsake with Hydra by simply adding repository: file://.keepsake to the keepsake.yaml in the root directory, not the one created for each run by Hydra.

However, I was wondering about the best practice for integrating these two. Hydra manages different runs by creating a directory for each run, and by default all the output (logs, checkpoints, hparams) is dumped into that folder. Keepsake does a similar thing, by compressing the entire root directory and storing it in repository: .... As far as I understand, in Keepsake you're supposed to overwrite the run's output at every experiment and every checkpoint, so when it's zipped into .keepsake there will be no duplication of artifacts from other runs. This seems to be in contradiction with the Hydra-managed output directory, right?

zeke commented 2 years ago

Hi @felixkreuk and @lkhphuc

The Keepsake project is no longer actively maintained. If you're interested in helping maintain it, please let us know.