
[Feature] Wandb sweeper for hydra #1856

Open ashleve opened 3 years ago

ashleve commented 3 years ago

Is your feature request related to a problem? Please describe. Currently wandb sweeps can be used alongside Hydra, but it's not very convenient. Most of the time I would like to override only certain parameters in the Hydra config by downloading them from the wandb sweep server. This is possible, but requires me to manage the overriding logic myself.

Describe the solution you'd like It would be great if wandb provided a custom sweeper plugin for Hydra, similar to the one available for Optuna: https://hydra.cc/docs/next/plugins/optuna_sweeper This way, doing sweeps alongside Hydra could be as easy as:

# initialize sweep and automatically override some of the config parameters
python train.py --multirun hparams_search=wandb_sweep.yaml 
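
Analogous to the Optuna sweeper config, a hypothetical hparams_search/wandb_sweep.yaml could look something like this (a pure mock-up; the plugin and its option names don't exist yet):

defaults:
  - override /hydra/sweeper: wandb

hydra:
  sweeper:
    project: my_project
    method: bayes
    metric:
      name: val/loss
      goal: minimize
    params:
      model.lr:
        min: 0.001
        max: 0.1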

Additional context I think supporting sweeps in hydra has been mentioned here recently: https://github.com/wandb/client/issues/1233

ariG23498 commented 3 years ago

Hey @hobogalaxy, thanks for the feature request! We will look into this.

MohammedAljahdali commented 3 years ago

Hey @hobogalaxy, I am currently trying to use Hydra sweeps with wandb, but I am facing an issue: when a new run in the sweep starts, I change the id of the wandb run (using hydra:job.num) and a new wandb object gets created with this new id, yet the version attribute of wandb stays set to the same id as the first run in the sweep and does not change. This makes it impossible to track separate runs of the sweep in wandb. I'm wondering if you have some ideas on this problem.

ashleve commented 3 years ago

Hi @MohammedAljahdali, I don't really see a reason for setting the id of the wandb run to the Hydra job number. Why not just stay with random ids? If you want to keep things organized, I usually just come up with a different name for each Hydra sweep and set it as the wandb "group" parameter (so it's easier to filter in the UI). Something like:

python train.py +experiment=exp_name --multirun model.lr=0.01,0.005 logger.wandb.group="mnist_conv_net_sweep1"

MohammedAljahdali commented 3 years ago

I found a solution to my issue, which is calling wandb.finish. As for naming the run, I actually use a meaningful name, but also concatenate the job number at the end to separate the sweep runs from each other. Also, using the group name is a great idea. Thank you a lot 🙏🏻
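
For anyone hitting the same problem, a minimal sketch of that fix, assuming a standard Hydra entry point (the config path and wandb keys are illustrative):

import hydra
import wandb
from omegaconf import DictConfig

@hydra.main(config_path="configs", config_name="config")
def train(cfg: DictConfig) -> None:
    # Each multirun job starts its own wandb run instead of reusing
    # the global run left over from the previous job in the sweep.
    wandb.init(project=cfg.wandb.project, group=cfg.wandb.group)
    # ... training loop ...
    # Finishing the run resets wandb's global state, so the next Hydra
    # job gets a fresh run id instead of inheriting the first one.
    wandb.finish()

if __name__ == "__main__":
    train()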

github-actions[bot] commented 3 years ago

This issue is stale because it has been open 60 days with no activity.

JayThibs commented 3 years ago

Commenting to express interest in this feature as well. :)

omerferhatt commented 2 years ago

Actually, this would be a pretty good advantage over other libraries. I'm looking for it too.

WillDudley commented 2 years ago

Seeing as others are commenting, I may as well do the same. This would be a useful feature.

ErikEkstedt commented 2 years ago

Yes, I'll follow the other comments and express interest as well.

Right now when using sweeps:

  1. I configure my sweep YAML flags like "encoder.dim" (fields/sub-fields in my OmegaConf config.yaml).
  2. I then "manually" add arguments to the ArgumentParser in my model from a default OmegaConf config.yaml, so that the ArgumentParser recognizes my custom flags from the sweep and automatically adds new fields whenever I add them to the config file.
  3. When the sweep then calls "python train.py --flags_to_sweep --my_custom_layer.custom_feature", I load the default OmegaConf config, update any values according to the ArgumentParser args, and use the updated config object (which I must treat as a dict because my argparse arguments now contain ".") to build the model and train. A sketch of this pattern follows the list.
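
A minimal sketch of the pattern above, assuming OmegaConf 2.x; the flat_keys helper and the "config.yaml" / "encoder.dim" names are illustrative:

import argparse
from omegaconf import OmegaConf

def flat_keys(cfg, prefix=""):
    # Yield dotted paths ("encoder.dim") for every leaf in the config.
    for key, value in cfg.items():
        path = f"{prefix}{key}"
        if OmegaConf.is_config(value) and not OmegaConf.is_list(value):
            yield from flat_keys(value, prefix=f"{path}.")
        else:
            yield path

# 1. Load the default config that also defines the model.
cfg = OmegaConf.load("config.yaml")

# 2. Register one argparse flag per dotted config path, so the sweep
#    agent's "--encoder.dim 128" style flags are always recognized.
parser = argparse.ArgumentParser()
for path in flat_keys(cfg):
    parser.add_argument(f"--{path}", default=None)
args, _ = parser.parse_known_args()

# 3. Merge only the values the sweep actually set back into the config;
#    vars() is needed because the argument names contain ".".
overrides = [f"{k}={v}" for k, v in vars(args).items() if v is not None]
cfg.merge_with(OmegaConf.from_dotlist(overrides))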

This works for any particular model, but gets more complicated when I need to compose models using different default config.yaml files... It may be a bad solution, but it works for sweeps and lets me use an OmegaConf YAML file to define my models (omitting composability).

Could there be a way to tell the sweep to use "+" (as done with Hydra/OmegaConf) instead of the normal "--" argparse flags? That could perhaps solve the issue as I currently experience it... but I guess it could break everything as well... anywhoo

Thanks for a great product.

ashleve commented 2 years ago

@ErikEkstedt This feature actually exists; it's hidden in the documentation here

Add this to sweep config:

command:
  - ${env}
  - ${interpreter}
  - ${program}
  - ${args_no_hyphens}
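
The ${args_no_hyphens} macro makes the sweep agent pass parameters as plain key=value pairs instead of --key=value flags, which Hydra accepts directly as overrides. With an illustrative parameter name, the agent would then launch something like:

python train.py encoder.dim=128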

Although I still believe it would be more convenient if wandb implemented sweeps as a hydra plugin.

mariomeissner commented 2 years ago

I'd also love to see this happen. An easy way to launch wandb sweeps from a Hydra project would be fantastic.

I'm especially missing the early-stopping feature that wandb sweeps have; the sweepers currently supported in Hydra don't allow that yet (e.g. pruning in Optuna). Wandb sweeps can also use pre-existing knowledge from previous runs to optimize the next sweep.
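
For reference, wandb's early stopping is configured directly in the sweep YAML via early_terminate; a minimal example (values illustrative):

early_terminate:
  type: hyperband
  min_iter: 3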

captain-pool commented 2 years ago

For posterity, here are a few resources I prepared for users trying to build Hydra + wandb projects.

  1. Report describing basics, known pitfalls, and solutions: https://wandb.ai/adrishd/hydra-example/reports/Configuring-W-B-Projects-with-Hydra--VmlldzoxNTA2MzQw

  2. Minimal example of using W&B with Hydra: https://github.com/wandb/examples/tree/master/examples/minimal-hydra-example

A Hydra plugin integrating the W&B sweeper with Hydra's sweeper interface is also in the works. 😄 https://github.com/captain-pool/hydra-wandb-sweeper

EDIT: @scottire also added a page in the documentation: https://docs.wandb.ai/guides/integrations/other/hydra