wandb / examples

Example deep learning projects that use wandb's features.
http://wandb.ai

Cross validation #67

Closed ariG23498 closed 2 years ago

ariG23498 commented 3 years ago

The example on cross-validation is a little tricky to understand. These were my key takeaways:

Let me know if there is something that I missed. My issue is that the individual runs for each process (k-fold) get overwritten in the UI. I am not sure why this happens; it might be related to the process .join() or wandb.join() call. I have also tried wandb.finish().
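For reference, a minimal sketch of the per-fold pattern I would expect to keep the runs distinct, assuming wandb is installed and configured (the project, group, and helper names here are placeholders, not the example's actual code):

```python
def run_name(group_id, fold):
    # Distinct names keep the UI from treating every fold as the same run
    return f"{group_id}-fold-{fold}"

def run_cross_validation(n_folds=3, project="cv-example", group_id="cv-demo"):
    """One wandb run per fold; finish() closes each run before the next init()."""
    import wandb  # imported lazily so the sketch can be read without wandb installed

    for fold in range(n_folds):
        run = wandb.init(project=project, group=group_id,
                         name=run_name(group_id, fold), reinit=True)
        # ... load this fold's data, train, call wandb.log(...) here ...
        run.finish()
```

The key points are a unique name per fold and an explicit finish() before the next init(); the shared group lets the UI cluster the fold runs together.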

phil-fradkin commented 3 years ago

I'm having the same issue with runs being overwritten in the UI. I would really appreciate it if the wandb team could comment on whether this behavior is unsupported or I am doing something wrong. Many thanks!

vanpelt commented 3 years ago

Hey @phil-fradkin you'll need to provide the code you're running that's causing the unexpected behavior for us to advise.

phil-fradkin commented 3 years ago

Right, so in general I would like to set up a sweep where every one of my runs is actually a cross-validation. Instead of getting a single metric from a set of hyperparameters, the sweep would get either k metrics (for k-fold) or their average. At a high level my code looks like this:

import os

import torch
import wandb

args = get_args()
group_id = timestamp()
for i in range(3):
    wandb_run = wandb.init(project=project, name=name, job_type=group_id, reinit=True)
    data = load_data(i)
    model = load_model(args)
    # train_model calls wandb.log internally
    train_model(model, data)
    summary_dict = evaluation(model, data)
    wandb.log(summary_dict)
    checkpoint_fp = os.path.join(wandb.run.dir, "checkpoint.pkl")
    torch.save(model.state_dict(), checkpoint_fp)
    wandb_run.finish()

When I run this script outside the sweep the UI summarizes information across all 3 runs:

[screenshot]

The sidebar also visualizes the 3 individual runs:

[screenshot]

However, when I try to do the same thing in a sweep, only a single run is created for every job_type:

[screenshot]

and there is a single loss curve in the UI.

I've tried replacing job_type with group but the result is the same. Ideally, the sweep would optimize the average of the metric taken across the models from the different folds; alternatively, it could treat the cross-validated models individually.
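One way to hand the sweep a single number to optimize, sketched under the assumption that each fold yields one validation metric (the metric name and values below are hypothetical):

```python
from statistics import mean

def average_fold_metric(fold_metrics):
    # Collapse the per-fold scores into one summary value for the sweep
    return mean(fold_metrics)

fold_losses = [0.42, 0.38, 0.45]  # hypothetical per-fold validation losses
avg_loss = average_fold_metric(fold_losses)
# Logged once after the k-fold loop finishes, e.g.:
# wandb.log({"avg_val_loss": avg_loss})  # point the sweep's metric at this key
```

The sweep config's `metric` would then target the averaged key rather than a per-fold value.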

Please let me know if I should clarify anything further, or whether this use case isn't supported.

phil-fradkin commented 3 years ago

Hi @vanpelt does what I wrote make sense or do you need me to provide a working code example?

annawoodard commented 2 years ago

Hi @vanpelt!

I've tried running the example without modification with wandb version 0.12.6. I was expecting something like what is linked in the readme here, but this is what I see: there is only one entry, corresponding to a null group -- presumably the average over all runs. Do you know how I could tweak the example to get a point in this plot for each job type, corresponding to the average over all runs with that job type (i.e., hyperparameter combo)?
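For what it's worth, a client-side workaround sketch: pull the finished runs' metrics (e.g. via `wandb.Api().runs(...)`) and average them per job type yourself. The data-shaping part, with hypothetical job types and values:

```python
from collections import defaultdict
from statistics import mean

def average_by_job_type(rows):
    """rows: (job_type, metric) pairs, e.g. collected from wandb.Api().runs(...)."""
    buckets = defaultdict(list)
    for job_type, metric in rows:
        buckets[job_type].append(metric)
    # One averaged value per job type, i.e. per hyperparameter combo
    return {job_type: mean(values) for job_type, values in buckets.items()}

# hypothetical (job_type, val_loss) pairs from two hyperparameter combos
rows = [("combo-a", 0.40), ("combo-a", 0.44), ("combo-b", 0.38), ("combo-b", 0.36)]
print(average_by_job_type(rows))  # one point per job type
```

This sidesteps the UI grouping entirely, at the cost of plotting the aggregated points yourself.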

melifluos commented 2 years ago

Hi @vanpelt I'm also having some issues with the example. I get the same overwriting behaviour reported above when I run without using multiprocessing. When I add multiprocessing, everything looks like the example on CPU, but when I run on multiple GPUs using

#!/bin/bash

for i in {0..7}
do
    CUDA_VISIBLE_DEVICES=$(($i % 8)) wandb agent bchamberlain/research-repo-sheaf_exp/$1 &
done

everything hangs and only 1 GPU is utilised. Thanks in advance for any help on this.

tcapelle commented 2 years ago

Do you still need help with this? I am closing this issue; feel free to re-open.