wandb / wandb

The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.
https://wandb.ai
MIT License

runs.histories returns an empty DataFrame #8280

Open MoH-assan opened 1 week ago

MoH-assan commented 1 week ago

Regarding this method https://github.com/wandb/wandb/blob/b944790d960dfc3b34ff939fb7992f0a0546dad1/wandb/apis/public/runs.py#L171

Calling runs.histories(x_axis="epoch", keys=["test_prec1_crosponding_to_best_val"], format="pandas") or key2df = runs.histories(x_axis="epoch", keys=["test_top1"], format="pandas") behaves as expected,

but calling key12df = runs.histories(x_axis="epoch", keys=["test_prec1_crosponding_to_best_val", "test_top1"], format="pandas") with both keys returns an empty DataFrame.

I suspect this is because the two keys are logged at different steps.

It would be nice if key12df = runs.histories(x_axis="epoch", keys=["test_prec1_crosponding_to_best_val", "test_top1"], format="pandas")

behaved like

import pandas as pd

# Fetch each key separately, then join the two frames on the shared x-axis.
key1df = runs.histories(x_axis="epoch", keys=["test_prec1_crosponding_to_best_val"], format="pandas")
key2df = runs.histories(x_axis="epoch", keys=["test_top1"], format="pandas")
joined_df = pd.merge(key1df, key2df, on="epoch", how="inner")
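
For anyone who needs this today, the merge above can be generalized to any number of keys. The sketch below is only an illustration on synthetic data: the helper name merge_histories is hypothetical, and the run_id column is an assumption about what the per-key frames contain when several runs are fetched at once. If your frames don't have that column, pass on=["epoch"] instead.

```python
from functools import reduce

import pandas as pd


def merge_histories(frames, on=("run_id", "epoch"), how="inner"):
    """Join per-key history frames on the shared columns (hypothetical helper)."""
    on = list(on)
    return reduce(lambda left, right: pd.merge(left, right, on=on, how=how), frames)


# Synthetic stand-ins for the frames returned by one-key-at-a-time calls.
key1df = pd.DataFrame(
    {"run_id": ["a", "a", "b"], "epoch": [0, 1, 0], "key1": [0.1, 0.2, 0.3]}
)
key2df = pd.DataFrame(
    {"run_id": ["a", "a", "b"], "epoch": [0, 1, 0], "key2": [0.5, 0.6, 0.7]}
)

joined_df = merge_histories([key1df, key2df])
```

Including run_id in the join keys keeps epoch 0 of one run from being paired with epoch 0 of another, which a join on "epoch" alone would do.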

On a different note, it would also be nice to have methods,

ArtsiomWB commented 1 week ago

Hi @MoH-assan! Thank you for writing in.

To confirm, you would like to be able to pass multiple keys in the keys argument when calling the runs.histories API method, right?

MoH-assan commented 1 week ago

@ArtsiomWB
Yes. More specifically, the problem occurs when the two keys don't share the same step. For example:

 wandb.log({'key1': key1, 'epoch': epoch})
 wandb.log({'key2': key2, 'epoch': epoch})  # Note that W&B will give 'key2' a different step from 'key1'

I want to be able to run the call below and not get an empty DataFrame:

runs.histories(x_axis="epoch", keys=["key1", "key2"], format="pandas")
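
To make the suspected cause concrete, here is a toy model in plain Python. It is not W&B's implementation; it only assumes that each wandb.log() call gets its own step and that a multi-key query keeps only rows where every requested key is present. All function names here are hypothetical.

```python
from collections import defaultdict


def simulate_log_calls(calls):
    """Give every logged payload its own step, one step per call (toy model)."""
    return [{"_step": step, **payload} for step, payload in enumerate(calls)]


def rows_with_all_keys(rows, keys):
    """Keep only rows in which every requested key is present."""
    return [row for row in rows if all(key in row for key in keys)]


def group_by_axis(rows, x_axis, keys):
    """Combine rows that share an x-axis value, ignoring their steps."""
    merged = defaultdict(dict)
    for row in rows:
        if x_axis not in row:
            continue
        for key in keys:
            if key in row:
                merged[row[x_axis]][key] = row[key]
    return [
        {x_axis: x, **values}
        for x, values in sorted(merged.items())
        if all(key in values for key in keys)
    ]


calls = [
    {"key1": 0.90, "epoch": 0},
    {"key2": 0.80, "epoch": 0},
    {"key1": 0.95, "epoch": 1},
    {"key2": 0.85, "epoch": 1},
]
rows = simulate_log_calls(calls)

strict = rows_with_all_keys(rows, ["key1", "key2"])      # empty: no step holds both keys
by_epoch = group_by_axis(rows, "epoch", ["key1", "key2"])  # one combined row per epoch
```

Under those assumptions the strict per-step filter returns nothing, while grouping by epoch yields one row per epoch with both keys, which is the behavior being requested.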

ArtsiomWB commented 5 days ago

When you say:

More specifically, it is when the two keys don't share the same step.

Just to confirm: in the example above, the epoch for key1 and key2 stays the same, but the step increments because you are calling wandb.log() a second time. You are therefore interested in pulling both keys via the API based on epoch rather than step, so that you can compare them at epoch 1 vs. epoch 1. Comparing them by step would instead pair step 1 with step 2. Does that sound right to you?

MoH-assan commented 5 days ago

This is correct.

Currently, when I try to pull both keys via the API based on epoch, I get an empty DataFrame. The call is:

runs.histories(x_axis="epoch", keys=["key1", "key2"], format="pandas")

ArtsiomWB commented 3 days ago

Thank you for the details! I have gone ahead and submitted the feature request to our engineering team. I will let you know if we have any follow-up questions regarding this feature request!