Closed: harveenchadha closed this issue 1 year ago.
@harveenchadha can you confirm what version of wandb you're using by running wandb --version?
I am on the latest version: 0.12.2
@harveenchadha if you're able to share a colab that can reproduce this, that would really help us get to the bottom of it. We think it's because the summary is too large for us to sync within the time we currently allow. The next release of the library should increase this timeout, but if we can reproduce it ourselves before then, we can likely get you a workaround.
My email is vanpelt@wandb.com if you want to share a private colab.
I can give you a public Kaggle kernel. Here you will find the error. If something is wrong with my code structure, do let me know!
Hey @harveenchadha, we'll need to look into a fix for this ahead of the next release. Until then, I think the simplest solution would be to use the Keras TensorBoard callback and add sync_tensorboard=True to wandb.init.
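For anyone looking for the shape of that workaround, here is a minimal sketch; the project name, model, and data below are placeholders rather than anything from the original report:

import numpy as np
import wandb
from tensorflow import keras

# sync_tensorboard=True tells wandb to mirror whatever TensorBoard logs,
# so the Keras WandbCallback is not needed at all.
wandb.init(project="my-project", sync_tensorboard=True)

# Tiny placeholder model and data, just to make the example runnable.
model = keras.Sequential([keras.layers.Dense(1, input_shape=(10,))])
model.compile(optimizer="adam", loss="mse")
x = np.random.rand(64, 10)
y = np.random.rand(64, 1)

# Log through the standard Keras TensorBoard callback instead of WandbCallback.
model.fit(x, y, epochs=2, callbacks=[keras.callbacks.TensorBoard(log_dir="./logs")])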
This issue is stale because it has been open 60 days with no activity.
Is this still an issue? I'm running into a similar problem on TPU (when using wandb with PyTorch Lightning):
Exception in device=TPU:0: problem
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/torch_xla/distributed/xla_multiprocessing.py", line 331, in _mp_start_fn
_start_fn(index, pf_cfg, fn, args)
File "/usr/local/lib/python3.8/dist-packages/torch_xla/distributed/xla_multiprocessing.py", line 325, in _start_fn
fn(gindex, *args)
File "/home/christoph.minixhofer/lightning/src/pytorch_lightning/strategies/launchers/xla.py", line 100, in _wrapping_function
results = function(*args, **kwargs)
File "/home/christoph.minixhofer/lightning/src/pytorch_lightning/trainer/trainer.py", line 644, in _fit_impl
self._run(model, ckpt_path=self.ckpt_path)
File "/home/christoph.minixhofer/lightning/src/pytorch_lightning/trainer/trainer.py", line 1085, in _run
self._log_hyperparams()
File "/home/christoph.minixhofer/lightning/src/pytorch_lightning/trainer/trainer.py", line 1153, in _log_hyperparams
logger.log_hyperparams(hparams_initial)
File "/home/christoph.minixhofer/.local/lib/python3.8/site-packages/lightning_utilities/core/rank_zero.py", line 24, in wrapped_fn
return fn(*args, **kwargs)
File "/home/christoph.minixhofer/lightning/src/pytorch_lightning/loggers/wandb.py", line 426, in log_hyperparams
self.experiment.config.update(params, allow_val_change=True)
File "/home/christoph.minixhofer/.local/lib/python3.8/site-packages/wandb/sdk/wandb_run.py", line 253, in wrapper
raise e
File "/home/christoph.minixhofer/.local/lib/python3.8/site-packages/wandb/sdk/wandb_run.py", line 248, in wrapper
wandb._attach(run=self)
File "/home/christoph.minixhofer/.local/lib/python3.8/site-packages/wandb/sdk/wandb_init.py", line 841, in _attach
raise UsageError("problem")
wandb.errors.UsageError: problem
@MiniXC and anyone else stumbling upon this (as it seems to be the lone search result for the mysterious "wandb.errors.UsageError: problem"): upgrading PyTorch Lightning to 1.9.0 and wandb to 0.13.10 solved this issue for me.
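For reference, a plain pip upgrade to those versions is one way to apply this; the pins below just mirror the versions mentioned above and can be loosened if you need a different Lightning release:

pip install --upgrade pytorch-lightning==1.9.0 wandb==0.13.10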
Hi @MiniXC, I wanted to follow up on this request. Did you find the solution provided by @mx781 helpful? Please let us know if we can be of further assistance or if your issue has been resolved.
Hi @MiniXC, since we have not heard back from you, we are going to close this request. If you would like to re-open the conversation, please let us know!
Description: Adding the Keras WandbCallback fails and throws an error.
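For context, here is a minimal sketch of the kind of setup that triggers this; the model, data, and project name are placeholders, not the reporter's code:

import numpy as np
import wandb
from tensorflow import keras
from wandb.keras import WandbCallback

wandb.init(project="my-project")

# Tiny placeholder model and data, just to make the example runnable.
model = keras.Sequential([keras.layers.Dense(1, input_shape=(10,))])
model.compile(optimizer="adam", loss="mse")
x = np.random.rand(64, 10)
y = np.random.rand(64, 1)

# Attaching WandbCallback here is the step that fails in this report.
model.fit(x, y, epochs=2, callbacks=[WandbCallback()])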