ashleve / lightning-hydra-template

PyTorch Lightning + Hydra. A very user-friendly template for ML experimentation. ⚡🔥⚡

wandb logger not working #328

Closed losredoe132 closed 2 years ago

losredoe132 commented 2 years ago

Hi there, thank you for this powerful template! I ran into a problem while trying to use wandb as the logger. I used the wandb-callbacks branch, and after running python train.py logger=wandb I get the output below (I cancelled the run after 130 iterations because the wandb login prompt never appeared):

$ python train.py logger=wandb
┌────┬───────────────┬──────────────────┬────────┐
│    │ Name          │ Type             │ Params │
├────┼───────────────┼──────────────────┼────────┤
│ 0  │ model         │ SimpleDenseNet   │  336 K │
│ 1  │ model.model   │ Sequential       │  336 K │
│ 2  │ model.model.0 │ Linear           │  200 K │
│ 3  │ model.model.1 │ BatchNorm1d      │    512 │
│ 4  │ model.model.2 │ ReLU             │      0 │
│ 5  │ model.model.3 │ Linear           │ 65.8 K │
│ 6  │ model.model.4 │ BatchNorm1d      │    512 │
│ 7  │ model.model.5 │ ReLU             │      0 │
│ 8  │ model.model.6 │ Linear           │ 65.8 K │
│ 9  │ model.model.7 │ BatchNorm1d      │    512 │
│ 10 │ model.model.8 │ ReLU             │      0 │
│ 11 │ model.model.9 │ Linear           │  2.6 K │
│ 12 │ criterion     │ CrossEntropyLoss │      0 │
│ 13 │ train_acc     │ Accuracy         │      0 │
│ 14 │ val_acc       │ Accuracy         │      0 │
│ 15 │ test_acc      │ Accuracy         │      0 │
│ 16 │ val_acc_best  │ MaxMetric        │      0 │
└────┴───────────────┴──────────────────┴────────┘
Trainable params: 336 K
Non-trainable params: 0
Total params: 336 K
Total estimated model params size (MB): 1
Epoch 0    ----- ---------------------------------- 130/939 0:00:04 • 0:00:28 29.28it/s loss: 0.252
Error executing job with overrides: ['logger=wandb']

(Note the last line)

Setting logger: wandb in train.yaml does not work either. I'm a bit confused because I had it working once before, and I just don't know what to do anymore. I tried different conda envs with different torch and pl versions. Does anybody have an idea?
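One thing that can be ruled out before training starts is whether a wandb API key is visible to the process at all. The snippet below is only a minimal sketch: it checks the WANDB_API_KEY environment variable, which wandb reads for non-interactive authentication. The helper name wandb_api_key_configured is hypothetical, and the check does not detect credentials stored earlier by the wandb login command, so a negative result is a hint rather than a diagnosis.

```python
import os
from typing import Mapping, Optional


def wandb_api_key_configured(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if a non-empty WANDB_API_KEY is present in the environment.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    (Hypothetical helper, not part of wandb or the template.)
    """
    env = os.environ if env is None else env
    return bool(env.get("WANDB_API_KEY", "").strip())


if __name__ == "__main__":
    if wandb_api_key_configured():
        print("WANDB_API_KEY is set; wandb can authenticate without a prompt.")
    else:
        # Credentials saved by `wandb login` are not inspected here.
        print("WANDB_API_KEY is not set; wandb will rely on stored credentials, if any.")
```

If the variable is missing and no login prompt appears, running wandb login once in the same environment before python train.py logger=wandb is a reasonable next step.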

pip list

Package                 Version
----------------------- ------------
absl-py                 1.1.0
aiohttp                 3.8.1
aiosignal               1.2.0
alembic                 1.8.0
antlr4-python3-runtime  4.8
anyio                   3.6.1
argon2-cffi             21.3.0
argon2-cffi-bindings    21.2.0
asttokens               2.0.5
async-timeout           4.0.2
atomicwrites            1.4.0
attrs                   21.4.0
autopage                0.5.1
Babel                   2.10.1
backcall                0.2.0
beautifulsoup4          4.11.1
black                   22.3.0
bleach                  5.0.0
cachetools              5.2.0
certifi                 2022.5.18.1
cffi                    1.15.0
cfgv                    3.3.1
charset-normalizer      2.0.12
click                   8.1.3
cliff                   3.10.1
cmaes                   0.8.2
cmd2                    2.4.1
colorama                0.4.4
colorlog                6.6.0
commonmark              0.9.1
cycler                  0.11.0
debugpy                 1.6.0
decorator               5.1.1
defusedxml              0.7.1
distlib                 0.3.4
docker-pycreds          0.4.0
entrypoints             0.4
executing               0.8.3
fastjsonschema          2.15.3
filelock                3.7.1
flake8                  4.0.1
fonttools               4.33.3
frozenlist              1.3.0
fsspec                  2022.5.0
gitdb                   4.0.9
GitPython               3.1.27
google-auth             2.6.6
google-auth-oauthlib    0.4.6
greenlet                1.1.2
grpcio                  1.46.3
hydra-colorlog          1.2.0
hydra-core              1.1.0
hydra-optuna-sweeper    1.2.0
identify                2.5.1
idna                    3.3
importlib-metadata      4.11.4
importlib-resources     5.7.1
iniconfig               1.1.1
ipykernel               6.13.0
ipython                 8.4.0
ipython-genutils        0.2.0
isort                   5.10.1
jedi                    0.18.1
Jinja2                  3.1.2
joblib                  1.1.0
json5                   0.9.8
jsonschema              4.6.0
jupyter-client          7.3.1
jupyter-core            4.10.0
jupyter-server          1.17.0
jupyterlab              3.4.2
jupyterlab-pygments     0.2.2
jupyterlab-server       2.14.0
kiwisolver              1.4.2
Mako                    1.2.0
Markdown                3.3.7
MarkupSafe              2.1.1
matplotlib              3.5.2
matplotlib-inline       0.1.3
mccabe                  0.6.1
mistune                 0.8.4
multidict               6.0.2
mypy-extensions         0.4.3
nbclassic               0.3.7
nbclient                0.6.4
nbconvert               6.5.0
nbformat                5.4.0
nest-asyncio            1.5.5
nodeenv                 1.6.0
notebook                6.4.11
notebook-shim           0.1.0
numpy                   1.22.4
oauthlib                3.2.0
omegaconf               2.1.2
optuna                  2.10.0
packaging               21.3
pandas                  1.4.2
pandocfilters           1.5.0
parso                   0.8.3
pathspec                0.9.0
pathtools               0.1.2
pbr                     5.9.0
pickleshare             0.7.5
Pillow                  9.1.1
pip                     21.2.2
platformdirs            2.5.2
pluggy                  1.0.0
pre-commit              2.19.0
prettytable             3.3.0
prometheus-client       0.14.1
promise                 2.3
prompt-toolkit          3.0.29
protobuf                3.20.1
psutil                  5.9.1
pudb                    2022.1.1
pure-eval               0.2.2
py                      1.11.0
pyasn1                  0.4.8
pyasn1-modules          0.2.8
pycodestyle             2.8.0
pycparser               2.21
pyDeprecate             0.3.2
pyflakes                2.4.0
Pygments                2.12.0
pyparsing               3.0.9
pyperclip               1.8.2
pyreadline3             3.4.1
pyrsistent              0.18.1
pytest                  7.1.2
python-dateutil         2.8.2
python-dotenv           0.20.0
pytorch-lightning       1.6.4
pytz                    2022.1
pywin32                 304
pywinpty                2.0.5
PyYAML                  6.0
pyzmq                   23.1.0
requests                2.27.1
requests-oauthlib       1.3.1
rich                    12.4.4
rsa                     4.8
scikit-learn            1.1.1
scipy                   1.8.1
seaborn                 0.11.2
Send2Trash              1.8.0
sentry-sdk              1.5.12
setproctitle            1.2.3
setuptools              61.2.0
sh                      1.14.2
shortuuid               1.0.9
six                     1.16.0
smmap                   5.0.0
sniffio                 1.2.0
soupsieve               2.3.2.post1
SQLAlchemy              1.4.37
stack-data              0.2.0
stevedore               3.5.0
tensorboard             2.9.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit  1.8.1
terminado               0.15.0
threadpoolctl           3.1.0
tinycss2                1.1.1
toml                    0.10.2
tomli                   2.0.1
torch                   1.11.0+cu113
torchaudio              0.11.0+cu113
torchmetrics            0.9.0
torchvision             0.12.0+cu113
tornado                 6.1
tqdm                    4.64.0
traitlets               5.2.2.post1
typing_extensions       4.2.0
urllib3                 1.26.9
urwid                   2.1.2
urwid-readline          0.13
virtualenv              20.14.1
wandb                   0.12.17
wcwidth                 0.2.5
webencodings            0.5.1
websocket-client        1.3.2
Werkzeug                2.1.2
wheel                   0.37.1
wincertstore            0.2
yarl                    1.7.2
zipp                    3.8.0

ashleve commented 2 years ago

The wandb-callbacks branch hasn't been maintained for a while, and it might not work correctly with recent Lightning and Hydra releases.

Have you trained using the main branch?

I'm preparing a new release and will fix the callbacks when it's ready: https://github.com/ashleve/lightning-hydra-template/issues/308

losredoe132 commented 2 years ago

So I managed to get it working using a fresh conda environment: torch==1.10.0 with CUDA 10.2, pytorch-lightning==1.6.4, wandb==0.12.17

I haven't checked whether all the callbacks work properly, but my initial problem is solved. Thank you for your help!
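For anyone comparing their own environment against the combination reported as working above, a small script can list the differences. This is a rough sketch, not part of the template: the names REPORTED_WORKING, installed_version, and find_mismatches are made up for illustration, and the comparison simply ignores local build tags such as +cu102. Matching these versions is not guaranteed to fix the problem, since the thread does not establish which package was at fault.

```python
from importlib import metadata
from typing import Callable, Dict, Mapping, Optional, Tuple

# Versions the issue author reported as working (torch built for CUDA 10.2).
REPORTED_WORKING = {
    "torch": "1.10.0",
    "pytorch-lightning": "1.6.4",
    "wandb": "0.12.17",
}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(
    expected: Mapping[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each differing package to (expected version, found version)."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        # Drop local build tags such as "+cu102" before comparing.
        base = found.split("+", 1)[0] if found else None
        if base != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(REPORTED_WORKING).items():
        print(f"{name}: expected {wanted}, found {found}")
```

Passing a custom lookup function makes it possible to check a saved pip list from another machine without installing anything locally.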