Lightning-AI / pytorch-lightning

Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.
https://lightning.ai
Apache License 2.0

Lightning no longer works with non-primitive types in hparams #1095

Closed: monney closed this issue 4 years ago

monney commented 4 years ago

🐛 Bug

I often pass non-primitive objects, such as a layer definition, into hparams, for things like switching from batch norm to group norm or using a custom layer. Now that Lightning uses the hparams feature in TensorBoard, it errors out when these appear in hparams.

Code sample

from pytorch_lightning.loggers import TensorBoardLogger
from argparse import Namespace

params = Namespace(foo=range(10))  # non-primitive hparam value triggers the error
logger = TensorBoardLogger(save_dir='.')
logger.log_hyperparams(params)

Expected behavior

Lightning should convert non-primitive types to strings before passing them to the summary writer. This worked fine when hparams were logged as a text object.
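For illustration, a minimal sketch of what that conversion could look like (sanitize_params is a hypothetical helper, not Lightning's actual code):

# Hypothetical helper: cast anything the TensorBoard hparams plugin
# cannot handle natively to its string representation.
def sanitize_params(params: dict) -> dict:
    primitives = (bool, int, float, str)
    return {k: v if isinstance(v, primitives) else str(v)
            for k, v in params.items()}

print(sanitize_params({"lr": 0.01, "foo": range(10)}))
# {'lr': 0.01, 'foo': 'range(0, 10)'}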

Environment

- PyTorch 1.4
- Ubuntu 16
- Installed via pip (pip install pytorch_lightning)
- Python 3.7

github-actions[bot] commented 4 years ago

Hi! Thanks for your contribution, great first issue!

awaelchli commented 4 years ago

First of all, I am not sure it's good practice to pass in objects like you described and call them "hyperparameters", but that's a different discussion.

log_hyperparams is not specific to TensorBoardLogger; it is available in the base class for all loggers. It should be each logger's responsibility to handle the types of objects passed in as hparams; maybe not all loggers support the same range of types (I don't know). Converting to str seems reasonable to me, but to offer maximum compatibility, I would first check which types each logger supports and let each logger class handle it on its own (maybe some of the other loggers already convert to str internally by default).
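A rough sketch of that division of responsibility, with illustrative class names rather than Lightning's real logger hierarchy:

# Illustrative only: the base class defines the contract, and each
# concrete logger coerces whatever its backend cannot accept.
class BaseLogger:
    def log_hyperparams(self, params: dict) -> None:
        raise NotImplementedError

class StringCoercingLogger(BaseLogger):
    SUPPORTED = (bool, int, float, str)

    def log_hyperparams(self, params: dict) -> None:
        # cast unsupported values to str before handing them to the backend
        safe = {k: v if isinstance(v, self.SUPPORTED) else str(v)
                for k, v in params.items()}
        self._write(safe)

    def _write(self, params: dict) -> None:
        print(params)  # stand-in for the real summary-writer call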

monney commented 4 years ago

I agree the base logger should not cast to string, for the reasons you mentioned. The TensorBoard logger should, though, at the very least, especially since it is the default logger. It's a one-line code change and won't affect you if you only use primitive types. I'm not sure about the other loggers, as I'm not aware of how they handle non-primitives.

Borda commented 4 years ago

Interesting point. @monney could you please sketch a use case where you need to log a sequence of parameters? Moreover, range as a function won't be serializable... Converting to a string lets us save it, but it won't be easy to decode after loading... we could do something like try to decode:

param = "[1, 3, 5, 9]"
try:
    param = eval(param)
except Exception:
    pass
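A safer variant of the same round-trip would be ast.literal_eval, which only parses Python literals (lists, dicts, numbers, strings) and rejects arbitrary expressions:

import ast

param = "[1, 3, 5, 9]"
try:
    # literal_eval only accepts Python literals, so untrusted hparam
    # strings cannot execute arbitrary code the way eval() can
    param = ast.literal_eval(param)
except (ValueError, SyntaxError):
    pass  # keep the raw string if it is not a valid literal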
monney commented 4 years ago

@Borda Use cases for various non-primitives:

Sequences: number of layers as one hparam, or per-layer filter size or stride for each conv layer. I use this to test topology changes across different resolutions.

Current alternative: passing in a string to be decoded in the model, something like '[3,5,3,5,3,5,5]'.

Namespaces: Better organization of parameters

Current alternative: prefixes on names, like "backbone_init".

Functions: swapping out compatible layers such as BatchNorm and InstanceNorm, or changing activation functions (a short sketch follows below). BigGAN's repo is a good example of doing this kind of thing: https://github.com/ajbrock/BigGAN-PyTorch

Current alternative: a mapping within the model, like "gn" to use GroupNorm or "bn" to use BatchNorm.

I try to make hparams as complete a description of the experiment as possible, so non-primitives are helpful for this, as the alternatives are messier.
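For illustration, a minimal sketch of the functions-as-hparams pattern described above (the layer choice and stride values are made up):

import torch.nn as nn
from argparse import Namespace

# hypothetical hparams: the layer constructor itself is the hyperparameter
hparams = Namespace(
    norm_layer=nn.BatchNorm2d,   # could be nn.InstanceNorm2d, or a GroupNorm factory
    strides=[1, 2, 2, 2],        # per-layer stride sequence
)

block = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, stride=hparams.strides[0], padding=1),
    hparams.norm_layer(64),      # the constructor is called like any callable
    nn.ReLU(inplace=True),
)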

Decoding:

I think it's reasonable to use eval and assume an appropriate repr for whatever is passed in, with a warning if something fails. Currently I manually reconstruct hparams and build the model from there.

Pickling the whole hparams, or each non-primitive, also seems reasonable to me for full reproducibility.
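A quick sketch of that pickle round-trip, assuming the non-primitive values are picklable:

import pickle
from argparse import Namespace
import torch.nn as nn

# classes pickle by reference, so the exact layer type round-trips
params = Namespace(norm_layer=nn.BatchNorm2d, strides=[1, 2, 2])
restored = pickle.loads(pickle.dumps(params))
assert restored.norm_layer is nn.BatchNorm2d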

Borda commented 4 years ago

Sounds good, would you mind sending a PR with these suggestions?