megvii-research / hpman

A hyperparameter manager for deep learning experiments.
MIT License

Support for nested hyperparameters #3

Closed bigeagle closed 2 years ago

bigeagle commented 5 years ago

It is quite common to use a nested config file for readability.

For example, the author has a YAML config like this:

discriminator:
  in_channels: 3
  spectral: true
  norm: 'instance'
  activation: 'leaky_relu'
  residual: true
  input_size: [512, 512]

Supporting nested hyperparameter definitions could extend the reach of this library further.

bigeagle commented 5 years ago

Syntax Proposal One (The change-nothing way)

d = _('discriminator', {
    'in_channels': 3,
    'spectral': True,
})
d['in_channels']  # permitted
d['in_channels'] = 4  # not captured by hpman
d['hahaha'] = 5  # not captured by hpman either

Pros:

  1. Zero implementation cost.

Cons:

  1. Users have to be aware of exactly what is going on and handle the returned dict with care, which imposes a greater mental burden when using this library.
  2. Error-prone.

Syntax Proposal Two (The flat way)

Akin to Python import paths, a dot in a hyperparameter name indicates hierarchy:

_('discriminator.in_channels', 3)
_('discriminator.spectral', True)

The library would automatically group these hyperparameters by the implicit tree structure defined by these hyperparameter names.
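
The grouping step could look roughly like the following. This is a minimal sketch, assuming hyperparameters are held as a flat dict keyed by dotted names; group_by_prefix is a hypothetical helper, not part of hpman.

```python
def group_by_prefix(flat):
    """Build a nested dict from dotted hyperparameter names."""
    tree = {}
    for name, value in flat.items():
        *parents, leaf = name.split('.')
        node = tree
        for part in parents:
            # Create intermediate levels on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return tree


flat = {
    'discriminator.in_channels': 3,
    'discriminator.spectral': True,
    'model_shape': 0.2,
}
tree = group_by_prefix(flat)
print(tree)
# {'discriminator': {'in_channels': 3, 'spectral': True}, 'model_shape': 0.2}
```

The sketch assumes that no name is both a leaf and a prefix of another name.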

Pros:

  1. Easy to implement; adheres to the current design
  2. Explicit. Straightforward to understand, low mental burden.

Cons:

  1. The code is tedious: the same prefix is repeated many times.

Syntax Proposal Three (The attribute way)

Huazuo once proposed an alternative syntax, along with an implementation, here.

However, it still does not solve the "repeating problem", as the example usage indicates:

from libhpman import _

print(_.a.b.c('default')) # default
print(_.a.b.c()) # default

_.a.b.c._ = 'hey' # Currently assignments must go to the "_" attribute
print(_.a.b.c('default')) # hey

Following this path, we can extend this syntax further:

from hpman.m import _
d = _.discriminator
d.in_channels(3) # setter
d.spectral(2)  # setter
d.spectral()  # getter

Pros:

  1. Write less. No repeating yourself.

Cons:

  1. Implementation might be tricky. It can only be expected to handle a limited set of cases, and users may often step outside the design boundary.
  2. Readability of the code may drop quickly. It is too easy to write intricate and confusing code using this pattern, and the code becomes implicit.

bigeagle commented 5 years ago

Syntax Proposal Four

from hpman.m import hp
hp = hp.namespace('dataset')
w = hp("crop_rate", 0.2)

Pros:

  1. Almost no API modification, people can still use global namespace without the new API
  2. Runtime is easy to implement

Cons:

  1. The parser has to track state changes

More complicated case:

from hpman.m import hp

def some_function():
    hp = hp.namespace('dataset')
    w = hp("crop_rate", 0.2)

def some_other():
    hp = hp.namespace('train')
    w = hp("weight_decay", 0.2)

def main():
    w = hp("model_shape", 0.2)

At the Python level, the hyperparameters are dataset.crop_rate, train.weight_decay and model_shape. However, the parser can hardly determine which namespace each hp refers to.
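
The runtime side of this proposal could be sketched as below. HP is a hypothetical class, not hpman's API. The sketch binds the namespaced object to a different local name, because assigning to hp inside a function makes hp local to that function, so the right-hand side hp.namespace(...) would raise UnboundLocalError.

```python
class HP:
    """Hypothetical runtime for the namespace proposal (not hpman's API)."""

    def __init__(self, store=None, prefix=''):
        self.store = {} if store is None else store  # shared across namespaces
        self.prefix = prefix

    def namespace(self, name):
        # Return a new callable bound to a longer prefix; the store is shared.
        return HP(self.store, f'{self.prefix}{name}.')

    def __call__(self, name, default):
        return self.store.setdefault(self.prefix + name, default)


hp = HP()


def some_function():
    hp_dataset = hp.namespace('dataset')  # a distinct local name
    return hp_dataset('crop_rate', 0.2)


def main():
    return hp('model_shape', 0.2)


some_function()
main()
print(sorted(hp.store))  # ['dataset.crop_rate', 'model_shape']
```

The full names only exist once the calls have run, which is why a static parser has a much harder time than the runtime does.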

bigeagle commented 5 years ago

I would like to add some opinions on proposal four:

Pros:

  1. write less code

Cons:

  1. The hp = hp.namespace('dataset') usage is quite confusing: not only the parser but also the user must keep track of state. The user pays an additional cost to recognize that hp("crop_rate") is dataset.crop_rate while hp("model_shape") is model_shape. However, if you write more to make it clearer, e.g.,
    hp_dataset = hp.namespace('dataset')
    w = hp_dataset("crop_rate", 0.2)

    then you write about the same amount of code as in proposal two, which loses the "write less code" edge.


A final thought: should we set up some principles like "stateless is better than stateful for explicitness and less mental burden"?

bigeagle commented 5 years ago

Syntax Proposal Five

from hpman.m.dataset import hp
w = hp("crop_rate", 0.2)

Pros:

  1. Write less code
  2. Easy to parse

Cons:

  1. The singleton HyperParamManager needs to be well implemented

  2. What if I want both from hpman.m.dataset import hp and from hpman.m.train import hp at the same time?

  3. In proposal five, is hp a fixed name, or will any arbitrary name do?

    • In the former case, someone would certainly complain about name conflicts.
    • In the latter case, how would you deal with the following:
      from hpman.m.dataset import hp0
      from hpman.m.train import hp1
      u = hp0("crop_rate", 0.2)
      v = hp1("model_shape", 0.2)

      How would you parse u and v at the same time?

bigeagle commented 5 years ago

New Design

@bigeagle came up with a potentially better design:

e.g.:

from hpman.m import _
_('a.b', 1)
_('c', {'d': 1})

_('a.b') # returns 1
_('c')  # returns {'d': 1}
_('c.d')  # returns 1
_('a') # returns {'b': 1}

On top of this, I would propose one more design choice: do not forbid setting a parent hyperparameter when children already exist, e.g.:

from hpman.m import _
_('a.b', 1)
_('a', {'c': 1})
_('a.c')

The behavior of setting a parent hyperparameter and its children at the same time would be left undefined.

It is the responsibility of the user to be consistent.

Drawbacks

However, this would not only complicate the static analysis but also make the code harder to comprehend. Borrowing the example above:

_('a.b', 1)
_('c', {'d': 1})

_('a.b')  # pass static analysis, runtime returns 1
_('c')  # pass static analysis, runtime returns {'d': 1}
_('c.d')  # hard example, pass static analysis, runtime returns 1.
_('a')  # hard example, pass static analysis, runtime returns {'b': 1} 

_.set_value('c', {'e': 1})
_('c.e')  # fail static analysis, runtime returns 1
_('c.d')  # pass static analysis, runtime raises KeyError

In the _('c.d') case above, the user would have no idea what it means unless they find the definition of either 'c' or 'c.d'.

Furthermore, a user who encounters _('a.b.c.d') would struggle to work out the type of the data, since that requires finding the definitions of all its prefixes: 'a', 'a.b', 'a.b.c' and 'a.b.c.d'.

This is way too hard to understand for a newcomer.
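
The gap between what static analysis sees and what happens at runtime can be made concrete with a small checker built on Python's standard ast module. This is a sketch under assumptions, not hpman's actual parser: declared_and_used is a hypothetical helper, a name counts as resolved only if it is declared directly or appears as a string key of a dict-literal default, and reads of a prefix such as 'a' are not handled.

```python
import ast

SOURCE = """
_('a.b', 1)
_('c', {'d': 1})
_('c.d')
_('c.e')
"""


def declared_and_used(source):
    """Split `_('name', ...)` calls into declarations and bare reads."""
    declared, used = set(), set()
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == '_'
                and node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            continue
        name = node.args[0].value
        if len(node.args) == 1:
            used.add(name)  # a read without a default
            continue
        declared.add(name)
        # A dict-literal default also declares its string keys as children.
        if isinstance(node.args[1], ast.Dict):
            for key in node.args[1].keys:
                if isinstance(key, ast.Constant) and isinstance(key.value, str):
                    declared.add(f'{name}.{key.value}')
    return declared, used


declared, used = declared_and_used(SOURCE)
unresolved = sorted(used - declared)
print(unresolved)  # ['c.e']
```

A value injected later through set_value is invisible to such a checker, which is why 'c.e' is flagged even though it could exist at runtime.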

bigeagle commented 5 years ago

f52467c3 implements the new design.

_.set_value('c', {'e': 1}) hardly ever happens, since set_value is usually used to load hyperparameters from a db, a yml file, etc. If nested hyperparameters are preferred, users would usually call _.set_tree(tree).

Further, in the current version, where nested hyperparameters are not available, this case fails as well:

_('c', {'d': 1})
_.set_value('c', {'e': 1})

_('c')['d']  # fail

The fundamental issue is: do we need an hp structure with a schema?

bigeagle commented 5 years ago

Implementing proposal two has hit a dead end: the tree structure gets mixed up with dict-typed values.

The root cause is that the current hpmgr implementation has only a flattened key-value structure. In an inner-tree implementation, there is no way to distinguish a subtree from a dict, regardless of the surface syntax.

I propose that the hp declarations in code define the schema (the inner structure), and that _.set_value sets hp values according to both that schema and the value passed by the caller.
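
One possible reading of this proposal is sketched below. It is only an illustration under assumptions: SchemaStore, declare and leaves are hypothetical names, and the rule used here is that declared names are leaves whose values are stored whole, while a dict passed for an undeclared prefix is distributed over the declared leaves beneath it.

```python
class SchemaStore:
    """Hypothetical schema-aware store (not hpman's implementation)."""

    def __init__(self):
        self.values = {}     # flat: dotted name -> value
        self.leaves = set()  # names declared in code; these form the schema

    def declare(self, name, default):
        self.leaves.add(name)
        return self.values.setdefault(name, default)

    def set_value(self, name, value):
        prefix = name + '.'
        if name in self.leaves:
            # Declared leaf: the value is stored whole, even if it is a dict.
            self.values[name] = value
        elif isinstance(value, dict) and any(
                leaf.startswith(prefix) for leaf in self.leaves):
            # Declared subtree: distribute the dict over its children.
            for key, child in value.items():
                self.set_value(prefix + key, child)
        else:
            raise KeyError(f'{name!r} is not part of the declared schema')


store = SchemaStore()
store.declare('a.b', 1)         # 'a' is a subtree with leaf 'b'
store.declare('c', {'d': 1})    # 'c' is a leaf whose value is a dict
store.set_value('a', {'b': 2})  # routed to the leaf 'a.b'
store.set_value('c', {'e': 1})  # replaces the dict value of leaf 'c'
print(store.values)  # {'a.b': 2, 'c': {'e': 1}}
```

With the schema fixed by the declarations, the same dict argument is handled differently for 'a' and 'c', which is the distinction a purely flat or purely nested store cannot make.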

hughplay commented 3 years ago

Any updates on nested hyperparameters? It seems that @bigeagle has already implemented this feature, but the latest version, 0.10.0, does not include it yet?

By the way, I have been using hpman and hpargparse in my training recently. They are awesome and easy to use. 😄