omry / omegaconf

Flexible Python configuration system. The last one you will ever need.
BSD 3-Clause "New" or "Revised" License

Look into supporting Union in Structured configs #144

Open omry opened 4 years ago

omry commented 3 years ago

Bouncing to 2.2.

jieru-hu commented 2 years ago

Comments from the recipe team:

Does this Union support include the new syntax being introduced (e.g. TypeA | TypeB)?

kandluis commented 2 years ago

For context, this is the proposal (https://www.python.org/dev/peps/pep-0604/), which was introduced in Python 3.10 (now also available at Meta).

It's a big readability win. I'd say it's of middle-to-low importance for us, but I'm just checking whether it'll work with the support here in Hydra. I suspect so, since I assume Union[A, B, ...] and A | B | ... translate to the same underlying representation, so it might be entirely transparent to you.

Thanks for thinking of this!

Jasha10 commented 2 years ago

Yes, it should be possible to support PEP 604 in structured config type hints.
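
A minimal sketch (plain typing introspection, not OmegaConf code) of why the two spellings should look the same to a library inspecting annotations, assuming Python 3.10+:

from dataclasses import dataclass
from typing import Union, get_args, get_type_hints


@dataclass
class Old:
    x: Union[int, str]


@dataclass
class New:
    x: int | str  # PEP 604 syntax, Python 3.10+


old_hint = get_type_hints(Old)["x"]  # typing.Union[int, str]
new_hint = get_type_hints(New)["x"]  # int | str (types.UnionType)
# Both annotations unpack to the same member types.
assert get_args(old_hint) == get_args(new_hint) == (int, str)

One caveat: get_origin differs between the two (typing.Union vs types.UnionType), so annotation handling would still need to check for both forms.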

peterdavidfagan commented 5 months ago

Does there exist a workaround for this issue? I have model parameters, like kernel dimension, whose type is Union[int, Sequence[int]], and this is causing issues for OmegaConf.

encoder:
  _target_: multi_modal_transformers.tokenizers.images.image_tokenizer.ImageTokenizer
  image_size: [280, 280, 3]
  patch_size: 56
  normalize: True # normalizes the input image pixel values to range [-1, 1]
  position_interval: 128 # number of discrete values for encoding patch positions
  rng_collection: patch_encoding # pseudo-random number parameter name
  embedding_dim: 512

  # positional embedding layers

  row_position_embedding:
    _target_: flax.linen.Embed
    name: "image_row_position_embedding"
    num_embeddings: ${tokenizers.images.encoder.position_interval}
    features: 512
    dtype: ${dtype}
    param_dtype: ${param_dtype}
    embedding_init:
      _target_: flax.linen.initializers.variance_scaling
      scale: 1.0
      mode: "fan_in"
      distribution: "normal"
      dtype: ${dtype}

  col_position_embedding:
    _target_: flax.linen.Embed
    name: "image_col_position_embedding"
    num_embeddings: ${tokenizers.images.encoder.position_interval}
    features: 512
    dtype: ${dtype}
    param_dtype: ${param_dtype}
    embedding_init:
      _target_: flax.linen.initializers.variance_scaling
      scale: 1.0
      mode: "fan_in"
      distribution: "normal"
      dtype: ${dtype}

  # input projection layer
  resnet:
    _target_: multi_modal_transformers.tokenizers.images.image_tokenizer.ResNetV2Block
    num_blocks: 2
    input_conv:
      _target_: flax.linen.Conv
      features: 64
      kernel_size: [12, 12]
      strides: [2, 2]
      padding: VALID
      use_bias: True
      dtype: ${dtype}
      param_dtype: ${param_dtype}
      kernel_init:
        _target_: flax.linen.initializers.he_normal
        dtype: ${param_dtype}
      bias_init:
        _target_: flax.linen.initializers.normal
        dtype: ${param_dtype}

    input_pool:
      _partial_: true
      _target_: flax.linen.max_pool
      window_shape: [3,3]
      strides: [1,1]
      padding: VALID

    # resnet blocks
    resnet_norm:
      _target_: flax.linen.GroupNorm
      num_groups: 32
      epsilon: 1e-6
      dtype: ${dtype}
      param_dtype: ${param_dtype}

    resnet_activation:
      _partial_: true
      _target_: flax.linen.gelu

    resnet_conv:
      _target_: flax.linen.Conv
      features: 64
      kernel_size: [3,3]
      strides: [1,1]
      padding: SAME
      use_bias: True
      dtype: ${dtype}
      param_dtype: ${param_dtype}
      kernel_init:
        _target_: flax.linen.initializers.he_normal
        dtype: ${param_dtype}
      bias_init:
        _target_: flax.linen.initializers.normal
        dtype: ${param_dtype}

    # output_layer
    output_dense:
      _target_: flax.linen.Dense
      features: 512
      kernel_init:
        _target_: flax.linen.initializers.he_normal
        dtype: ${param_dtype}
      bias_init:
        _target_: flax.linen.initializers.normal
        dtype: ${param_dtype}

Error:

omegaconf.errors.ConfigValueError: Unions of containers are not supported:
strides: Union[int, Sequence[int]]
    full_key: 
    object_type=None

I seem to get this when I have nested configuration files whose _target_ entries point to callables with Union-typed parameters.
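
For reference, a minimal sketch that should reproduce the same error outside the full model code, using the annotation shown in the traceback (ConvConfig is a hypothetical name, current OmegaConf behaviour assumed):

from dataclasses import dataclass
from typing import Sequence, Union

from omegaconf import OmegaConf


@dataclass
class ConvConfig:
    strides: Union[int, Sequence[int]] = 1


# Expected to raise omegaconf.errors.ConfigValueError, because one member of
# the Union (Sequence[int]) is a container type, which is not supported yet.
OmegaConf.structured(ConvConfig)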

odelalleau commented 5 months ago

Does there exist a workaround for this issue?

Using Any for typing should work.
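
That is, relaxing the annotation in the config dataclass, e.g. for the hypothetical ConvConfig sketched above:

from dataclasses import dataclass
from typing import Any

from omegaconf import OmegaConf


@dataclass
class ConvConfig:
    # Any skips type validation, so both int and list values are accepted.
    strides: Any = 1


cfg = OmegaConf.structured(ConvConfig)
cfg.strides = [2, 2]  # no ConfigValueError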

peterdavidfagan commented 5 months ago

Using Any for typing should work.

Thanks @odelalleau, that makes sense. I was hoping I could do something that wouldn't involve editing the function signatures of a dependent library; I could do so, but I wanted to understand whether I could do something with OmegaConf itself to circumvent this error.

I've added both correct and relaxed (Any) type annotations in my own code (the relaxed version is for testing purposes), but it looks like the underlying libraries I am using are causing the issue. For instance, the module below is instantiated in my architecture, and it contains submodules whose parameters are typed as Unions (e.g. Conv).

# Imports assumed from context: Flax/JAX for the module, Hydra's instantiate/call.
from typing import Any

import flax.linen as nn
import jax.numpy as jnp
from hydra.utils import call, instantiate


class ResNetV2Block(nn.Module):
    num_blocks: Any
    # input projection layer
    input_conv: Any
    input_pool: Any
    # resnet blocks
    resnet_norm: Any
    resnet_activation: Any
    resnet_conv: Any
    # output_layer
    output_dense: Any

    @nn.compact
    def __call__(self, x):
        # start with convolution projection
        x = instantiate(self.input_conv)(x)
        x = call(self.input_pool)(x)

        # resnetv2block
        residual = x

        for _ in range(self.num_blocks):
            x = instantiate(self.resnet_norm)(x)
            x = call(self.resnet_activation)(x)
            x = instantiate(self.resnet_conv)(x)

        if residual.shape != x.shape:
            residual = instantiate(self.resnet_conv)(residual)

        x = x + residual

        # flatten output
        x = jnp.reshape(x, (*x.shape[:3], -1))
        x = instantiate(self.output_dense)(x)

        return x

This ResNet block is instantiated as the tokenizer within a larger architecture that I wanted to manage with Hydra.

odelalleau commented 5 months ago

@peterdavidfagan You need to change it in the structured config definition (the dataclass).

peterdavidfagan commented 5 months ago

@peterdavidfagan You need to change it in the structured config definition (the dataclass).

Ah ok, I am still a bit of a Hydra noob; I'll reread the docs and update my resolution here for reference. I have simply been reading my config YAML as follows, without interacting directly with the structured config definition (I'm starting to read up on this now):

import hydra.core.global_hydra
from hydra import compose, initialize

hydra.core.global_hydra.GlobalHydra.instance().clear()
initialize(version_base=None, config_path="../../model_configs", job_name="gato")
GATO_CONFIG = compose(config_name="gato_base")

Ah ok, the docs are super helpful. Thanks @odelalleau, your response helped get me on the right track; I probably should have consulted the docs on structured configs in greater depth.

odelalleau commented 5 months ago

@peterdavidfagan glad I could help, but from what you shared it looks like you may not actually be using structured configs, since you're just using the compose API with YAML files (unless there is some other code -- maybe triggered when importing another library -- that defines the structured config in a dataclass?)
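
For reference, a structured config in this sense is a dataclass registered with Hydra's ConfigStore, roughly like the sketch below (field names borrowed from the YAML above, defaults and schema name illustrative):

from dataclasses import dataclass, field
from typing import Any, List

from hydra.core.config_store import ConfigStore


@dataclass
class ImageTokenizerConfig:
    _target_: str = "multi_modal_transformers.tokenizers.images.image_tokenizer.ImageTokenizer"
    image_size: List[int] = field(default_factory=lambda: [280, 280, 3])
    patch_size: int = 56
    embedding_dim: int = 512
    # Relaxed to Any to sidestep the union-of-containers limitation.
    resnet: Any = None


cs = ConfigStore.instance()
cs.store(name="image_tokenizer_schema", node=ImageTokenizerConfig)
# The schema can then be referenced from a YAML config's defaults list so that
# composed configs are validated against these types.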

peterdavidfagan commented 5 months ago

Thanks @odelalleau, in the end I fixed the bug in my code. I did try manually altering the ConfigStore before composing my configs, specifying the types for various fields in a dataclass definition; unfortunately this didn't work, but that was likely because my bug was caused by something else.

What was happening in my case was that the _recursive_ flag in one of my instantiate calls was set to True rather than False. Instead of passing a DictConfig to subsequent calls to instantiate, my code was passing the already-instantiated layer, and having recursive set to true resulted in the error I was getting.

I'm super happy to have refactored my Hydra config; it now makes my code very succinct and makes it easy to dynamically configure different versions of my architecture.
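
For anyone hitting the same symptom, a minimal sketch of that fix (config paths are illustrative rather than the exact project layout):

from hydra import compose, initialize
from hydra.utils import instantiate

with initialize(version_base=None, config_path="../../model_configs"):
    cfg = compose(config_name="gato_base")

# _recursive_=False keeps nested nodes (input_conv, resnet_conv, ...) as
# DictConfig objects, so ResNetV2Block can instantiate them itself in __call__.
resnet_block = instantiate(cfg.encoder.resnet, _recursive_=False)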

mshvartsman commented 2 months ago

Are there any updates on this functionality? It would be very useful for cases like the following:

from dataclasses import dataclass

@dataclass
class TrainingConfig:
    optimizer_config: SGDConfig | AdamConfig

(where different optimizers might have different configs). I see https://github.com/omry/omegaconf/pull/913 mentions that:

Unions of containers (e.g. Union[List[str], Dict[str, str], MyStructuredConfig]) are not yet supported. This will come in a followup PR.

but I have not seen any other updates.
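
For context, a short sketch of the current state of things, assuming a recent OmegaConf release that includes PR #913: a union of primitive types validates fine, while a union of structured configs such as SGDConfig | AdamConfig still has to be relaxed to Any for now (AdamConfig's fields here are illustrative):

from dataclasses import dataclass, field
from typing import Any, Union

from omegaconf import OmegaConf


@dataclass
class AdamConfig:
    lr: float = 1e-3


@dataclass
class TrainingConfig:
    # Supported: a union of primitive types.
    seed: Union[int, str] = 0
    # Not yet supported: SGDConfig | AdamConfig (a union of containers),
    # so relax the annotation to Any in the meantime.
    optimizer_config: Any = field(default_factory=AdamConfig)


print(OmegaConf.to_yaml(OmegaConf.structured(TrainingConfig)))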

alexdremov commented 1 month ago

Are there updates on this?

@Jasha10 @omry

Jasha10 commented 1 month ago

@alexdremov no updates at this time. Currently OmegaConf is being maintained by volunteers -- we would need someone to take on the implementation and write tests.

alexdremov commented 1 month ago

@alexdremov no updates at this time. Currently OmegaConf is being maintained by volunteers -- we would need someone to take on the implementation and write tests.

Got it. Thanks for your reply!

Maybe I'll try to dig into it soon