omry / omegaconf


Merging Multiple Configs with Interpolation has Unexpected Precedence #1184

Open mharradon opened 1 month ago

When merging multiple configurations, it would be nice (arguably expected) if interpolations were resolved as lazily as possible, i.e. only at the last possible moment, so that other configs can still modify the source value. Currently, merging proceeds from left to right, and an interpolation is expanded as soon as a config updates one of its children, so later updates to the "parent" (source) value do not reach the already-expanded copy.

A motivating example of when this might arise:

from omegaconf import OmegaConf

config = OmegaConf.create({
    "base": {"a": 2, "b": 3},
    "subtype1": "${base}"
})

subtype_tweak = OmegaConf.create({
    "subtype1": {"a": 2.2}
})

user_updates = OmegaConf.create({
    "base": {"b": 3.3}
})

merged = OmegaConf.merge(config, subtype_tweak, user_updates)

print(merged.subtype1)

# Expect (or prefer):
# {'a': 2.2, 'b': 3.3}

# Get:
# {'a': 2.2, 'b': 3}

Reordering the configs does not solve this either. For example:

config = OmegaConf.create({
    "base": {"a": 0, "base1": 1, "base2": 2},
    "subtype1": "${base}",
    "subtype2": "${base}"
})

tweak1 = OmegaConf.create({
    "subtype1": {"a": 0.1},
    "base": {"base1": 1.1}
})

tweak2 = OmegaConf.create({
    "subtype2": {"a": 0.2},
    "base": {"base2": 2.1}
})

# Order 1: tweak1 before tweak2
merged = OmegaConf.merge(config, tweak1, tweak2)
print(merged.subtype1, merged.subtype2)

# Prefer:
# {'a': 0.1, 'base1': 1.1, 'base2': 2.1} {'a': 0.2, 'base1': 1.1, 'base2': 2.1}

# Get:
# {'a': 0.1, 'base1': 1, 'base2': 2} {'a': 0.2, 'base1': 1.1, 'base2': 2}

# Order 2: tweak2 before tweak1
merged = OmegaConf.merge(config, tweak2, tweak1)
print(merged.subtype1, merged.subtype2)

# Prefer:
# {'a': 0.1, 'base1': 1.1, 'base2': 2.1} {'a': 0.2, 'base1': 1.1, 'base2': 2.1}

# Get:
# {'a': 0.1, 'base1': 1, 'base2': 2.1} {'a': 0.2, 'base1': 1, 'base2': 2}

I believe this can be implemented in a manner still consistent with the existing precedence rule (keys from configs further to the right overwrite those to the left), as long as there are no reference loops. A simple algorithm would partially merge the configs from left to right while deferring and retaining any merges onto interpolated values. The interpolated values would then be expanded one by one, in topological order of their dependencies, and any deferred merges onto each expanded value would be applied left to right. This would proceed through the topological order until all interpolated values are expanded and updated.
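
The deferred-merge idea above can be sketched on plain Python dicts. This is only an illustration under simplifying assumptions, not OmegaConf code: the names `lazy_merge`, `deep_merge` and `ref_target` are hypothetical, interpolations are limited to whole-node references of the form `${key}` that point at top-level dict-valued keys, and `graphlib` requires Python 3.9 or later.

```python
import copy
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Assumption: only whole-value references to a top-level key, e.g. "${base}".
_REF = re.compile(r"^\$\{(\w+)\}$")


def ref_target(value):
    """Return the referenced top-level key if value is '${key}', else None."""
    if isinstance(value, str):
        match = _REF.match(value)
        if match:
            return match.group(1)
    return None


def deep_merge(dest, src):
    """Merge src into dest in place; values from src win."""
    for key, value in src.items():
        if isinstance(value, dict) and isinstance(dest.get(key), dict):
            deep_merge(dest[key], value)
        else:
            dest[key] = copy.deepcopy(value)


def lazy_merge(*configs):
    """Hypothetical lazy merge: defer dict updates that target interpolations."""
    merged = {}
    deferred = {}  # key -> list of dict updates queued for that interpolation

    # Pass 1: merge left to right, queueing updates aimed at interpolations.
    for config in configs:
        for key, value in config.items():
            current = merged.get(key)
            if isinstance(value, dict) and ref_target(current) is not None:
                deferred.setdefault(key, []).append(value)
            elif isinstance(value, dict) and isinstance(current, dict):
                deep_merge(current, value)
            else:
                merged[key] = copy.deepcopy(value)
                deferred.pop(key, None)  # a full overwrite drops queued updates

    # Pass 2: order keys so every interpolation target precedes its users.
    graph = {}
    for key, value in merged.items():
        target = ref_target(value)
        graph[key] = {target} if target is not None else set()

    def resolve(key):
        """Follow a chain of references until a concrete value is reached."""
        value = merged[key]
        target = ref_target(value)
        while target is not None:
            value = merged[target]
            target = ref_target(value)
        return value

    # static_order() raises graphlib.CycleError if there is a reference loop.
    for key in TopologicalSorter(graph).static_order():
        if key in deferred:
            expanded = copy.deepcopy(resolve(key))
            for update in deferred[key]:  # apply queued updates left to right
                deep_merge(expanded, update)
            merged[key] = expanded
    return merged


config = {"base": {"a": 2, "b": 3}, "subtype1": "${base}"}
subtype_tweak = {"subtype1": {"a": 2.2}}
user_updates = {"base": {"b": 3.3}}

result = lazy_merge(config, subtype_tweak, user_updates)
print(result["subtype1"])  # {'a': 2.2, 'b': 3.3}
```

In this sketch, interpolations that no config updates are left as `${...}` strings, so they stay lazy. Applied to the two-tweak example above, the same function yields the preferred values for `subtype1` and `subtype2` in either merge order.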

I'll readily admit this is a complex use case, but if the semantics I describe are well-defined (as I believe they are), I think they would be preferable to the current behavior of eager expansion.

Apologies if this has been discussed elsewhere - I was unable to find anything.

Thanks!
