omry / omegaconf

Flexible Python configuration system. The last one you will ever need.
BSD 3-Clause "New" or "Revised" License
1.98k stars 112 forks source link

Discussion: policy on cycles in configs #661

Closed Jasha10 closed 3 years ago

Jasha10 commented 3 years ago

Question: If an operation on cyclic configs can cause the program to crash, how should we handle this? We could:

Background / Motivation:

Recently I discovered that calling x = OmegaConf.create({'x': '${}'}); x == x causes the python process to crash on my computer:

>>> x = OmegaConf.create({'x': '${}'}); x == x
Fatal Python error: _Py_CheckRecursiveCall: Cannot recover from stack overflow.
Python runtime state: initialized

Current thread 0x00007f4935aa4740 (most recent call first):
  File "/home/jbss/miniconda3/envs/nn/lib/python3.9/abc.py", line 98 in __instancecheck__
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/base.py", line 230 in _get_root
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/base.py", line 344 in _resolve_key_and_root
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/base.py", line 562 in _resolve_node_interpolation
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/base.py", line 641 in node_interpolation_callback
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/grammar_visitor.py", line 157 in visitInterpolationNode
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/grammar/gen/OmegaConfGrammarParser.py", line 891 in accept
  File "/home/jbss/miniconda3/envs/nn/lib/python3.9/site-packages/antlr4/tree/Tree.py", line 34 in visit
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/grammar_visitor.py", line 134 in visitInterpolation
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/grammar/gen/OmegaConfGrammarParser.py", line 816 in accept
  File "/home/jbss/miniconda3/envs/nn/lib/python3.9/site-packages/antlr4/tree/Tree.py", line 34 in visit
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/grammar_visitor.py", line 102 in <listcomp>
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/grammar_visitor.py", line 102 in visitConfigValue
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/grammar/gen/OmegaConfGrammarParser.py", line 203 in accept
  File "/home/jbss/miniconda3/envs/nn/lib/python3.9/site-packages/antlr4/tree/Tree.py", line 34 in visit
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/base.py", line 673 in resolve_parse_tree
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/base.py", line 454 in _resolve_interpolation_from_parse_tree
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/base.py", line 219 in _dereference_node
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/basecontainer.py", line 637 in _item_eq
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/dictconfig.py", line 668 in _dict_conf_eq
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/dictconfig.py", line 562 in __eq__
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/dictconfig.py", line 566 in __ne__
... (repeated many times)
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/dictconfig.py", line 562 in __eq__
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/dictconfig.py", line 566 in __ne__
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/basecontainer.py", line 647 in _item_eq
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/dictconfig.py", line 668 in _dict_conf_eq
  File "/home/jbss/mkt_learn/py/omegaconf.git/omegaconf/dictconfig.py", line 562 in __eq__
  ...
Aborted (core dumped)

Duping the core and exiting the python process is a very serious error state. This makes me think that we should consider adding more cycle detection / prevention logic when such bugs appear, so that these become recoverable errors.

omry commented 3 years ago

I think Python segfaulting is strictly a bug in the Python interpreter. Trying to detect cycles is very complicated and is a losing game, there are too ways for cycles to manifest.

This bug specifically should be easy to fix, the interpolation ${} seems illegal to begin with so we can fail early on it.

iirc, #602 was preventing an infinite loop which is worse than a recursion error and was relatively easy to prevent. In general, I am not opposed to improvements like that but I feel that cycle detection only to issue an alternative error to Stack Overflow is not worth the investment.

omry commented 3 years ago

Edited the comment ^

Jasha10 commented 3 years ago

Ok, sounds good.

I'll close this issue and open another one r.e. failfast for the ${} interpolation.

omry commented 3 years ago

@odelalleau, happy to hear your thoughts on this.

odelalleau commented 3 years ago

@odelalleau, happy to hear your thoughts on this.

When working on #602 I initially wanted to make it more generic so as to catch more infinite recursion situations, however it proved to be quite tricky because there can be long chains of function calls causing recursions to the parent, and trying to keep track of the visited nodes along the way seemed too painful to be worth it.

But if someone can come up with a simple and efficient solution, then great, go ahead!

omry commented 3 years ago

Another difficulty is that when using dunder methods like __eq__ you can't change the function signature to pass additional state. This makes it impossible to pass state cleanly, and would require some kind of per object / per thread state.