pytorch / pytorch


[dynamo] Dynamo does not support infinite iterators (e.g., `itertools.count()`). #133879

Open XuehaiPan opened 3 weeks ago

XuehaiPan commented 3 weeks ago

Dynamo's zip does not support infinite iterators (e.g., itertools.count()).

Dynamo always realizes an iterable into a list of items up front, which leads to an infinite loop when the iterable is infinite. Fetching items from an iterator may also have side effects, so we should not realize the whole iterator into a sequence at once.
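
To illustrate the side-effect concern in plain Python (this is not Dynamo code), here is a minimal sketch. The `noisy_count` generator and the `fetched` log are hypothetical names made up for this example; the point is only that a lazy consumer pulls exactly as many items as it needs.

```python
import itertools

fetched = []  # hypothetical log: records every item actually pulled


def noisy_count():
    """Infinite generator whose every step has a visible side effect."""
    for n in itertools.count():
        fetched.append(n)  # side effect happens at fetch time
        yield n


# Lazy consumption: only three items are ever requested from the generator.
first_three = list(itertools.islice(noisy_count(), 3))

print(first_three)
print(fetched)
```

Eagerly draining `noisy_count()` into a list would never return, and every extra fetch would also extend `fetched`.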

_Originally posted by @XuehaiPan in https://github.com/pytorch/pytorch/pull/133876#discussion_r1722156380_

cc @ezyang @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @amjames

XuehaiPan commented 3 weeks ago

Affected PRs:

I found that we inline functions without checking whether they return infinite generators:

   BaseUserFunctionVariable.call_function
-> InstructionTranslatorBase.inline_user_function_return
-> InliningInstructionTranslator.inline_call
-> InliningInstructionTranslator.inline_call_
-> tracer.run(); ListIteratorVariable(tracer.generated_items, mutable_local=MutableLocal())

where we fill tracer.generated_items with the items from the iterator and build a ListIteratorVariable.

https://github.com/pytorch/pytorch/blob/66d6d8b1b976e01883f832d5825b1450366cfb9f/torch/_dynamo/symbolic_convert.py#L3089-L3095

In tracer.run(), we loop over InliningGeneratorInstructionTranslator.YIELD_FROM and push each item into generated_items:

https://github.com/pytorch/pytorch/blob/432638f52177cb6ebf9bc126575d2f2206d0c970/torch/_dynamo/symbolic_convert.py#L3291-L3322

This results in an endless loop when the generator is infinite.
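
As a rough plain-Python analogy (not the actual Dynamo implementation), the sketch below drains an iterable into a list the way an eager "collect everything" strategy would. The helper `realize_all` and its `limit` guard are hypothetical and exist only so the example terminates.

```python
import itertools


def realize_all(iterable, limit=1000):
    """Toy stand-in for eager realization: drain `iterable` into a list.

    The `limit` guard is an assumption added for this sketch; without it,
    the loop below would never finish for an infinite iterable.
    """
    items = []
    for item in iterable:
        if len(items) >= limit:
            raise RuntimeError(f"gave up after {limit} items: iterable may be infinite")
        items.append(item)
    return items


finite = realize_all(range(5))  # a finite iterable drains normally

try:
    realize_all(itertools.count())  # never exhausts on its own
    outcome = "finished"
except RuntimeError:
    outcome = "gave up"

print(finite, outcome)
```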

XuehaiPan commented 3 weeks ago

Infinite iterators, such as the following, are not inline-able:

itertools.count()
itertools.repeat(obj, None)

However, iterating over them does not cause an infinite loop in eager mode:

for n in itertools.count():
    # do something
    if condition:
        break

for i, j in zip(range(256), itertools.repeat(obj, None)):
    # do something

We need a way to delay the inline process:

  1. list(itertools.count()) is not inline-able: infinite elements.
  2. list(zip(range(10), itertools.count())) is inline-able: 10 constant elements.
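
The two cases can be checked in eager Python. This is only a sketch of the expected eager behavior, not of how Dynamo would implement the delay; the variable names are made up for the example.

```python
import itertools

# Bounded: zip stops as soon as range(10) is exhausted, so list() terminates
# even though itertools.count() is infinite.
pairs = list(zip(range(10), itertools.count()))

# Unbounded: list(itertools.count()) would never return, so take an explicit
# finite prefix instead.
prefix = list(itertools.islice(itertools.count(), 10))

print(len(pairs), prefix)
```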

ezyang commented 3 weeks ago

If we really care about this, the right thing is to just properly support generators

XuehaiPan commented 3 weeks ago

An iterator behaves like a generator that is always sent None: next(gen) is equivalent to gen.send(None).
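
A small sketch of that equivalence in plain Python, using a hypothetical `squares` generator: advancing with `next()` and advancing with `send(None)` produce the same values.

```python
def squares():
    """Infinite generator of square numbers (hypothetical example)."""
    n = 0
    while True:
        yield n * n
        n += 1


via_next = squares()
via_send = squares()

# Advance one generator with next() and the other with send(None).
a = [next(via_next) for _ in range(4)]
b = [via_send.send(None) for _ in range(4)]

print(a, b)
```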

I think the most viable solution is to support callable iterators.

it = iter(callable, sentinel)
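
For reference, a minimal sketch of how the two-argument form of `iter` behaves in eager Python. The names `counter`, `bounded`, `never`, and `endless` are made up for this example.

```python
import itertools

counter = itertools.count()

# iter(callable, sentinel) calls `callable()` on every step and stops as soon
# as the returned value equals `sentinel`.
bounded = iter(lambda: next(counter), 3)
values = list(bounded)  # pulls 0, 1, 2, then sees 3 and stops

# A sentinel that can never be returned gives an infinite iterator, which is
# still safe to consume lazily.
never = object()
endless = iter(lambda: 7, never)
head = list(itertools.islice(endless, 4))

print(values, head)
```

Because each item is produced by one call to the callable, a tracer could in principle fetch items one at a time instead of realizing the whole sequence.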

See also: