Technologicat / pyan

Static call graph generator. The official Python 3 version. Development repo.
GNU General Public License v2.0

RuntimeError: dictionary changed size during iteration #18

Closed pmneve closed 3 years ago

pmneve commented 4 years ago

Getting this on a specific directory (not seeing it elsewhere). What do you need to dig into this?

Traceback (most recent call last):
  File "/Users/nevep/.pyenv/bin/versions/3.7.4/envs/egs-api/bin/pyan3", line 7, in <module>
    exec(compile(f.read(), __file__, 'exec'))
  File "/Users/nevep/bepress/pyan/pyan3", line 11, in <module>
    sys.exit(main())
  File "/Users/nevep/bepress/pyan/pyan/main.py", line 109, in main
    v = CallGraphVisitor(filenames, logger)
  File "/Users/nevep/bepress/pyan/pyan/analyzer.py", line 77, in __init__
    self.process()
  File "/Users/nevep/bepress/pyan/pyan/analyzer.py", line 87, in process
    self.postprocess()
  File "/Users/nevep/bepress/pyan/pyan/analyzer.py", line 154, in postprocess
    self.collapse_inner()
  File "/Users/nevep/bepress/pyan/pyan/analyzer.py", line 1562, in collapse_inner
    for name in self.nodes:
RuntimeError: dictionary changed size during iteration

zh2k3ang commented 4 years ago

I also encountered this error. It may be caused by line 1307 in get_node, which sometimes enlarges the dictionary.

pliablepixels commented 4 years ago

I have the same issue. Did you find a workaround?

Technologicat commented 4 years ago

Good catch. I agree with @zkkzhangkangkang's static analysis, get_node seems to be the only place called by collapse_inner that could cause this.

The reason then must be that the parent node being looked up by get_parent_node does not already exist - and no other nodes with the same short name exist, since a new entry is being added to self.nodes, changing the dictionary size. This shouldn't happen for the use case of collapse_inner, but apparently does. :)
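To illustrate the mechanism described above, here is a minimal sketch that is not taken from pyan itself. The module-level `nodes` dict, the string node names, and this simplified `get_node` are hypothetical stand-ins. It only shows how a get-or-create lookup called inside a loop over the same dictionary triggers the error.

```python
# Hypothetical stand-in for pyan's node table: short name -> list of nodes.
nodes = {"lambda": ["pkg.f.lambda"]}

def get_node(name):
    """Get-or-create lookup: adds a new key when the name is unknown."""
    if name not in nodes:
        nodes[name] = [name]  # this insertion changes the dictionary size
    return nodes[name][0]

error = None
try:
    for name in nodes:
        if name == "lambda":
            get_node("f")  # parent is missing, so a new key is created mid-loop
except RuntimeError as exc:
    error = str(exc)

print(error)
```

If the parent's short name were already a key, `get_node` would not insert anything and the loop would finish normally, which would explain why the error only shows up on some inputs.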

For testing, do any of you have a minimal working example, or a public project, that reproduces the bug? This would help me see why the parent node isn't getting defined.

As a fallback solution, it is possible to postpone the update of self.nodes until the loop in collapse_inner has completed, but that complicates the code, and makes the implementation hard to explain (so by ZoP §17, probably a bad idea). We would need something like this pattern:

def get_node(..., stash=None):
    ...
    # An empty stash dict is falsy, so test for None explicitly.
    target = stash if stash is not None else self.nodes
    if name in target:
        target[name].append(n)
    else:
        target[name] = [n]

    return n

...

def collapse_inner(...):
    stash = {}
    def unstash():
        nonlocal stash
        for name in stash:
            if name in self.nodes:
                self.nodes[name].extend(stash[name])
                # TODO: probably need to drop duplicates - list(set(...))?
            else:
                self.nodes[name] = stash[name]
        stash = {}
    handled = set()
    while True:
        for name in self.nodes:
            if name in ('lambda', 'listcomp', 'setcomp', 'dictcomp', 'genexpr'):
                for n in self.nodes[name]:
                    if n not in handled:
                        ...
                        # use the stash option here
                        handled.add(n)
        if not stash:
            break
        unstash()
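
The stash pattern above can be tried out in isolation. The following is only a sketch under loud assumptions: nodes are plain dotted-name strings, the parent is derived by stripping the last name component, and both functions are free-standing rather than methods of CallGraphVisitor. None of this reflects pyan's real data structures.

```python
def get_node(nodes, name, node, stash=None):
    """Register node under name, in stash if one is given, else in nodes."""
    # Test for None explicitly: an empty stash dict is falsy.
    target = stash if stash is not None else nodes
    target.setdefault(name, []).append(node)
    return node

def collapse_inner(nodes):
    """Visit inner-scope nodes; defer creation of missing parents to a stash."""
    inner = ("lambda", "listcomp", "setcomp", "dictcomp", "genexpr")
    handled = set()
    while True:
        stash = {}
        for name in nodes:
            if name not in inner:
                continue
            for n in nodes[name]:
                if n in handled:
                    continue
                parent = n.rsplit(".", 1)[0]       # hypothetical parent lookup
                short = parent.rsplit(".", 1)[-1]  # short name of the parent
                known = nodes.get(short, []) + stash.get(short, [])
                if parent not in known:
                    get_node(nodes, short, parent, stash=stash)
                handled.add(n)
        if not stash:
            break
        # Merge after the loop, so nodes is never resized during iteration.
        for name, new in stash.items():
            nodes.setdefault(name, []).extend(new)
    return nodes

nodes = collapse_inner({"lambda": ["pkg.f.lambda"]})
print(nodes)
```

The outer `while` loop repeats the pass until nothing new was stashed, so names that get merged in are still seen on a later pass.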

pliablepixels commented 4 years ago

@Technologicat, I was using a large DNN code base, but here is one minimal set that causes this. It seems a combination of __init__.py and this additional file causes the error.

Notes:
a) If I rename __init__.py to a.py, it works.
b) If I change the 2nd argument of f.py to not involve the for iteration, it works.
c) The example is nonsensical. I just made sure it runs.

example.zip

command and output:

find . -iname "*.py" | xargs pyan3  --colored --no-defines
Traceback (most recent call last):
  File "/Users/pp/anaconda3/envs/ml/bin/pyan3", line 11, in <module>
    sys.exit(main())
  File "/Users/pp/anaconda3/envs/ml/lib/python3.7/site-packages/pyan/main.py", line 109, in main
    v = CallGraphVisitor(filenames, logger)
  File "/Users/pp/anaconda3/envs/ml/lib/python3.7/site-packages/pyan/analyzer.py", line 77, in __init__
    self.process()
  File "/Users/pp/anaconda3/envs/ml/lib/python3.7/site-packages/pyan/analyzer.py", line 87, in process
    self.postprocess()
  File "/Users/pp/anaconda3/envs/ml/lib/python3.7/site-packages/pyan/analyzer.py", line 154, in postprocess
    self.collapse_inner()
  File "/Users/pp/anaconda3/envs/ml/lib/python3.7/site-packages/pyan/analyzer.py", line 1562, in collapse_inner
    for name in self.nodes:
RuntimeError: dictionary changed size during iteration

pliablepixels commented 4 years ago

Another, more legitimate example that is equally nonsensical, but is structured correctly. Same issue.

pyan.zip

Technologicat commented 4 years ago

@pliablepixels As you may have guessed, I'm busy right now, but I'll have a look as soon as I have time. Having an example that reproduces the bug should make this much easier. Thanks for the MWE!

theotheo commented 4 years ago

It looks like this PR fixes the issue: https://github.com/Technologicat/pyan/pull/24/files

Technologicat commented 3 years ago

Yes, it should work now.

(collapse_inner now iterates over list(self.nodes), not self.nodes.)
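
For readers unfamiliar with the idiom, here is a small sketch of why iterating over a snapshot of the keys avoids the error. The `nodes` dict and this `get_node` are hypothetical stand-ins, not pyan's actual code.

```python
nodes = {"lambda": ["pkg.f.lambda"]}

def get_node(name, node):
    """Get-or-create: may add a new key to nodes."""
    nodes.setdefault(name, []).append(node)
    return node

visited = []
# list(nodes) copies the keys up front, so inserting into nodes
# inside the loop no longer invalidates the iteration.
for name in list(nodes):
    visited.append(name)
    if name == "lambda":
        get_node("f", "pkg.f")

print(visited, sorted(nodes))
```

One consequence of this design is that keys added during the loop are not visited in the same pass, since the snapshot was taken before they existed.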