thebjorn / pydeps

Python Module Dependency graphs
BSD 2-Clause "Simplified" License

Deterministic output #56

Closed pawamoy closed 1 year ago

pawamoy commented 4 years ago

The generated file changes each time I run pydeps again. Would it be possible to make the resulting SVG the same each time, as long as the source is the same as well?

thebjorn commented 4 years ago

I can't reproduce that behavior. Which version of pydeps? Do you have a small testcase?

pawamoy commented 4 years ago

pydeps v1.9.0!

I just ran this in pydeps' own repository:

for i in 1 2 3 4 5; do
  pydeps pydeps --noshow --only pydeps -o pydeps.$i.svg
done

Here are the resulting SVG thumbnails:

[Screenshot: thumbnails of the five generated SVGs]

As you can see they all slightly differ from each other :slightly_smiling_face:

I found some information in dot's manual page, though I'm not sure which applies:

(neato‐specific attributes) ... start=val. Requests random initial placement and seeds the random number generator. If val is not an integer, the process ID or current time is used as the seed.

(fdp‐specific attributes) ... start=val. Adjusts the random initial placement of nodes with no specified position. If val is an integer, it is used as the seed for the random number generator. If val is not an integer, a random system‐generated integer, such as the process ID or current time, is used as the seed.

If one of those start values applies, maybe we could get deterministic output by using the same seed each time?
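
The comparison above was done by eye from thumbnails. A minimal sketch for checking byte-for-byte equality is shown here, assuming the same pydeps flags as in the loop earlier in this comment. The helper names `generate`, `distinct_digests` and `is_deterministic` are hypothetical and are not part of pydeps.

```python
import hashlib
import subprocess
from pathlib import Path


def generate(run, target="pydeps"):
    """Run pydeps once with the flags from the shell loop (hypothetical helper)."""
    out = Path("pydeps.%d.svg" % run)
    subprocess.run(
        ["pydeps", target, "--noshow", "--only", target, "-o", str(out)],
        check=True,  # raise if pydeps exits with a non-zero status
    )
    return out


def distinct_digests(paths):
    """Return the set of SHA-256 digests of the given files."""
    return {hashlib.sha256(Path(p).read_bytes()).hexdigest() for p in paths}


def is_deterministic(paths):
    """True when every file in `paths` has exactly the same bytes."""
    return len(distinct_digests(paths)) == 1
```

With pydeps and Graphviz installed, `is_deterministic([generate(i) for i in range(1, 6)])` would reproduce the five-run experiment and report whether all outputs match.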

thebjorn commented 4 years ago

Interesting, they're all exactly the same on Windows (and on WSL).

I used this, which should be similar:

python -c "import os;[os.system('pydeps pydeps --noshow --only pydeps -o pydeps.{}.svg'.format(i)) for i in range(5)]"

I'm afraid neither the neato nor the fdp layout engines are relevant since pydeps is using the dot layout engine, but based on this tweet: https://twitter.com/Graphviz/status/1039632469782396929 a solution might be to change this line in the dot function in pydeps/pydeps/dot.py (L60), from:

cmd = "dot -T%s" % kw.pop('T', 'svg')

to

cmd = "dot -Gstart=1 -T%s" % kw.pop('T', 'svg')

Since I can't reproduce the problem, I'll need someone else to test it (and perhaps create a PR?).
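
For anyone wanting to try the change without editing the line by hand each time, the proposal could be sketched as a small helper. This is only an illustration: `build_dot_cmd` and its `seed` parameter are hypothetical names, not pydeps' actual code, and replies further down the thread report that `-Gstart` did not make the output stable.

```python
def build_dot_cmd(seed=None, **kw):
    """Build the dot command line, optionally passing -Gstart=<seed>.

    `seed` is a hypothetical parameter added for illustration; `T` selects
    the output format and defaults to 'svg', as in the quoted line.
    """
    parts = ["dot"]
    if seed is not None:
        # -Gname=value sets a graph attribute from the command line.
        parts.append("-Gstart=%d" % seed)
    parts.append("-T%s" % kw.pop("T", "svg"))
    return " ".join(parts)
```

Calling `build_dot_cmd()` gives the original command, while `build_dot_cmd(seed=1)` gives the modified one.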

pawamoy commented 4 years ago

I tried this option but it didn't work. I also tried -Gcenter=1, different layouts, and combinations of options, but no luck: the resulting SVG is never quite the same.

My graphviz version is dot - graphviz version 2.44.0 (0), and I'm on ArchLinux.

kinow commented 4 years ago

Never noticed the output was not deterministic. Tried the one-liner above on a small (3 files) project from $work, and while the layout was the same, in a couple of them the oval shapes were misplaced.

Tried on Tornado, and the 5 images were all different from each other. Tried the -Gstart change (modified the dot.py from my venv/lib/python/site-packages/...). First it failed to plot because of my typo (forgot the dash before Gstart). Then when I ran it a second time it was still giving different layouts.

My env is Ubuntu LTS, Python Anaconda 3.7, dot - graphviz version 2.40.1 (20161225.0304).

thebjorn commented 4 years ago

Hmm.. looks like this is a Python 3.x issue. I cannot reproduce with Py2.7 on any platform or graphviz configuration. Reproduced with Py3.6 + graphviz 2.40.1 on Ubuntu 18.04.4; and Py3.5 + graphviz 2.38.0 on Win10.

thebjorn commented 4 years ago

Hmm.. the rule ordering in the dot file is also non-deterministic on Py3. We might get away with simply sorting the rules...?

pawamoy commented 4 years ago

I'd be happy to try that out but I don't understand what you mean. By "rules" you mean the text input you feed to the dot command?

thebjorn commented 4 years ago

@pawamoy I'm mostly just thinking out loud - being able to reproduce the issue makes everything easier ;-) See the latest checkin (https://github.com/thebjorn/pydeps/commit/58bfcd782a2be11885d944cd322336ff6ec2120d) for what I'm talking about. I'm pretty sure that if the dot source I generate is deterministic, then the resulting graphs will be as well. With the latest change we're almost there (some runs add a weight=5 to some relations, which causes certain nodes to shift).
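
To illustrate the idea of sorting the rules: one plausible cause, which is an assumption and not confirmed in this thread, is that string hash randomization (on by default since Python 3.3, off by default in 2.7) changes the iteration order of sets of strings from run to run. A minimal sketch of emitting dot source in sorted order follows. `render_dot` is a hypothetical function, not the code from the linked commit.

```python
def render_dot(edges, name="G"):
    """Render a digraph whose node and edge lines are emitted in sorted order.

    `edges` is any iterable of (source, target) string pairs. Sorting makes
    the text independent of the iteration order of the input container.
    """
    edges = list(edges)
    nodes = sorted({n for edge in edges for n in edge})
    lines = ["digraph %s {" % name]
    lines += ['    "%s";' % n for n in nodes]
    lines += ['    "%s" -> "%s";' % (a, b) for a, b in sorted(edges)]
    lines.append("}")
    return "\n".join(lines)
```

Because both the node list and the edge list are sorted before printing, passing the same edges as a set, or as a list in any order, yields identical dot source.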

pawamoy commented 4 years ago

Nice! It's working well, I got the same result for over 100 runs! This is great, thanks :slightly_smiling_face:

thebjorn commented 4 years ago

I've just released v1.9.1, which fixes this. Thanks for all the help.

greenled commented 2 years ago

@thebjorn somehow I'm getting this issue. Nodes come out in a random order, and sometimes the number of nodes is different. I'm using pydeps v1.10.22, Python 3.10.6, Graphviz 2.43.0 and Ubuntu 20.04.

thebjorn commented 2 years ago

Hi @greenled, I can't reproduce this with py3.10.2, pydeps 1.10.22 and graphviz 2.50.0 on Windows 11, running pydeps on the pydeps source. Do you have a testcase?