thebjorn / pydeps

Python Module Dependency graphs
BSD 2-Clause "Simplified" License
1.73k stars 111 forks source link

The cycles display mode shows useless information and does not work properly for larger projects. #197

Open ArturLew opened 1 year ago

ArturLew commented 1 year ago

Hi there!

The project looks promising and I wanted to use it to find a cyclic loop when importing modules in my project. Unfortunately it only works for trivial cases :(

For more complex cases (like mine) the result generated from the application is purely random - not only was the application unable to correctly locate the source of the problem for me, but it generates a different dependency graph each time.

I tested on: Python 3.11.2 and Python 3.11.5 Platform: Windows 11

Used command line:

pydeps --show-cycles --no-config app

For three calls - tree different outputs: app1 app3 app4

thebjorn commented 1 year ago

Hi @ArturLew and thank you for your interest in pydeps.

The problem you're seeing might be because the graph is pruned too much before cycle analysis is run..? Could you try adding --max-bacon=5 to the command line?

This problem doesn't show up on the examples I have in the testsuite, so I'd likely need something to test against to investigate it further...

yusiyoh commented 10 months ago

Similar issue here. It generates different diagrams each run for --show-cycles. Unfortunately, I cannot share the diagrams.

thebjorn commented 10 months ago

See my previous comment. Without being able to reproduce the problem it is unlikely that I'll be able to fix it.

hartwork commented 4 months ago

The problem you're seeing might be because the graph is pruned too much before cycle analysis is run..? Could you try adding --max-bacon=5 to the command line?

@thebjorn thanks for the tip on --max-bacon!

See my previous comment. Without being able to reproduce the problem it is unlikely that I'll be able to fix it.

I would like to confirm…

I have yet to find a value for --max-bacon that does not produce any non-circle islands with --show-cycles, even 30 still includes islands.

PS: With regard to the non-determinism, my guess would be that GraphViz input is changing and that adding sorted(..) at a few places may fix the issue. I had the same with https://github.com/git-big-picture/git-big-picture/pull/398 .

wnpp_debian_net__bacon_5

thebjorn commented 4 months ago

Hi @hartwork and thank you for the test case. Very useful :-)

Adding sorted in a few places improves things, at least to the point where I'm convinced the find_import_cycles method (in depgraph.py) is not correct. It looks like it is doing a DFS and looking for back-edges (it's been nine years since I wrote this code...), and then adding any found cycles to self.cyclerelations. I'm not sure where the bug is yet though.

hartwork commented 4 months ago

@thebjorn thanks for your positive response! I'm a bit too swamped to have a closer look at the code myself right now, but great to know that there is interest in getting this improved, would be great :+1:

emilienDespres commented 1 month ago

Hello, first of all thanks for the project, I find it quite useful.

If I understand correctly the following looks a bit excessive in find_import_cycles

if node.name in self.cyclenodes:
                return

Let's say we have two cycle A<->B and B<->C:

Starting traverse on A, we detect A<->B,

Then starting traverse on C (or B) will stop on B without detecting the B<->C as B is already marked, although for another cycle.

Maybe it's only part of the problem as it should only affect intersecting cycles, so I don't see how it would create island.

Not sure if I will be able to make a fix. I'll try.

thebjorn commented 1 month ago

Hi @emilienDespres , I think I got to the bibliography from this paper "On Algorithms for Enumerating All Circuits of a Graph" (https://epubs.siam.org/doi/10.1137/0205007). As far as I remember the algorithm by Donald Johnson (https://www.cs.tufts.edu/comp/150GA/homeworks/hw1/Johnson%2075.PDF) was the most efficient generally, although I don't know if the import graph is a special case where an other (simpler?) algorithm would suffice.

I'm in the middle of a huge project so I won't have any time to implement this for a while, but I'm always happy to merge a PR.