thebjorn / pydeps

Python Module Dependency graphs
https://pydeps.readthedocs.io/en/latest/
BSD 2-Clause "Simplified" License
1.8k stars, 114 forks

Different types of imports get disconnected. #224

Open luketych opened 5 months ago

luketych commented 5 months ago

I have a project that combines two different import methods:

from b import a

and import c

I am just making up this example, and will show more details if necessary.

I am using pydeps . --include-missing

Here is the generated graph.

[Image: Screenshot 2024-05-29 at 8 29 13 PM]

thebjorn commented 5 months ago

I'm not sure what you are asking...? Generally, you'll need to

  1. show what you are doing
  2. show what is happening
  3. describe what you expected to happen instead
  4. if you think it is a bug, you will also need to provide the smallest possible sample that I can run that demonstrates the problem (e.g. it can't contain source that I cannot access).

Items 1-3 let me know whether your understanding and usage are reasonable, and item 4 makes it possible for me to debug the issue and provide a solution.

luketych commented 5 months ago

It's a bit of a weird example, but I am trying to demonstrate that I can use pydeps to detect circular imports. I am expecting the generated diagram to show the circular dependency, where func1 and func2 would be caught in a circle attempting to import each other via the importer.

I am mixing direct imports (import importer) with dynamic imports made through the importer function, i.e.:

➜  src git:(develop) ✗ ls

__init__.py __pycache__ func1.py    func2.py    importer.py index.py 

➜  src git:(develop) ✗ cat importer.py

def importer(module, function):
    module = __import__(module)
    return module.__dict__[function]

➜  src git:(develop) ✗ cat index.py

import importer

func1 = importer.importer('func1', 'func1')

if __name__ == '__main__':
    func1()

➜  src git:(develop) ✗ cat func1.py

import importer

func2 = importer.importer('func2', 'func2')

def func1():
    print("func1 in func1.py")

    func2()

➜  src git:(develop) ✗ cat func2.py

import importer

func1 = importer.importer('func1', 'func1')

def func2():
    print("func2() in func2.py")

    func1()

thebjorn commented 5 months ago

If we look at your importer function:

def importer(module, function):
    module = __import__(module)
    return module.__dict__[function]

and display the corresponding byte code:

>>> import importer
>>> import dis
>>> dis.dis(importer.importer)
  2           0 LOAD_GLOBAL              0 (__import__)
              2 LOAD_FAST                0 (module)
              4 CALL_FUNCTION            1
              6 STORE_FAST               0 (module)

  3           8 LOAD_FAST                0 (module)
             10 LOAD_ATTR                1 (__dict__)
             12 LOAD_FAST                1 (function)
             14 BINARY_SUBSCR
             16 RETURN_VALUE
>>>

module = __import__(module) corresponds to the instructions at offsets 0-6, and return module.__dict__[function] corresponds to the instructions at offsets 8-16. There are no concrete module or function names present, just the parameter names of the importer function.

pydeps doesn't run your program, so it has no idea what the function parameters will be bound to at runtime. This kind of dynamic import makes it practically impossible for a static analyzer to figure out what the actual import tree looks like.
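To make this concrete, here is a small sketch (for illustration only, not part of pydeps) that inspects the compiled importer function and then calls it twice. The choice of math and json as target modules is arbitrary. The names stored in the code object never mention the modules that end up being imported; those only exist as argument values at call time.

```python
def importer(module, function):
    module = __import__(module)
    return module.__dict__[function]

code = importer.__code__
# Names the bytecode refers to: only __import__ and __dict__,
# plus the parameter names. No target module name appears anywhere.
print(code.co_names)
print(code.co_varnames)

# The same bytecode imports different modules depending on the arguments.
sqrt = importer("math", "sqrt")
dumps = importer("json", "dumps")
print(sqrt(9.0))      # 3.0
print(dumps([1, 2]))  # [1, 2]
```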

Yes, it would be theoretically possible, e.g. using dataflow analysis and whole-program analysis, but pydeps (and the Python modulefinder module) use a very simple (but fast) technique for finding imports.

If you look at the bytecode for a "normal" import:

>>> def foo():
...     import math
...
>>> dis.dis(foo)
  2           0 LOAD_CONST               1 (0)
              2 LOAD_CONST               0 (None)
              4 IMPORT_NAME              0 (math)
              6 STORE_FAST               0 (math)
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE
>>>

You'll see the IMPORT_NAME instruction and its argument, math. When pydeps sees this, it knows that the current module imports a module named math.
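As a rough illustration of that technique (a sketch, not the actual pydeps or modulefinder code), the following compiles a source string and collects the argument of every IMPORT_NAME instruction, including those in nested functions. The helper name static_imports is made up for this example, and the two source strings are taken from the sample files above.

```python
import dis
import types

def static_imports(source):
    """Return the arguments of all IMPORT_NAME instructions in the source."""
    pending = [compile(source, "<sketch>", "exec")]  # compiles only, runs nothing
    found = []
    while pending:
        code = pending.pop()
        for instr in dis.get_instructions(code):
            if instr.opname == "IMPORT_NAME":
                found.append(instr.argval)
        # Function bodies are separate code objects stored in co_consts.
        pending.extend(c for c in code.co_consts if isinstance(c, types.CodeType))
    return found

FUNC1_SOURCE = """
import importer
func2 = importer.importer('func2', 'func2')
"""

IMPORTER_SOURCE = """
def importer(module, function):
    module = __import__(module)
    return module.__dict__[function]
"""

print(static_imports(FUNC1_SOURCE))     # ['importer']
print(static_imports(IMPORTER_SOURCE))  # []
```

The scan finds the import of importer in func1.py, but the string 'func2' is just a call argument, and importer.py itself contains no IMPORT_NAME instruction at all.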

Unfortunately, but by design, pydeps cannot follow dynamic imports.
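For comparison, here is a sketch of what a bytecode scan could see if the sample used plain import statements instead of the importer helper. The rewritten module sources below are hypothetical (they are not the poster's code), and import_names and find_cycle are made-up helpers, not pydeps functions. With static imports, every edge shows up as an IMPORT_NAME argument, so the func1/func2 cycle can be found without running anything.

```python
import dis
import types

# Hypothetical static rewrite of the sample modules (an assumption for this sketch).
SOURCES = {
    "index": "import func1\n",
    "func1": "import func2\n\ndef func1():\n    func2.func2()\n",
    "func2": "import func1\n\ndef func2():\n    func1.func1()\n",
}

def import_names(source):
    """Collect IMPORT_NAME arguments from a module's code, including nested code."""
    pending = [compile(source, "<sketch>", "exec")]
    names = set()
    while pending:
        code = pending.pop()
        names.update(i.argval for i in dis.get_instructions(code)
                     if i.opname == "IMPORT_NAME")
        pending.extend(c for c in code.co_consts if isinstance(c, types.CodeType))
    return names

# Keep only edges that point at modules inside this small project.
graph = {name: import_names(src) & set(SOURCES) for name, src in SOURCES.items()}

def find_cycle(graph, start):
    """Depth-first search returning one import cycle reachable from start, or None."""
    def visit(node, path):
        if node in path:
            return path[path.index(node):] + [node]
        for nxt in sorted(graph[node]):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        return None
    return visit(start, [])

print(find_cycle(graph, "index"))  # ['func1', 'func2', 'func1']
```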