Different dependency name with `import io` and `from io import StringIO`

and3rson commented 1 year ago

Describe the bug

It seems that the dependency name is determined differently depending on the import style used.

Example project:

# foobar/services/stuff1.py
import io
io.foo()

# foobar/services/stuff2.py
from io import StringIO
StringIO.bar()

This yields four nodes with the following dependencies (using graphviz syntax for demo purposes here):

digraph {
  stuff1 -> io
  stuff2 -> foobar/io
}

Logs:

2022-10-09 20:02:01     parser D ⏩ generating file results...
2022-10-09 20:02:01     parser D ⏩ extracting imports from file result stuff2.py...
2022-10-09 20:02:01     parser D ⏩ adding import: foobar/io
2022-10-09 20:02:01     parser D ⏩ generating file results...
2022-10-09 20:02:01     parser D ⏩ extracting imports from file result stuff1.py...
2022-10-09 20:02:01     parser D ⏩ adding import: io

I did some introspection on pyparser.py and it seems that:

The `from io import StringIO` form sets global_import to True (due to the from keyword being used)
The `import io` form sets global_import to False

This leads to inconsistent final names because the condition on line 260 requires global_import to be False.

Describe your environment

Are you using the tool on macOS or linux? - Linux
Which Python version are you using? - Python 3.9.13
Which browser (with version) are you using? - N/A

To Reproduce

Steps to reproduce the behavior:

Create project "foobar"
Import module xxx in two different files using the `import xxx` and `from xxx import yyy` forms
Run the tool
See xxx and foobar/xxx listed as two separate dependencies

Expected behavior

In both cases, only xxx should be shown.
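
To illustrate the expected result, here is a minimal sketch that resolves both import styles to the same module name. It uses Python's standard `ast` module, which is an assumption made for the example and not necessarily how pyparser.py extracts imports; the helper name `imported_modules` is hypothetical.

```python
import ast


def imported_modules(source):
    """Return the absolute module names imported by a source snippet."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import io` or `import os.path`: one alias per imported module
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # `from io import StringIO`: the module is `io`;
            # level > 0 marks a relative import, which is skipped here
            names.add(node.module)
    return names


print(imported_modules("import io"))                # {'io'}
print(imported_modules("from io import StringIO"))  # {'io'}
```

Both forms yield `io`, which is the single dependency name one would expect to see in the graph.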

Screenshots

glato commented 1 year ago

@and3rson Thank you for the detailed and very helpful description and steps to reproduce 👍. Will have a look at this issue in the following days and hopefully provide a fix.

glato commented 1 year ago

@and3rson The issue you found is pretty interesting in that it can perhaps be reduced to the problem of deciding, regardless of the import syntax, whether you're dealing with an import from Python's own libraries. My approach/fix does exactly this: it checks whether the dependency can be found in sys.modules and, if so, sets global_import to True. This seems to solve the reproducible example you've described above (see my screenshot). I also tried it on a bigger codebase (Django) where, as one would expect, it reduces the number of nodes and edges.

Your feedback would be great, you can find the quick fix in PR #30.
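
For readers following along, the sys.modules check described above could look roughly like the sketch below. This is not the code from the PR: the function names, the top-level-package lookup, and the `project_prefix` naming step are all hypothetical and only meant to show the idea.

```python
import sys


def is_loaded_module(dependency):
    """Heuristic: True if the dependency's top-level package is in sys.modules."""
    # Assumption: only the first dotted component is looked up.
    top_level = dependency.split(".")[0]
    return top_level in sys.modules


def dependency_name(dependency, project_prefix):
    # Hypothetical naming step: known modules stay unprefixed,
    # everything else is treated as project-local.
    if is_loaded_module(dependency):
        return dependency
    return f"{project_prefix}/{dependency}"


print(dependency_name("io", "foobar"))  # io
```

One caveat with this kind of check: sys.modules only lists modules that the running interpreter has already imported, so a standard library module that has not been loaded yet would not be recognized. On Python 3.10 and later, `sys.stdlib_module_names` may be a more complete source, but it is not available on the Python 3.9 reported in this issue.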

glato / emerge

Different dependency name with `import io` and `from io import StringIO` #29