PyCQA / isort

A Python utility / library to sort imports.
https://pycqa.github.io/isort/
MIT License
6.45k stars 574 forks source link

isort can break code #2106

Open mic-e opened 1 year ago

mic-e commented 1 year ago

One of my projects broke in a subtle manner after I ran isort. Minimal example:

from asyncio import subprocess
import subprocess
print(subprocess.check_call)

Another example:

import sys as a
import os as a
print(a.listdir())

Tested with both isort 4.3.4 and isort 5.12.0.

This is an issue which could be easily detected. isort could then print a warning, exit with an error code or refuse to touch the file.

Of course, the given examples do not make sense and are flagged by pylint, but still are valid Python code which runs without errors - at least until you run them through isort.

asaf-kali commented 1 year ago

Just faced the same issue recently, with:

from datetime import time
import time

After isort the import order is reversed, and time is now from datetime. Of course the code did not make sense in the first place, but when working on large, old, not fully-covered repositories, this can cause issues.

Detecting those situations and printing a warning about it would be a great stability improvement.

yozachar commented 1 year ago

Correct me if I'm wrong, isort isn't interested in namespace as pylint, flake or pyright and others are. All it intends to do is sort imports. It does not check if you're shadowing names. Even though the latter might be useful in very particular cases, it's generally discouraged. You can use the as keyword to rename the imports instead.

from asyncio import subprocess as async_sp
import subprocess

async def example():
    await async_sp.create_subprocess_shell("")

subprocess.run("")

isort is working as designed, but you've a choice to improve code.

asaf-kali commented 1 year ago

Writing better code is not the point. The point is that given a certain state of working code, running isort can break it, without you knowing about it - so it is in the scope of isort. Let me explain, given the code:

from datetime import time
import time
now = time.time()

This is valid python code. It may not be very good (because of shadow import), but it works. After running isort, you get:

import time
from datetime import time
now = time.time()

Which raises an error (AttributeError: type object 'datetime.time' has no attribute 'time'). In a repo with thousands of files, this can cause serious issues (assuming you are not at 100% coverage), making it dangarous to start using isort on production-level code.

mic-e commented 1 year ago

Some background information:

of course, the code that was broken by isort in my case was badly-written (due to the unused shadowed import), should have and has been fixed.

The error was originally introduced by VSCode which automatically added the useless import unasked and without giving notification. The extra import slipped through a code review (an import of something asyncio-related is not unusual in the codebase), and the code continued working for many months.

Then the codebase was formatted with black and isort. The re-ordering if imports introduced an error into a rarely-used error-handling branch which was sadly not covered by unit tests.

By sheer luck it was discovered in the last stages of integration testing.

Of course I understand that re-ordering of imports can always change the behavior of code, especially in the presence of badly-written modules, and testing this is equivalent to the halting problem.

But a check for reordering of shadowing imports would be fairly simple to do for isort and would catch >99% of the cases where it might break working code.

yourcelf commented 9 months ago

The problem isn't limited to accidentally importing the same identifier twice. I ran into an issue with something much more innocuous looking with datetime.datetime:

This works:

from datetime.timezone import utc
from datetime import datetime

but is rearranged to

from datetime import datetime
from datetime.timezone import utc

which fails with ModuleNotFoundError: No module named 'datetime.timezone'; 'datetime' is not a package. The issue here is that the datetime class identifier supercedes the datetime module identifier when doing a subsequent dotted import.

Confusingly, this succeeds -- it seems the extra dotted lookup changes how python dereferences the module:

from datetime import datetime
from datetime import timezone

At least this example fails fast at module import time, rather than having subtly changed behavior once it's running.