deshaw / pyflyby

A set of productivity tools for Python
https://deshaw.github.io/pyflyby/
Other
354 stars 51 forks source link

Tidy local imports (PyInf#11558) #273

Open ArvidJB opened 11 months ago

ArvidJB commented 11 months ago

Currently tidy-imports only optimizes global imports. How difficult would it be to also optimize "local" imports, i.e., imports within a function? Example:

$ cat foo.py
import os

def foo():
    import os
    import sys
    return os.getenv('FOO')
$ tidy-imports foo.py
[PYFLYBY] /tmp/foo.py: removed unused 'import os'
[PYFLYBY] /tmp/foo.py: added mandatory 'from __future__ import annotations'
--- /tmp/foo.py   2023-11-21 16:37:32.473138334 -0500
+++ /tmp/tmpgciyp7p6    2023-11-21 16:38:48.351697332 -0500
@@ -1,4 +1,5 @@
-import os
+from __future__ import annotations
+

 def foo():
     import os

Replace /tmp/foo.py? [y/N]

Here we should

  1. detect that import sys is unused in foo and suggest to remove it
  2. detect that import os is present both at global as well as local scope. a. I think here it's correct to remove the global import since the only usage is in a function where os is imported locally. b. But in general if there's a global import not covered by local imports everywhere we should remove the local import.

Can you estimate how much work this would be?

Carreau commented 3 months ago

I've been asked to come back to this.

My feeling with the architecture of pyflyby is that it's going to be complex, and just estimating the work would need significant work to figure out what is needed.

I'll try to look into in over the next few weeks to see what I can do, and try to come up with a plan/estimate.

ArvidJB commented 2 months ago

Hi @Carreau, do you have an estimate how much work this would be?

Carreau commented 2 months ago

I haven't been able to look into it, I'm trying to get the pyinstruments fixes first. I'll put that higher on the priority list.

Carreau commented 2 weeks ago

I'm looking into this.

detect that import os is present both at global as well as local scope. I think here it's correct to remove the global import since the only usage is in a function where os is imported locally.

I believe this is already the case; at least on some example I tried it works.

import os # will be removed; 

def foo():
    import os
    import sys

    return os.getenv("FOO")

I'll edit the original request to split request 2, into 2a and 2b.

Carreau commented 2 weeks ago

More on this also for myself once I resume the work later on.

import os # will be removed; 

def foo():
    import os
    import sys

    return os.getenv("FOO")

Taking the above example, detecting that sys is unused in foo should not be too complicated, it seem we need to patch _NewScopeCtx context manager to find all the created unused imports when exiting; I have hardcoded modification that does the job;

With the current hardcoded "Assume sys is unused"; Pyflyby now detects it (at least in debug logs) but warns that import sys is not a global import and thus cannot be removed – so this was foreseen some time ago; This appear to be due to the fact that the source-to-source transformer is not made to remove non-top-level line; I'll look into how to do that.

Removing the local import os if the global is used is though on another level of difficulty; as for a number of other issues an improvement requests there is a big limitation due to pyflyby architecture where the modifications are computed while the ast is walked; by the time we detect that the global import os is used; we already have lost any reference to the local one. This may require more heavy refactor.

I'm also a bit concerned if a local import is made to shadow a variable how this will affects the result.

import os # will be removed; 

def foo():
    os = 1
    import os # we risk marking local os import as unnecessary; but actually it shadows os = 1

    return os.getenv("FOO")

os.stat

I haven't thought on how to take care of this.

Carreau commented 4 days ago

Modifying non-local imports

I've looked into modifying non-global import more in depth.

Pyflyby SourceToSource transformer has only been designed to only handle top-level refactoring; And this is a deep assumption in pyflyby – the finest level of granularity is PythonBlock which is top-level only.

I think that updating pyflyby to allow non-top-level source updates is quite complicated; likely a multi-month project; and more of a complete rewrite than just an update.

There exists a few more recent frameworks that might allow what you wish;

In the options I don't really like;

Marking unnecessary local imports

As we talked about last Wednesday, with pyflyby architecture it's hard to prioritize local imports as we don't keep references to the ast while we are walking it; and so in the case:

import sys 

def f():
    import sys # hard to know we can remove as top-level is not yet mark as used.
    print(sys.version_info)

print(sys.argv)

Right now the importChecker keep a set of used/unused imports; I think I might be able to change that to a counter (and store a definite usage counter (sys.argv), and potential usage counter (global import sys would have a +1 potential from sys.version_info); and decide later on which import(s) are unnecessary.

Still keeping the caveat that right now it's unclear if we would be able to automatically remove the non-local imports, or which implementation choice we'd like.

I'll anyway cleanup the updates I have locally, and send a PR that will be a base for the detection of unused local imports.

Carreau commented 4 days ago

Removing non top-level import would potentially have another side-effect;

pyflyby currently make the first import in the following as unused:

if sys.version_info >= (3, 10):
    from types import NoneType, EllipsisType
else:
    NoneType = type(None)
    EllipsisType = type(Ellipsis)

As it does not understand control structures; and currently rely on the fact that it's not top-level to not remove it.

I'll have to adapt pyflyby to better track the context in which an import is marked as unused.