Open ArvidJB opened 11 months ago
I've been asked to come back to this.
My feeling with the architecture of pyflyby is that it's going to be complex, and just estimating the work would need significant work to figure out what is needed.
I'll try to look into in over the next few weeks to see what I can do, and try to come up with a plan/estimate.
Hi @Carreau, do you have an estimate how much work this would be?
I haven't been able to look into it, I'm trying to get the pyinstruments fixes first. I'll put that higher on the priority list.
I'm looking into this.
detect that import os is present both at global as well as local scope. I think here it's correct to remove the global import since the only usage is in a function where os is imported locally.
I believe this is already the case; at least on some example I tried it works.
import os # will be removed;
def foo():
import os
import sys
return os.getenv("FOO")
I'll edit the original request to split request 2, into 2a and 2b.
More on this also for myself once I resume the work later on.
import os # will be removed;
def foo():
import os
import sys
return os.getenv("FOO")
Taking the above example, detecting that sys
is unused in foo should not be too complicated,
it seem we need to patch _NewScopeCtx
context manager to find all the created unused imports when exiting; I have hardcoded modification that does the job;
_NewScopeCtx
is used; this should be a few more hours 5-8h, so likely done by next week.With the current hardcoded "Assume sys is unused"; Pyflyby now detects it (at least in debug logs) but warns that import sys
is not a global import and thus cannot be removed – so this was foreseen some time ago; This appear to be due to the fact that the source-to-source transformer is not made to remove non-top-level line; I'll look into how to do that.
Removing the local import os
if the global is used is though on another level of difficulty;
as for a number of other issues an improvement requests there is a big limitation due to pyflyby architecture where the modifications are computed while the ast is walked; by the time we detect that the global import os
is used; we already have lost any reference to the local one. This may require more heavy refactor.
I'm also a bit concerned if a local import is made to shadow a variable how this will affects the result.
import os # will be removed;
def foo():
os = 1
import os # we risk marking local os import as unnecessary; but actually it shadows os = 1
return os.getenv("FOO")
os.stat
I haven't thought on how to take care of this.
I've looked into modifying non-global import more in depth.
Pyflyby SourceToSource transformer has only been designed to only handle top-level refactoring; And this is a deep assumption in pyflyby – the finest level of granularity is PythonBlock
which is top-level only.
I think that updating pyflyby to allow non-top-level source updates is quite complicated; likely a multi-month project; and more of a complete rewrite than just an update.
There exists a few more recent frameworks that might allow what you wish;
libcst
, and make pyflyby work on top of it; This is still a multi-month rewrite but we don't reinvent the wheel. In the options I don't really like;
xxx
is unused; can the file; and we are close to the correct line number and find ^\W+import\W+xxx$
, simply remove the line from the text source.from foo import a, b, xxx
, or multiline imports, but we can try to refine the heuristic.As we talked about last Wednesday, with pyflyby architecture it's hard to prioritize local imports as we don't keep references to the ast while we are walking it; and so in the case:
import sys
def f():
import sys # hard to know we can remove as top-level is not yet mark as used.
print(sys.version_info)
print(sys.argv)
Right now the importChecker keep a set of used/unused imports; I think I might be able to change that to a counter (and store a definite usage counter (sys.argv), and potential usage counter (global import sys
would have a +1 potential from sys.version_info
); and decide later on which import(s) are unnecessary.
Still keeping the caveat that right now it's unclear if we would be able to automatically remove the non-local imports, or which implementation choice we'd like.
I'll anyway cleanup the updates I have locally, and send a PR that will be a base for the detection of unused local imports.
Removing non top-level import would potentially have another side-effect;
pyflyby currently make the first import in the following as unused:
if sys.version_info >= (3, 10):
from types import NoneType, EllipsisType
else:
NoneType = type(None)
EllipsisType = type(Ellipsis)
As it does not understand control structures; and currently rely on the fact that it's not top-level to not remove it.
I'll have to adapt pyflyby to better track the context in which an import is marked as unused.
Currently tidy-imports only optimizes global imports. How difficult would it be to also optimize "local" imports, i.e., imports within a function? Example:
Here we should
import sys
is unused infoo
and suggest to remove itimport os
is present both at global as well as local scope. a. I think here it's correct to remove the global import since the only usage is in a function whereos
is imported locally. b. But in general if there's a global import not covered by local imports everywhere we should remove the local import.Can you estimate how much work this would be?