deshaw / pyflyby

A set of productivity tools for Python
https://deshaw.github.io/pyflyby/
Other
353 stars 52 forks source link

tidy-imports doesn't include ufuncs when ran with "--replace-star-imports" (PyInf#8958) #220

Open dshivashankar1994 opened 2 years ago

dshivashankar1994 commented 2 years ago

Reproducer:

test.py

from   module                   import *
x,y = get_num_threads, heaviside

module.py

from typing import Any
import numpy
from numpy import *
heaviside: numpy.ufunc

def get_num_threads(*args, **kwargs) -> Any: ...
def set_num_threads(*args, **kwargs) -> Any: ...

test.py runs fine without tidy-imports

$ python test.py 

Ran tidy-imports with --replace-star-imports

> tidy-imports --replace-star-imports  test.py
[PYFLYBY] test.py: replaced 'from module import *' with 2 imports
[PYFLYBY] test.py: removed unused 'from module import set_num_threads'
[PYFLYBY] test.py:3: undefined name 'heaviside' and no known import for it
--- /test.py    2022-11-01 17:45:38.701542000 +0530
+++ /tmp/tmpn8t6r9e3    2022-11-01 17:46:01.109807168 +0530
@@ -1,3 +1,3 @@
-from   module                   import *
+from   module                   import get_num_threads

 x,y = get_num_threads, heaviside

Replace test.py? [y/N] y
[PYFLYBY] test.py: *** modified ***

As we can see heaviside import was never added.

> python test.py 
Traceback (most recent call last):
  File "test.py", line 3, in <module>
    x,y = get_num_threads, heaviside
NameError: name 'heaviside' is not defined
asmeurer commented 1 year ago

I think the problem is just that heaviside is not included in the NumPy database. If I add it it adds from numpy import heaviside as expected.

asmeurer commented 1 year ago

Fixed at https://github.com/deshaw/pyflyby/pull/228. Are there any other missing ufuncs that need to be added?

dshivashankar1994 commented 1 year ago

The issue isn't specific to heaviside and is just an example.

For example, the below scenario works fine with normal functions but not with ufuncs. Our use cases doesn't include Numpy ufuncs and have added it for a simple reproducer

> cat hello.py 
def a():
   pass

def b():
    pass
> cat test_hello.py 
from   hello                    import *
aa,bb = (a,b)
> tidy-imports test_hello.py --replace-star-imports
[PYFLYBY] test_hello.py: replaced 'from hello import *' with 2 imports
--- test_hello.py   2023-01-06 03:48:00.442356472 -0500
+++ /tmp/tmp_mxysy83    2023-01-06 03:48:19.130206340 -0500
@@ -1,2 +1,2 @@
-from   hello                    import *
+from   hello                    import a, b
 aa,bb = (a,b)

Replace test_hello.py? [y/N] y
> cat test_hello.py 
from   hello                    import a, b
aa,bb = (a,b)
asmeurer commented 1 year ago

With hello.py, pyflyby can read hello.py and infer what names are exported, since it's pure Python. But for NumPy, the code is in C, so it has to rely on the import database. Here's a list of names from from numpy import * with NumPy 1.24.1 that aren't in the pyflyby import database:

ALLOW_THREADS
AxisError
BUFSIZE
CLIP
ComplexWarning
DataSource
ERR_CALL
ERR_DEFAULT
ERR_IGNORE
ERR_LOG
ERR_PRINT
ERR_RAISE
ERR_WARN
FLOATING_POINT_SUPPORT
FPE_DIVIDEBYZERO
FPE_INVALID
FPE_OVERFLOW
FPE_UNDERFLOW
False_
Infinity
MAXDIMS
MAY_SHARE_BOUNDS
MAY_SHARE_EXACT
ModuleDeprecationWarning
NINF
NZERO
PINF
PZERO
RAISE
RankWarning
SHIFT_DIVIDEBYZERO
SHIFT_INVALID
SHIFT_OVERFLOW
SHIFT_UNDERFLOW
ScalarType
TooHardError
True_
UFUNC_BUFSIZE_DEFAULT
UFUNC_PYVALS_NAME
VisibleDeprecationWarning
WRAP
_UFUNC_API
__builtins__
__version__
_add_newdoc_ufunc
_get_promotion_state
_no_nep50_warning
_set_promotion_state
add_docstring
add_newdoc
add_newdoc_ufunc
argpartition
block
broadcast_shapes
broadcast_to
busday_count
busday_offset
busdaycalendar
bytes_
cast
cbrt
char
complex256
compress
copyto
count_nonzero
datetime64
datetime_as_string
datetime_data
delete
deprecate
deprecate_with_doc
divmod
e
einsum_path
errstate
euler_gamma
finfo
flexible
flip
float128
float16
float_power
format_float_positional
format_float_scientific
format_parser
from_dlpack
full
full_like
gcd
generic
genfromtxt
geomspace
get_array_wrap
get_include
get_printoptions
getbufsize
geterr
geterrcall
geterrobj
half
heaviside
histogram_bin_edges
info
infty
insert
is_busday
isclose
isin
isnat
lcm
load
lookfor
ma
math
matmul
maximum_sctype
may_share_memory
min_scalar_type
moveaxis
nancumprod
nancumsum
nanmean
nanmedian
nanpercentile
nanprod
nanquantile
nanstd
nanvar
nditer
nested_iters
obj2sctype
pad
partition
percentile
positive
printoptions
promote_types
put
put_along_axis
quantile
ravel_multi_index
rec
require
result_type
safe_eval
save
savetxt
savez
savez_compressed
sctype2char
sctypeDict
sctypes
set_numeric_ops
set_printoptions
set_string_function
seterr
seterrcall
seterrobj
shares_memory
show_runtime
source
stack
take_along_axis
timedelta64
tracemalloc_domain
typecodes
typename
who
dshivashankar1994 commented 1 year ago

In initial update, I have the type hint added

from numpy import *
heaviside: numpy.ufunc

Can we make pyflyby understand that heaviside is being used and make the --replace-star-imports consider this ?