google / pytype

A static type analyzer for Python code
https://google.github.io/pytype

Pytype seems confused by aliasing in some cases #616

Open juliandolby opened 4 years ago

juliandolby commented 4 years ago

Pytype seems to get confused by some patterns of assigning object variables to each other, and can generate incorrect types in some cases. Consider the following code:

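import sys

a = sys.argv[1]
b = sys.argv[2]

class c1:
    a

x = c1()
x.a = "foo"
#reveal_type(x.a)

y = c1()
y.a = 7
#reveal_type(y.a)

# When both arguments are equal, y becomes an alias of x.
if a == b:
    y = x

# This write also changes x.a whenever y aliases x.
y.a = 7

print(x.a)
reveal_type(x.a)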

When I run a freshly installed Pytype in a fresh Anaconda environment, the reveal_type call in this case reports int, which is wrong. If this code is run, x.a can be either "foo" or 7, depending on whether the two command-line arguments are equal.

It looks like the y = x assignment is confusing Pytype.
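To make the two runtime outcomes concrete, here is a minimal sketch that wraps the same statements in a function, so that both cases can be checked without passing command-line arguments. The names C1 and final_x_a are illustrative only and are not part of the original report.

```python
class C1:
    a = None  # placeholder class attribute; instances overwrite it below


def final_x_a(arg1, arg2):
    """Replay the statements from the report and return the final x.a."""
    x = C1()
    x.a = "foo"

    y = C1()
    y.a = 7

    if arg1 == arg2:
        y = x  # y now aliases x

    y.a = 7  # writes through to x only when y aliases x
    return x.a


print(repr(final_x_a("same", "same")))  # 7
print(repr(final_x_a("one", "two")))    # 'foo'
```

So a sound result for reveal_type(x.a) would have to cover both str and int.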

rchen152 commented 4 years ago

It looks like what's happening is that at the print(x.a) line, pytype knows that there are two possible values (7 or "foo") for a, but it thinks that "foo" is never accessible because the CFG node at which 7 is assigned directly blocks the one at which "foo" is assigned. I think we could fix this by including the object on which the attribute is assigned as either an origin of the attribute value or an extra binding in the solver query.

juliandolby commented 4 years ago

@rchen152 Thanks for looking at this. I do not work on pytype, but I do work on program analysis. It seems like the issue is that 'x' and 'y' may refer to the same object at the reveal_type call, but the assignment of 7 happens through 'y' while reveal_type is called on 'x.a'. Perhaps that is what your "origin" would handle? In program analysis, I am used to thinking of this as an aliasing issue.
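For illustration, here is a toy sketch of how a textbook may-alias (points-to) analysis would treat the example. This is not how pytype works internally; the abstract object names obj1 and obj2 and the helpers join_envs, store_attr, load_attr and analyze_example are all assumptions made up for this sketch. Each variable maps to the set of abstract objects it may point to, and a write through a variable that may point to more than one object is a weak update that keeps the old types.

```python
from collections import defaultdict


def join_envs(env_a, env_b):
    """Merge two branch environments: a variable may point to objects from either."""
    return {v: env_a.get(v, set()) | env_b.get(v, set())
            for v in env_a.keys() | env_b.keys()}


def store_attr(env, heap, var, attr, type_name):
    """Model `var.attr = <value of type type_name>`."""
    targets = env[var]
    for obj in targets:
        if len(targets) == 1:
            # Strong update: var names exactly one object, so overwrite.
            heap[(obj, attr)] = {type_name}
        else:
            # Weak update: var may alias several objects, so keep old types too.
            heap[(obj, attr)] |= {type_name}


def load_attr(env, heap, var, attr):
    """Model reading `var.attr`: union over every object var may point to."""
    types = set()
    for obj in env[var]:
        types |= heap[(obj, attr)]
    return types


def analyze_example():
    env, heap = {}, defaultdict(set)
    env["x"] = {"obj1"}                      # x = c1()
    store_attr(env, heap, "x", "a", "str")   # x.a = "foo"
    env["y"] = {"obj2"}                      # y = c1()
    store_attr(env, heap, "y", "a", "int")   # y.a = 7
    then_env = dict(env, y=set(env["x"]))    # if a == b: y = x
    env = join_envs(then_env, env)           # merge both branches
    store_attr(env, heap, "y", "a", "int")   # y.a = 7 (y may be obj1 or obj2)
    return load_attr(env, heap, "x", "a")    # reveal_type(x.a)


print(sorted(analyze_example()))  # ['int', 'str']
```

Under these assumptions the analysis reports both str and int for x.a, which matches the two possible runtime values.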