python / mypy

Optional static typing for Python
https://www.mypy-lang.org/
Other
18.36k stars 2.82k forks source link

Disallow tuple unpacking for strings #6406

Closed msullivan closed 4 years ago

msullivan commented 5 years ago

I think it would be a good idea to disallow tuple unpacking from strings, so that the following is caught:

d: Dict[str, str]
for a, b in d:
    stuff

There are a lot of cases involving one or the other of string iteration and tuple unpacking that we probably can't solve without risking too many false positives, but this one should be pretty harmless.

djb4ai commented 4 years ago

Hi @msullivan I would like to take a stab at it, I am new to mypy so if you could give a little guidance.

msullivan commented 4 years ago

I think you'll want to add some sort of special case to check_multi_assignment in checker.py

djb4ai commented 4 years ago

@msullivan can you explain the example? I get it that mypy won't catch this exception but couldn't understand the tuple unpacking of string part.

JukkaL commented 4 years ago

Here is a another example:

s = "foo"
a, b = s

The tuple unpacking refers to the lvalue (a, b in the above example). If this is still unclear, it would probably be useful to study how assignment statements generally work in Python in some detail.

asmeurer commented 3 years ago

What is the mypy policy on false positives? This rule seems to restrict what is allowed for no good reason. Something like a, b, c = 'abc' is perfectly fine, and much better than a, b, c = list('abc') or a, b, c = 'a', 'b', 'c' in terms of readability.

pearu commented 3 years ago

Notice that if one has

d: Dict[Tuple[int, int], str]
for a, b in d:
    stuff

then the same reasoning would apply as in the description of this issue for disallowing tuple unpacking of tuples, that is,

a, b = 1, 2

would raise an exception. I think this would be undesired.

I think the fix to this issue is wrong because a Python string is a sequence and it should be treated as any other sequence in Python when applying indexing operations (that unpacking is). The correct fix would have been to reject the idea of disallowing unpacking of strings as it deviates from standard Python behaviour, IMHO.

AustinScola commented 3 years ago

Here is an example I just ran into where mypy reporting this error was helpful to me:

head, *tail = 'abc'

I expected to get head == 'a' and tail == 'bc'. Instead, I was surprised to find that tail == ['b', 'c'].

(Another error for tail is also reported here).

The reason I bring up this example is just because I don't see one like it on this thread.

asmeurer commented 3 years ago

Shouldn't it have caught that false assumption elsewhere where you incorrectly assumed the type of tail was str instead of list?

AustinScola commented 3 years ago

Shouldn't it have caught that false assumption elsewhere where you incorrectly assumed the type of tail was str instead of list?

Yes, mypy did. That is what I meant by "(Another error for tail is also reported here)." However, I found the tuple unpacking disallow error message more helpful for understanding my mistake. And the tuple unpacking disallow error was also reported before the other error.

asmeurer commented 3 years ago

Well the problem is that the tuple unpacking "error" isn't actually an error. It's perfectly within the semantics of Python that strings are iterables of characters, and I personally have seen and written lots of code that makes use of this fact.

desbma commented 3 years ago

Unpacking a string is disallowed

What does "disallowed" even mean here? This is perfectly valid Python code, where mypy can infer types and check them accordingly. There is no reason for mypy to generate an error on that, other than subjective coding style preferences. Or does that mean, mypy is becoming a general purpose linter and no longer a static type checker?

sobolevn commented 3 years ago

One more user has reported this: https://twitter.com/willmcgugan/status/1440673445692391445

sobolevn commented 3 years ago

My opinion:

There should be cases when unpacking string should be allowed. Examples:

There should be cases when unpacking a string should raise:

Current solution seems incosistent with other types that can be unpacked. For example, bytes are fine: a, b = b'ab'

JukkaL commented 3 years ago

We could start with disallowing string unpacking in for loops, as that doesn't seem controversial and also probably happens with some frequency.

Unpacking a string in assignment should perhaps be an opt-in check, enabled through --enable-error-code <code>. It can be implemented separately.

JukkaL commented 3 years ago

Please ignore my previous comment -- I didn't realize that this was implemented already. Anyway, we could possibly switch the assignment case to be behind --enable-error-code.

wxgeo commented 2 years ago

Is there any plan to roll this back ?

As pointed before, writing a, b, c = "abc" is perfectly valid python code, so there is no reason for mypy to report an error there.

Of course, preventing unpacking will prevent some errors for some people, just like disabling ternary operators, properties, metaclasses and so on surely would...

IMO, mypy is about typing, and not about coding guidelines or similar concerns - do one thing and do it well ;). And anyway, AFAIK, string unpacking isn't discouraged by any PEP or official documentation...

wxgeo commented 2 years ago

What might be interesting, however, is to support fixed width strings. Something like :

def f(s: Str[3]):
    a, b, c = s

Then a, b = s would be detected as incorrect.

I'm not sure that fixed width strings are so common, though.

McDic commented 4 months ago

The following code will report a mypy error:

def f(input_str: str) -> tuple[int, int]:
    assert len(input_str) == 2
    one, two = input_str  # Unpacking a string is disallowed (misc)
    return int(one), int(two)

And if you add # type: ignore[misc] then you will get an another error which complains the types of one and two are undetermined;

def f(input_str: str) -> tuple[int, int]:
    assert len(input_str) == 2
    one, two = input_str  # type: ignore[misc]
    return int(one), int(two)  # Cannot determine type of "two" (has-type)

Then you will end up something like this;

def f1(input_str: str) -> tuple[int, int]:
    assert len(input_str) == 2
    one, two = input_str  # type: ignore[misc]
    return int(one), int(two)  # type: ignore[has-type]

def f2(input_str: str) -> tuple[int, int]:
    assert len(input_str) == 2
    one: str
    two: str
    one, two = input_str  # type: ignore[misc]
    return int(one), int(two)

def f3(input_str: str) -> tuple[int, int]:
    assert len(input_str) == 2
    one, two = input_str[0], input_str[1]
    return int(one), int(two)

Above code is just an example and there would be more use-cases that uses different operations on each splitted value, which cannot be simply mapped.

None of above fixed code looks good. Maybe f3 is ok. But I believe forcing f3 style for every Python code using mypy does not make sense because str is supposed to be iterable thus unpacking is allowed.