fastapi / typer

Typer, build great CLIs. Easy to code. Based on Python type hints.
https://typer.tiangolo.com/
MIT License
15.68k stars 667 forks source link

[BUG] `envvar` incorrect type: click accepts `Sequence[str]` #283

Open lmmarsano opened 3 years ago

lmmarsano commented 3 years ago

https://github.com/tiangolo/typer/blob/c3a4c72a4f1a18b7f16addb508862ec63411fa63/typer/params.py#L14

Why List? Do we expect to mutate the data? I'd expect immutable types allowed. Maybe this should be generalized to an abstract interface such as Sequence or Iterable. In general, any overspecialized parameters should be generalized.

kkirsche commented 3 years ago

Can you elaborate on Sequence here, as I'm not super familiar with that type? As I read this, I read the List[str] as a definition of the intent of the function. If using Iterable[str], you can pass a str in as a string is iterable that returns a string. (X-Ref: https://github.com/python/typing/issues/256)

lmmarsano commented 3 years ago

Types are for more than intent. They're used for static type checking. Try out mypy or pyright. They'll throw errors if you try to pass a tuple as a list, since lists permit operations tuples forbid. To get the most out of typed functions, you'd declare parameters with the most inclusive types possible that permit only the operations your function actually needs, and return the most specific type the function actually returns. This way, the type system allows the caller to pass in anything that will work, even the bare minimum needed, and the type system still allows the caller to do everything the returned values actually allow. The most inclusive types are typically abstract interfaces, and the most specific are concrete implementations. Type annotations rejecting good parameters indicate incorrect annotations.

sathoune commented 3 years ago

I think what @kkirsche meant was that a string abcd is a valid Iterable that you can loop through:

for x in "abcd":
   print(x)

would print:

a
b
c
d

also isinstance checks are valid on a string:

from collections import Sequence, Iterable

print(isinstance("abc", Collection))
print(isinstance("abc", Sequence))

would both be True and that is not what is intended for the envvar. A single string would mean there is only one variable passed, not to iterate over it thus the Union type for envvar and probably some check for the list in the code.

lmmarsano commented 3 years ago

Are we drawing conclusions that don't follow? Consider

import collections.abc as CA
def normalize(input: CA.Iterable[str]) -> tuple[str, ...]:
  return (input,) if isinstance(input, str) else tuple(input)

Nothing in the general type annotation forces the implementation to iterate over each character in a string: the implementation is free to do anything that meets type requirements.

Yes, str is a free monoid, so CA.Iterable[str] = Union[str, CA.Iterable[str]], and the latter would be kinder to the user. However, I wouldn't declare input: Union[str, list[str]] unless I want to be unkind to the user (& myself).

Now inspect the implementation. What operations does it actually use? What operations does it need? Does it actually mutate the envvar list? If not, then it could be tuple. Does it actually use index operations? If not, then it doesn't need to be a tuple either. In fact, I'm not seeing the code here do much with envvar other than pass it to click, and according to the documentation there, it doesn't need list.

kkirsche commented 3 years ago

I think we have very different definitions on the purpose of types, which is fine. It sounds as though you operate from the pure type perspective, meaning the purpose of types is to document what a function could operate on, influenced from languages like Haskell. This is perfectly valid, it simply differs from what I’ve encountered and find in code of my peers. In practice, I find that teams I’ve worked with use types and type checking as extended documentation. Abstract typing can be valuable, but if I want the user to pass a list, I document a list even though it could operate on other types. Both can be valid, but they aren’t really compatible philosophies.

Agree to disagree on their purpose. I appreciate you taking the time to explain.

lmmarsano commented 3 years ago

python/typing#256: the type hint Union[str, List[str]] already accepts str, so generalizing list to Sequence or Iterable isn't a problem. That discussion only points out we can't exclude str from Iterable[str] or Sequence[str], which is fine. Type hints don't change implementation, and this code already exists. The request is to improve accuracy.

Documentation: this is an admirable goal that I fully support. Wouldn't types that more accurately specify requirements behavior improve documentation? Declaring requirements with precise interfaces is entirely compatible with good documentation.

envvar: Optional[Union[str, Sequence[str]]] = None

seems clear in documentation without placing contrived, counterproductive limitations on implementers using this library. As was pointed out, even click documents envvar as Sequence despite not using type hints.

Notice this discussion only exists because unlike documentation, type hints can restrict programs from validating. If it were only a disagreement over documentation, then a programmer who knew better could just call this function any valid way Python's dynamic type system allows without debate, and the valid program will run successfully. Here, a deficient choice of type declarations prevents valid programs from validating. That's no longer a difference in "philosophy", or a matter programmers should readily "agree to disagree" on. Poorly chosen type hints create a false dilemma: now the programmer must choose between writing suboptimal code to appease ill-conceived type hints, or to write their optimal code and override the type system (possibly introduce error) or go blind (possibly introduce much error). That is unnecessary & counterproductive. Type systems in programming languages have existed since conception to prevent type errors and validate correct programs.

If someone is going to claim typing by subjective "intent" is valid, and that "intent" unnecessarily classifies valid Python code as invalid to static type-checkers, then they better be ready to justify their intent. Why is list better than any list-like data? Do you know better than the programmer the concrete data types they should use in their programs? Why shouldn't the programmer be able to use safer, immutable data types allowed by more technically precise type-hints? Programming to interfaces instead of implementations is a commonly known principle & practice in software design: is that wrong? What is the intent here?

I'm probably not the only programmer who wants to get the most out of type hints if they're going to bother using them. Types can do better than this.

lmmarsano commented 3 years ago

Maybe my last comment was too verbose, and a picture is needed. Why is this wrong? validation error Why should I have to put a list there? I don't want list modification.