Open lmmarsano opened 3 years ago
Can you elaborate on Sequence
here, as I'm not super familiar with that type? As I read this, I read the List[str]
as a definition of the intent of the function. If using Iterable[str]
, you can pass a str
in as a string is iterable that returns a string. (X-Ref: https://github.com/python/typing/issues/256)
Types are for more than intent. They're used for static type checking. Try out mypy or pyright. They'll throw errors if you try to pass a tuple as a list, since lists permit operations tuples forbid. To get the most out of typed functions, you'd declare parameters with the most inclusive types possible that permit only the operations your function actually needs, and return the most specific type the function actually returns. This way, the type system allows the caller to pass in anything that will work, even the bare minimum needed, and the type system still allows the caller to do everything the returned values actually allow. The most inclusive types are typically abstract interfaces, and the most specific are concrete implementations. Type annotations rejecting good parameters indicate incorrect annotations.
I think what @kkirsche meant was that a string abcd
is a valid Iterable that you can loop through:
for x in "abcd":
print(x)
would print:
a
b
c
d
also isinstance
checks are valid on a string:
from collections import Sequence, Iterable
print(isinstance("abc", Collection))
print(isinstance("abc", Sequence))
would both be True
and that is not what is intended for the envvar
. A single string would mean there is only one variable passed, not to iterate over it thus the Union
type for envvar
and probably some check for the list in the code.
Are we drawing conclusions that don't follow? Consider
import collections.abc as CA
def normalize(input: CA.Iterable[str]) -> tuple[str, ...]:
return (input,) if isinstance(input, str) else tuple(input)
Nothing in the general type annotation forces the implementation to iterate over each character in a string: the implementation is free to do anything that meets type requirements.
Yes, str
is a free monoid, so CA.Iterable[str]
= Union[str, CA.Iterable[str]]
, and the latter would be kinder to the user.
However, I wouldn't declare input: Union[str, list[str]]
unless I want to be unkind to the user (& myself).
Now inspect the implementation.
What operations does it actually use?
What operations does it need?
Does it actually mutate the envvar
list?
If not, then it could be tuple.
Does it actually use index operations?
If not, then it doesn't need to be a tuple either.
In fact, I'm not seeing the code here do much with envvar
other than pass it to click
, and according to the documentation there, it doesn't need list.
I think we have very different definitions on the purpose of types, which is fine. It sounds as though you operate from the pure type perspective, meaning the purpose of types is to document what a function could operate on, influenced from languages like Haskell. This is perfectly valid, it simply differs from what I’ve encountered and find in code of my peers. In practice, I find that teams I’ve worked with use types and type checking as extended documentation. Abstract typing can be valuable, but if I want the user to pass a list, I document a list even though it could operate on other types. Both can be valid, but they aren’t really compatible philosophies.
Agree to disagree on their purpose. I appreciate you taking the time to explain.
python/typing#256: the type hint Union[str, List[str]]
already accepts str
, so generalizing list
to Sequence
or Iterable
isn't a problem.
That discussion only points out we can't exclude str
from Iterable[str]
or Sequence[str]
, which is fine.
Type hints don't change implementation, and this code already exists.
The request is to improve accuracy.
Documentation: this is an admirable goal that I fully support. Wouldn't types that more accurately specify requirements behavior improve documentation? Declaring requirements with precise interfaces is entirely compatible with good documentation.
envvar: Optional[Union[str, Sequence[str]]] = None
seems clear in documentation without placing contrived, counterproductive limitations on implementers using this library.
As was pointed out, even click
documents envvar
as Sequence
despite not using type hints.
Notice this discussion only exists because unlike documentation, type hints can restrict programs from validating. If it were only a disagreement over documentation, then a programmer who knew better could just call this function any valid way Python's dynamic type system allows without debate, and the valid program will run successfully. Here, a deficient choice of type declarations prevents valid programs from validating. That's no longer a difference in "philosophy", or a matter programmers should readily "agree to disagree" on. Poorly chosen type hints create a false dilemma: now the programmer must choose between writing suboptimal code to appease ill-conceived type hints, or to write their optimal code and override the type system (possibly introduce error) or go blind (possibly introduce much error). That is unnecessary & counterproductive. Type systems in programming languages have existed since conception to prevent type errors and validate correct programs.
If someone is going to claim typing by subjective "intent" is valid, and that "intent" unnecessarily classifies valid Python code as invalid to static type-checkers, then they better be ready to justify their intent.
Why is list
better than any list-like data?
Do you know better than the programmer the concrete data types they should use in their programs?
Why shouldn't the programmer be able to use safer, immutable data types allowed by more technically precise type-hints?
Programming to interfaces instead of implementations is a commonly known principle & practice in software design: is that wrong?
What is the intent here?
I'm probably not the only programmer who wants to get the most out of type hints if they're going to bother using them. Types can do better than this.
Maybe my last comment was too verbose, and a picture is needed.
Why is this wrong?
Why should I have to put a list
there?
I don't want list modification.
https://github.com/tiangolo/typer/blob/c3a4c72a4f1a18b7f16addb508862ec63411fa63/typer/params.py#L14
Why
List
? Do we expect to mutate the data? I'd expect immutable types allowed. Maybe this should be generalized to an abstract interface such asSequence
orIterable
. In general, any overspecialized parameters should be generalized.