python / mypy

Optional static typing for Python
https://www.mypy-lang.org/

Discussion: frontend, backend and "runtime" #2150

Closed elazarg closed 8 years ago

elazarg commented 8 years ago

This is an idea for the long run, not a "please do now" request. I don't know if the right place for such discussion is here; let me know.

I think the architecture of the system should be more modular, composed of three independent parts with well-defined interfaces:

  1. Frontend - parsing and semantic analysis
  2. Backend - translating the output of the front end to "assembly language" of type checkers, which is some constraint language (textual language)
  3. Runtime - evaluating the constraints and emitting error messages

Coordination is performed in the driver and build manager as today, except that phase 3 is optional (through a flag similar to -S in gcc, which stops after emitting assembly). There can be more than one target language.

The implementation overview describes four passes, but the data structures used are intermingled and not modular. Additionally, the "constraint language" is currently a hierarchy of Python classes and Python code.

In principle the system should be able to work without any Python object passed between phases 2 and 3 (I think that separating 1 and 2 is harder). This modularity will allow using other tools for each phase, and will be useful in case someone wants to plug in efficient constraint solvers for scalability, or use mypy's constraint solver for portability. It is complementary to a plugin API, and will put fewer requirements on it.

(Of course, objects may be passed between 2 and 3 as an optimization, but this is just another "target language" which must go through a clear API, and my guess is that such an optimization is unnecessary.)
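To make the proposal concrete, here is a minimal sketch of three phases with a purely textual interface between phases 2 and 3. Everything in it is hypothetical: the function names (`frontend`, `backend`, `runtime`), the one-line `assign` constraint format, and the toy "declared type must equal literal type" rule are assumptions for illustration and are unrelated to mypy's actual internals.

```python
from __future__ import annotations

import ast


def frontend(source: str) -> ast.Module:
    """Phase 1 (hypothetical): parse source text into a tree."""
    return ast.parse(source)


def backend(tree: ast.Module) -> list[str]:
    """Phase 2 (hypothetical): emit one textual constraint per annotated
    assignment of a literal, as 'assign <line> <declared> <actual>'."""
    constraints = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.AnnAssign)
                and isinstance(node.annotation, ast.Name)
                and isinstance(node.value, ast.Constant)):
            actual = type(node.value.value).__name__
            constraints.append(
                f"assign {node.lineno} {node.annotation.id} {actual}")
    return constraints


def runtime(constraints: list[str]) -> list[str]:
    """Phase 3 (hypothetical): evaluate the textual constraints.
    It only ever sees strings, never objects built by phases 1 or 2."""
    errors = []
    for line in constraints:
        _, lineno, declared, actual = line.split()
        if declared != actual:
            errors.append(f"line {lineno}: expected {declared}, got {actual}")
    return errors


source = "x: int = 1\ny: str = 2\n"
constraints = backend(frontend(source))
print("\n".join(constraints))   # what a "stop after phase 2" flag could emit
print(runtime(constraints))
```

Because `runtime` consumes plain text, it could in principle be swapped for an external solver without touching the first two phases, which is the modularity argument made above.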

What do you think?

elazarg commented 8 years ago

It also gives a natural place for incremental checks - if the constraints are unchanged, there's no need to reevaluate them.
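A minimal sketch of that idea, assuming constraints are available as lines of text: fingerprint each module's constraints and skip evaluation when the fingerprint is unchanged. The `IncrementalChecker` class, the `A <: B` constraint syntax, and `toy_evaluate` are all hypothetical names invented for this illustration; a real checker would also have to account for changes in a module's dependencies.

```python
import hashlib


def constraint_digest(constraints):
    """Order-sensitive fingerprint of a module's textual constraints."""
    text = "\n".join(constraints)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class IncrementalChecker:
    """Hypothetical cache: re-evaluate a module only if its constraints changed."""

    def __init__(self, evaluate):
        self._evaluate = evaluate
        self._cache = {}        # module name -> (digest, errors)
        self.evaluations = 0    # counts real evaluations, for demonstration

    def check(self, module, constraints):
        digest = constraint_digest(constraints)
        cached = self._cache.get(module)
        if cached is not None and cached[0] == digest:
            return cached[1]    # unchanged constraints: reuse previous result
        self.evaluations += 1
        errors = self._evaluate(constraints)
        self._cache[module] = (digest, errors)
        return errors


def toy_evaluate(constraints):
    """Stand-in evaluator: 'A <: B' holds only when A and B are identical."""
    errors = []
    for constraint in constraints:
        left, right = constraint.split(" <: ")
        if left != right:
            errors.append(f"{left} is not compatible with {right}")
    return errors


checker = IncrementalChecker(toy_evaluate)
checker.check("m", ["int <: int"])
checker.check("m", ["int <: int"])            # cache hit, no re-evaluation
changed = checker.check("m", ["int <: str"])  # constraints changed, re-evaluated
print(checker.evaluations, changed)
```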

gvanrossum commented 8 years ago

I'm feeling pretty skeptical about this. It's like saying "We should reimplement Python from scratch, using this architecture." That's been done a few times (PyPy, Jython, IronPython, Pyston) and it has always taken way more effort than anticipated.

Your design may well be better, but whether it will succeed or fail won't depend on the design alone, it will depend on many details, and on the execution. I would much rather focus on improving the existing mypy implementation gradually than on some grand redesign, no matter how much more general that design might be. (Read Joel on Software: http://www.joelonsoftware.com/articles/fog0000000069.html)

I am also guessing that if it really was that easy, it would have naturally happened that way. Jukka is a really good and experienced engineer, and while you may not always immediately see the logic behind a particular implementation strategy, usually there is a really good reason.

elazarg commented 8 years ago

The first question is whether this architecture is indeed desirable in the first place, ignoring the costs. As you said, @JukkaL undoubtedly had his reasons, but the situation might have changed since 2014. If this architecture is not desirable, then there's nothing more to discuss; the decisions will be documented (if they aren't already), and I will learn something new.

Assuming the change is desirable, the question becomes one of cost/benefit. I understand the voice of experience says the cost is significantly more than it seems. But we can still discuss strategies for going in that direction in an incremental way, which might be feasible since the current design already uses explicit separate passes. I read Joel's post some years ago; I believe he has a point, and of course I don't suggest reimplementing anything from scratch; if that's necessary for the change, I withdraw my suggestion.

At the very least, a decision to move in this direction can justify certain refactoring (yes, I know, don't say anything), be a guide in implementing and reviewing new features (e.g. "don't entangle these two pieces of data") and help decide which parts of the code deserve more documentation.

As a related but independent suggestion, it would be nice to have "assertion passes" between the phases, to verify (and document) the post-conditions of each pass and the pre-conditions of the next, without adding more clutter to the code. Currently, breaking something in e.g. the semantic analysis pass is likely to result in (seemingly unrelated) error messages in later phases. Of course this cannot be avoided entirely, but it can be reduced. After (gradually and slowly) implementing detailed assertion passes, it will be somewhat more straightforward to define a more explicit API between the passes, even if some intermediate state will still be implicit.
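A minimal sketch of what such an assertion pass could look like, using the standard library `ast` module rather than mypy's own tree. The `bind_names` pass, the `binding` attribute, and `assert_names_bound` are hypothetical stand-ins chosen for illustration; the point is only that the invariant lives in a separate walk instead of being scattered through the pass itself.

```python
import ast


def bind_names(tree):
    """Toy 'semantic analysis' pass: tag every Name node with its context
    ('load', 'store' or 'del'). The attribute name is an assumption."""
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            node.binding = type(node.ctx).__name__.lower()


def assert_names_bound(tree):
    """Assertion pass: the post-condition of bind_names, which is also the
    pre-condition that later passes would rely on."""
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            assert hasattr(node, "binding"), (
                f"line {node.lineno}: name {node.id!r} was not bound")


tree = ast.parse("x = 1\ny = x + 1\n")
bind_names(tree)
assert_names_bound(tree)    # passes silently when the invariant holds

# If an earlier pass is skipped or broken, the failure is reported at the
# phase boundary instead of surfacing as an unrelated error later on.
unprocessed = ast.parse("z = 1\n")
try:
    assert_names_bound(unprocessed)
except AssertionError as exc:
    print("invariant violated:", exc)
```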

Another step might be serializing constraints as an optional step. Note that this is already done, in the form of stubgen - after all, a .pyi file is a constraint database, although with only partial information. There are two other parts that I can see: the usage constraints, and the rules of the type system. Again, as far as I can see (which is not so far, I admit) everything can be done incrementally, and it will already give the benefit of being able to plug in different solvers. Another step might be implementing deserialization of constraints, which again is already partially implemented in the form of consuming .pyi files.
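To illustrate the ".pyi file as a constraint database" reading, here is a small sketch that extracts function signatures from stub text and round-trips them through JSON, so that no Python object needs to cross the boundary. The stub contents, the `signatures` helper, and the table layout are assumptions made for this example; this is not how stubgen or mypy store anything. It relies on `ast.unparse`, which requires Python 3.9 or later.

```python
import ast
import json

# Hypothetical stub contents, in .pyi syntax.
STUB = '''
def parse(text: str) -> int: ...
def render(value: int, width: int) -> str: ...
'''


def signatures(stub_source):
    """Map each top-level function to its parameter and return types.
    Missing annotations are recorded as 'Any'."""
    table = {}
    for node in ast.parse(stub_source).body:
        if isinstance(node, ast.FunctionDef):
            params = [ast.unparse(arg.annotation) if arg.annotation else "Any"
                      for arg in node.args.args]
            returns = ast.unparse(node.returns) if node.returns else "Any"
            table[node.name] = {"params": params, "returns": returns}
    return table


table = signatures(STUB)
serialized = json.dumps(table, sort_keys=True)   # serialization step
restored = json.loads(serialized)                # deserialization step
print(serialized)
```

As noted above, this captures only the declared signatures; usage constraints and the rules of the type system are not represented.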

gvanrossum commented 8 years ago

Maybe Jukka will want to continue this conversation.

JukkaL commented 8 years ago

Hmm, the proposal is not specific enough to write detailed feedback. I won’t suggest spending a lot of time on this, at least right now, but if you still think that this may be worthwhile after my and Guido’s responses, writing a sketch of how things could work and highlighting the concrete benefits could be helpful. Even though I’m skeptical, there are a bunch of things I can say without knowing more details. In particular, if you can split your proposal into smaller tasks that each bring incremental value, it's much easier to make decisions. Note that evaluating a big or complex proposal is hard and we may have trouble justifying the effort of even giving detailed feedback, as just that could take days.

I’m trying to explain here what sort of thinking goes on behind the scenes when deciding whether to undertake a particular project that perhaps has some big potential benefits but that would also potentially take a big chunk of our development resources, such as this one. It’s more complicated than just a cost/benefit calculation. Other important factors include risk, uncertainty and “opportunity cost”.

For example, maybe the expected benefit is 4 and the cost is 3, for a net benefit of 33%. Yay! But wait, maybe there is a 33% chance of a total failure, and the work has to be scrapped. For example, maybe the approach doesn’t actually work for an important type system feature; maybe it’s too slow; maybe it can’t generate good enough error messages; maybe it doesn’t work well with the plugin system we’ve been thinking about. Now the expected payoff is negative, and the project doesn’t look very promising any more.
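The arithmetic behind that example can be spelled out as follows. This is a sketch under two assumptions that the comment does not state explicitly: the cost is paid whether or not the project succeeds, and a failure yields zero benefit.

```python
benefit = 4.0      # value gained if the project succeeds
cost = 3.0         # effort spent regardless of the outcome (assumption)
p_failure = 0.33   # chance the work has to be scrapped

# Ignoring risk: net gain of 1 on a cost of 3, i.e. about 33%.
net_if_certain = benefit - cost
relative_gain = net_if_certain / cost

# With risk: the benefit only materializes with probability 1 - p_failure.
expected_payoff = (1 - p_failure) * benefit - cost

# Failure probability at which the expected payoff is exactly zero.
break_even_failure = 1 - cost / benefit

print(f"relative gain without risk: {relative_gain:.0%}")
print(f"expected payoff with risk:  {expected_payoff:.2f}")
print(f"break-even failure chance:  {break_even_failure:.0%}")
```

Under these assumptions the project only has a positive expected payoff if the chance of total failure is below 25%.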

There are other sources of uncertainty:

As the mypy core developers clearly have doubts about the proposed approach, it would be useful to see how other (successful) languages that have some similarity to Python/mypy have done this. TypeScript, Hack, C# and Go should be sufficiently similar to provide useful insights. If there is a widely used approach that is similar to what you are suggesting, that would give credibility to your proposal and reduce the perceived risk level.

Languages that don’t resemble Python very much are somewhat less interesting, as it may be undesirable to apply those techniques to mypy, even if it would be technically possible. For example, Haskell is known to have a steep learning curve, and it’s something we want to actively avoid. Much of mypy design is motivated by learnability and user experience.

Again, assume that we are thinking about spending 3 units of work to gain a benefit of 4. There are always other things we could’ve been doing with those 3 units of work. Maybe we could have implemented some neat type system features or made the type checker twice as fast, and written better documentation. The benefit of these alternative tasks could be higher, let’s say 6, so they'd be more desirable. Here it’s important to consider also the benefits to mypy users, not just the developer team. The users may not see any immediate benefit from an internal change, but they’d most certainly benefit from a much faster mypy. Basically you’d need to argue that this is the best of all possible investments at a given time.

A 6 month investment generating a future saving of 12 months of work due to a better design (a 100% return on investment!) over the next 5 years may actually not be very enticing, say if we’d only have funding for the next 6 months — it might be way more important to work on features that help secure more funding in the near term to ensure continued development.

Having a feature, refactoring or any other nice thing today is more valuable than having it only 12 months from now. A big redesign or refactoring postpones other nice things, and the benefits from the redesign are gradual and may take years to be fully realized, no matter how significant. However, if we didn’t do the refactoring we could be making other improvements right now, and we’d get the benefits sooner (and learn from user feedback on these features), and we could also drive improvements to the typing module faster.

[Well, that turned out to be a pretty long write-up. I'm hoping that I can reuse that in the future in similar contexts.]

gvanrossum commented 8 years ago

Awesome insights in the art of software engineering, Jukka!

elazarg commented 8 years ago

Thank you very much for these insights. I will give it thought.

(I crammed together the risk and relative costs into "cost", but the distinction is important to keep in mind so thanks for making it clear, too).

I want to note here that I believe pursuing this direction might get you more developers, if I am able to convince them it will help their research (specifically, finding uses for Ivy). But that's a mere hunch.

I would still like your opinion about the suggested architecture, ignoring costs and risks :) but now as a personal favor only.

JukkaL commented 8 years ago

I'm skeptical about close collaboration with researchers -- their incentives generally wouldn't align with ours. We don't care about being novel and they typically don't care about long-term maintainability and polishing their implementations to production quality (though there clearly are exceptions). Using boring, well-understood, battle-tested solutions is something we generally want to do.

Here are some thoughts about other aspects of your proposal (these are pretty random and in no particular order):

elazarg commented 8 years ago

Wow. That's thorough... Thank you!

> I'm skeptical about close collaboration with researchers

I understand your concerns regarding researchers. My thinking was that researchers can help build the infrastructure that will enable their research and will be useful for the industry in general, without putting any "novel" stuff in the main project; admittedly the code quality issue is still there. (As perhaps turns out to be the case in my grand-refactoring suggestions and PRs. I think it's more due to inexperience than to wrong incentives, though; I try to learn).

> my impression was that I'm not smart enough to design them

You just did, didn't you? :) I mean, I don't entirely understand in what sense isn't mypy a constraint solver. A very specific one, of course, but as long as these constraints are serializable in any way, it still is.

> ... constraint systems seem to make it harder to generate useful error messages

Unlike you, I don't have any experience with constraint solvers, which might be the cause of some incorrect assumptions. It is not the first time I hear that constraint solving and static analysis yield poor error messages, though; but isn't it inherent in the problem? I'd thought that giving useful error messages is inherently hard and even poorly defined, since an error is useful only if it points to "the source of the problem", which depends on unexpressed intents of the user. So unless you can find a small number of constraints (in the problem description itself, i.e. in the source code) that comprise an "unsatisfiable core" and print them all (as mypy does in a simple type mismatch), you have to guess the user's intent, and that's always hard.
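For readers unfamiliar with the term, the following sketch shows one generic way to shrink an unsatisfiable constraint set to a minimal core by deletion. It uses toy equality constraints between type variables and two concrete types; the representation, the `satisfiable` and `unsat_core` helpers, and the example data are all assumptions for illustration, and nothing here describes how mypy reports errors.

```python
CONCRETE = {"int", "str"}   # toy universe of concrete types


def satisfiable(constraints):
    """Constraints are (a, b) pairs meaning 'a equals b'. The set is
    unsatisfiable when two different concrete types become equal."""
    parent = {}

    def find(term):
        parent.setdefault(term, term)
        while parent[term] != term:
            term = parent[term]
        return term

    for a, b in constraints:
        parent[find(a)] = find(b)       # merge the two equivalence classes

    roots = set()
    for concrete in CONCRETE:
        if concrete in parent:
            root = find(concrete)
            if root in roots:           # a second concrete type in this class
                return False
            roots.add(root)
    return True


def unsat_core(constraints):
    """Deletion-based minimization: drop every constraint whose removal
    leaves the rest unsatisfiable. Assumes the input is unsatisfiable."""
    core = list(constraints)
    for constraint in list(constraints):
        trial = [c for c in core if c != constraint]
        if not satisfiable(trial):
            core = trial
    return core


constraints = [("a", "int"), ("b", "str"), ("a", "c"),
               ("c", "str"), ("d", "int")]
print(unsat_core(constraints))
```

The remaining constraints are the ones a tool could print together; whether that set points at what the user actually got wrong is the hard part discussed above.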

Somewhat related and might interest you: Why JavaScript Programmers Hate You: an ode to dynamic languages by Jan Vitek (link to the specific point on gradual typing).

elazarg commented 8 years ago

This issue may be closed, as far as I'm concerned. Do you want me to open a new issue about assert passes?

JukkaL commented 8 years ago

> (As perhaps turns out to be the case in my grand-refactoring suggestions and PRs. I think it's more due to inexperience than to wrong incentives, though; I try to learn).

Fair enough. I wasn't talking about people in academia (sorry if I gave that impression), but the nature of many academic CS research projects -- in my experience the main research end product typically is a paper (or several), and the implementation is often a proof of concept which will be thrown away when a project ends. If you have to hop between research projects, it can be difficult to maintain continuity and find the time to polish your code.

> I mean, I don't entirely understand in what sense isn't mypy a constraint solver. A very specific one, of course, but as long as these constraints are serializable in any way, it still is.

Well, I really meant a "general" constraint solver. Mypy has a pretty specialized constraint solver in it and it doesn't present the problems I was talking about. Also, I don't think that it would be useful to serialize the constraints, as the constraints are used temporarily in very local contexts and then thrown away. (Though perhaps I misunderstood your meaning.)

> It is not the first time I hear that constraint solving and static analysis yield poor error messages, though; but isn't it inherent in the problem?

There clearly are design tradeoffs. Generally, the smaller the set of constraints you work with at a time, the more predictable the system is. That's why mypy does local type inference, and generally gives up pretty soon when it encounters something out of the ordinary instead of trying to do something very clever. Contrast that with whole-program type inference: if the tool gets one thing wrong, the invalid result may propagate through multiple functions or modules and manifest itself in totally unpredictable ways.
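A small illustration of what "local" means in practice. The function names are made up, and the comments describe mypy's documented treatment of unannotated functions (their results are typed as `Any`) rather than output captured from an actual run.

```python
def load_port(raw):
    # No annotations: mypy treats this function as dynamically typed and
    # does not try to infer a return type from the body.
    return int(raw)


def describe(port: int) -> str:
    # Annotated: checked against its own signature, independently of callers.
    return "port " + str(port)


# The result of load_port is Any as far as mypy is concerned, so the call
# below is accepted without reasoning about load_port's body.
port = load_port("8080")
print(describe(port))
```

A whole-program inferencer would instead derive a type for `load_port` from its body and push it into every caller, which is the propagation risk described above.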

JukkaL commented 8 years ago

And yes, please open an issue about assert passes!