Open erszcz opened 1 year ago
Hmm, now that I look at the results again, they don't look THAT bad:
Failed: 20. Skipped: 0. Passed: 708.
Maybe it's not such a bad idea after all... I remembered the result from June as much worse 🤔
Nice! I believe this is a good approach, especially since CDuce is the language where these theories by Castagna et.al. are tested. The two papers mentioned below are listed on the web site of CDuce. (I read them some years ago.)
In Gradual Typing with Union and Intersection Types (2017) they talk about DNF and Gradual-DNF and some other concepts and steps how to convert types into this form, compute subtyping, etc.
In Covariance and Contravariance: a fresh look at an old issue (a primer in advanced type systems for learning functional programmers) (2018), there's an efficient datastructure "binary decision diagram with lazy unions" to represent types in DNF and an algorithm how to transform types.
Indeed, especially "Covariance and Contravariance" is very informative. It contains one more brilliant idea which I tested in Gradualizer, but with quite a dramatic impact on test results. The idea is defining subtyping by type difference.
Currently, in Gradualizer type_diff
and subtype
are defined separately. The paper proposes to define subtyping like this:
subtype(T1, T2) -> is_empty(type_diff(T1, T2)).
While this PR's approach to DNF doesn't seem to be as dramatic, I think it's not complete either. Together with subtyping redefinition, we'd be getting very close to the theory described there. The question, though, is whether it's worth to change fundamental parts of Gradualizer in the light of etylizer being developed from scratch according to that theory? As we speak it still has small coverage of the Erlang language features, but it seems to have more sound and better understood theoretical underpinning 🤔
Etilyzer is not gradual typing, is it?
Anyway, I don't have time and energy to do such major work in Gradualizer at this time.
Etilyzer is not gradual typing, is it?
At the very least they accept the dynamic type on the syntax level, so I imagine they would like to expand in that direction.
time and energy
I'm afraid that's the main problem. There's a chance for some EEF (Erlang Ecosystem Foundation) funding (I spoke to one of the board members about it, nothing certain, but still), but apart from a budget, we would also need people with capacity. cc @xxdavid @josefs
Good idea! Thanks for experimenting with it. :)
20 failed tests don’t sound like that much. It seems to me that most of them are related to maps and glb, so maybe it would be sufficient to patch these places. But it might also require some deep refactoring, who knows…
Also, have you thought about the exponential explosion? Do you think it’s safe to use this expansion on type expressions of “practical size”?
Regarding the redefinition of subtype
, what do you think would be the advantages (and possibly disadvantages) of it compared to the current implementation?
Funding for Gradualizer sounds really cool! I think it could help the project a lot. Not having anyone who would work on it regularly and as part of their job is one of the things that distinguishes Gradualizer from eqWAlizer and etylizer and probably one of the reasons why it doesn’t move very fast (compared to the other projects). Sadly, I don’t have much time for it either. I’m currently immersed in trying to improve the polymorphism support, and do not have much capacity left.
20 failed tests don’t sound like that much.
Indeed, I might give it a go and try fixing them when I find a while.
Also, have you thought about the exponential explosion? Do you think it’s safe to use this expansion on type expressions of “practical size”?
I don't know :| I think it can be managed, given CDuce, etylizer, and upcoming Elixir systems all have to use this approach (because they follow the paper suggesting to use DNF).
Regarding the redefinition of subtype, what do you think would be the advantages (and possibly disadvantages) of it compared to the current implementation?
Currently we have two somewhat redundant and not completely compatible
relations comparing types in Gradualizer: subtype
and type_diff
.
refinable
succinctly defines for which types type_diff
is defined. This
has led to some confusion a few times already, e.g. here -
https://github.com/josefs/Gradualizer/pull/346#discussion_r719151072. To
sum up, type_diff
is not total on types (type pairs).
So type_diff
"compares" some types to tell if one can be refined with the
other and to tell what the result is. subtype
compares types to tell if
one is a subtype of another. This is quite a similar notion. It seems to me
that disjoint
is another way of saying "not a subtype or supertype". If
one type is a subtype or supertype of another, then some refinement in that
pair must be possible, either to a narrower type or, ultimately, to
none()
) - expressing subtype
with type_diff
seems natural in this
light. However, subtype
has to be a total relation - otherwise we
couldn't use it on arbitrary types, but our type_diff
isn't.
Practically, unifying them would mean we could have just one function implementing the not so trivial logic. I'm not sure if it would give any other significant benefits (like expressiveness), but it would help clarify the concept which currently seems to be a bit hazy in Gradualizer.
Funding for Gradualizer sounds really cool! [...]
Sadly, I don’t have much time for it either.
I think if the funding happened, it would be a fixed budget / one time grant, not a long-term cooperation / employment opportunity. However, it seems none of us, the latest contributors, can invest the time or considers it compelling :|
I'm creating this draft PR with an intention of closing it immediately afterwards. It's not a working solution, I just want to describe the problem and show an experiment aimed at solving the problem.
In https://github.com/josefs/Gradualizer/pull/524#discussion_r1220414681 I said I'd experiment with disjunctive normal form to address the following problem from Gradualizer's own code. The problem is quite prevalent - it occurs everywhere where we have a structure with an inner union. For example, pretty much every time we handle
type()
instances:In the above example we see that even though there's a dedicated clause matching the case of
Args1 = any
(and symmetricallyArgs2 = any
), we still have to assert thatArgs1 :: [type()]
(respectivelyArgs2 :: [type()]
) in the clause which handles exactly that -Args1
andArgs2
already refined to[type()]
. We have to use a superfluous assertion, even though it's obvious to us and should be obvious to the typechecker.I think it's caused by the following:
In other words, while Gradualizer understands that
{t, a} | {t, b} :: {t, a | b}
, it does not understand that{t, a | b} :: {t, a} | {t, b}
. It seems to me that this property holding in both directions is crucial for our needs.As you can see in
notes.1.md
in this PR, I contrasted it with the Elixir prototype based on the CDuce set-theoretic type checker. For Elixir/CDuce, the property holds in both directions.To some extent it can be achieved by "expanding" unions as done in this PR - and it fixes some particular cases, especially the new tests in
typechecker_tests
- however, the general effect on tests is disastrous :(