Add datasort refinements to Rust

catamorphism commented 12 years ago

Looking through the rustc source code, it seems like many potential areas for typestate predicates involve cases where a value has an enum type and is known to be built with one of a subset of that enum type's constructors. I'm especially noticing this now that I'm going through and making all alts exhaustive. For example:

enum foo {bar, quux, baz}

fn whee(x: foo) -> int {
  alt x {
     bar { 42 }
     _ { fail "The impossible happened!"; }
  }
}

We'd like to give whee a type that says x must be constructed with constructor bar -- in other words, a type that's a refinement on foo -- so that we don't need to write the fail case (because we can statically check that x is in the right subset of foo). We can do this right now with typestate predicates. However, datasort refinements seem to be a common enough case that it might be worth treating them separately. Also, datasort refinements can be implemented without needing to worry about any of the purity issues that have made typestate difficult, because once something of enum type is constructed, its tag is immutable.

The other reason to handle this separately from typestate is that the compiler never uses typestate information to eliminate any checks (since predicates may be unsound), whereas datasort refinements can be checked statically and used to eliminate unnecessary tests in alts.

Thoughts?

graydon commented 12 years ago

This is a good observation and in general I agree with it.
Disjoint unions are implemented using typestate in Hermes, using this mechanism. Check the papers :)
I've proposed permitting tag constructor names as implicit predicates in the past, precisely for this reason.
I don't see a strong distinction between "datasort refinement" and "predicate applied to single immutable value". Except possibly implementation mechanism. Can you elaborate on the difference?

catamorphism commented 12 years ago

In response to the last question: the difference is just that the compiler has special knowledge about datasort refinements and can use it to eliminate checks. In the example I already gave, if bar had one field, then if whee was given a refined type, the compiler could compile away the alt and just extract the field directly -- no tag checking would be needed.

In general it's hard to see how arbitrary predicates applied to single immutable values can be exploited by the compiler to do optimizations.

marijnh commented 12 years ago

now that I'm going through and making all alts exhaustive

Please don't. I agree that alts should be exhaustive by default, but some just aren't, and being able to use a non-exhaustive alt as an assertion is a very valuable thing to have.

I assume you're just adding _ { fail "something..."; } clauses to such non-exhaustive alts. This tends to clutter the code, and produce no more informative error message than the default failure generated for such things by the compiler.

I'd much prefer if you added a way to explicitly specify that an alt is non-exhaustive, which would disable the warning/error for non-exhaustiveness, and make the compiler generate the implicit error path that current alts have.

catamorphism commented 12 years ago

Marijn: What do you think of the proposal above? It accomplishes the same goal as your suggestion ("a way to explicitly specify that an alt is non-exhaustive"), but with less clutter (because annotations are part of a type signature, there will be fewer of them), and with a static guarantee that there will be no non-exhaustive alt failures (because it becomes possible to check statically that all alts are actually exhaustive).

marijnh commented 12 years ago

The problem is that proving that a given value was constructed by a specific (subset of) variant will, in a lot of situations, be equivalent to the halting problem. So you get the same thing that makes typestate painful to use: you'll have to run a check on your value (and probably also first put it in a local variable), to be able to pattern-match on it.

In some programs this might work out, but I think there'll be enough cases where it won't, in which a non-statically-proven-safe form of non-exhaustive alt is still useful.

catamorphism commented 12 years ago

Another problem that I see is that inserting an "alt-unsafe" (or whatever) form doesn't encourage the programmer to document why it's actually safe to assume that the alt will actually match. Allowing the programmer to state invariants by giving more precise types to values than is currently possible encourages more rigorous reasoning, IMO. "alt-unsafe" could encourage people to just throw it in until the compiler is happy.

catamorphism commented 12 years ago

Where I'm coming from with this is that I'm looking at the many, many non-exhaustive alts in rustc and asking: "Why is this alt guaranteed to be exhaustive? What's the invariant?" And since in every single case, there is no documentation of why the code author expected that to be true, it's always unclear whether the non-exhaustivity is an error (that has coincidentally worked so far) or intentional. And I don't think having a different alt-unsafe form does much about that problem. I guess it does document one bit of information about the programmer's intent (that they believe there's an invariant), but not what the invariant is.

marijnh commented 12 years ago

Well, at the risk of sounding somewhat snarky, the invariant is that one of the patterns in the alt will match. I agree this doesn't usually amount to a perfect explanation, but neither will a special type or a predicate being applied to the value.

(Don't get me wrong, I think the feature proposed here would be awesome. I just don't think it's a full solution to the unexhastive alt problem.)

catamorphism commented 12 years ago

Well, you could also say that in a unityped language, the invariant is that there will be no dynamic type errors. However, statically typed languages allows richer invariants to be specified through the type system.

In this case, the reason why the programmer believes that one of the pattern will match presumably has something to do with dataflow and control flow properties, and types are a way to express an approximation to these properties. As you pointed out, the properties that need to be specified may be undecidable in general, but I would say that if it's not the case that most invariants that currently underlie inexhaustive cases in rustc are simple to state, then there's something wrong. The whole principle behind static typing is that while some of the properties we might like to check are undecidable, many useful properties can be captured in a decidable type system.

graydon commented 12 years ago

I have a wondrous solution!

You know how we have if check that combines the flow-control goodness of if with the typestate nutrition of check?

I hereby propose alt check which does the same. Specifically:

The form of the alt has to be a simple N-way branch on the enum ctors. No nested matching or guards.
If control passes the alt, it went through one of the arms, so the enum refinement of the disjunction-of-the-arms is valid in the poststate.
If you didn't say alt check then nonexhaustive matches warn. You can inhibit the warning if you want.
You can write _ { } if you want fallthrough.
If a given enum-refinement already holds, it can be used to prove a non-exhaustive alt legal anyways. IOW if you were to say check x | some(*); alt x { some(y) { ... } } this would compile w/o error.

This actually generalizes nicely to refutable patterns in let bindings too. Add let check:

let check some(y) = x; establishes the enum refinement on x (or fails), then unpacks its value to y.
check x | some(*); let some(y) = x; means the same thing and compiles w/o error.
let some(y) = x on its own gives a refutable-pattern warning (that you can inhibit).

graydon commented 12 years ago

(Granted, this has the slightly non-intuitive behavior that you have to write alt check to get the alt that fails at runtime, where you might think alt check is actually the strict alt that demands exhaustiveness. Tolerable UI idiosyncracy? I do think we want alt to default to exhaustiveness-checking, that's the thing ... which argues for "you have to say a little bit more to get non-exhaustiveness")

marijnh commented 12 years ago

check can be interpreted as 'adds a run-time exhaustiveness check'. I think it's fine.

catamorphism commented 12 years ago

Graydon: this looks OK to me, though I would prefer to call it something different from "alt check" (I'm not sure what). A construct like "alt check" would be necessary anyway for introducing values that have refined types. The only thing non-standard in what you're suggesting is that, basically, the "failure" case would be implicit (the compiler would insert an implicit fail in the case where none of the listed constructors match).

catamorphism commented 12 years ago

I think the name "alt check" is confusing because "if check" is the version of "check" that never aborts the program if the check fails -- whereas an "alt check" can abort the program.

graydon commented 12 years ago

Yeah. It's true, the thing is not exactly analogous to if check (in the sense that in if check the predicate holds in the true branch but we don't implicitly fail in the false branch. Hmm.

nikomatsakis commented 12 years ago

I strongly disagree with @marijnh. I want to know when my alts are not exhaustive (it can be a warning, as long as we have -Werror :). Further, I want a way to prove to the type system that my alt IS in fact exhaustive, because I only need to consider a subset of cases here. We have neither of those things right now. I've found that Scala (which does support this sort of thing, through subtyping) made enums substantially more expressive than O'Caml.

@pcwalton @dherman and I have been kicking around ideas on this for a while. An early proposal was something like:

enum expr {
    enum tuple_like { 
        ex_tup()
        ex_rec()
    }
    enum lvalue {
        ex_path()
        ex_index()
    }
}

However, this was not satisfactory for several reasons. First, rightward drift. Second, it only supported trees. What if I had a case that was both an lvalue and tuple-like?

@pcwalton just proposed an interesting idea where you can "expand" an enum, so you might write:

enum tuple_like {
    tup()
    rec()
}

enum lvalue {
    path()
    index()
}

enum expr < tuple_like, lvalue {
    some_other_case()
}

Note that expr here is a supertype (not subtype) of tuple_like and lvalue. If I wanted a case that was both tuple-like and an lvalue, I could do:

expr both { ... }
expr tuple_like < both { ... }
expr lvalue < both { ... }
expr expr < lvalue, tuple_like { ... }

This also neatly solves the problem of "open enums" that forced Scala to add sealed. Nifty.

This does imply subtyping between enums: but no solution to this problem gets around that. (Refinement types will have subtyping relations based on their predicates)

Anyway, this is a straw man.

graydon commented 12 years ago

If I may ask to calm the conversation a bit; the third adjective in our language summary is "practical", so I don't want to consider scala as our target-to-beat. It's blown the user complexity budget several times over.

We have managed to avoid introducing general subtyping so far and I would like to continue trying to avoid it. Part of the purpose of the typestate system is to let off steam that would otherwise push us towards more-elaborate static reasoning and over-burdening the type system. Inference and exactness can remain out of reach; the motivated user can add check expressions to re-assert membership.

Further consideration of if check vs. alt check apparent "asymmetry" makes me think it's not really there, that alt check still makes good sense as a solution. If we consider alt as "always exhaustive", then leaving off an arm is always a static error, in a way that omitting else from if simply isn't. So it makes sense to consider alt check as "check-or-fail + exhaustive-conditional-branch", in symmetry with if check meaning "check-or-fail + nonexhaustive-conditional-branch". We want if to always implicitly fall through, by default. We want alt to always implicitly not-fall-through. So they're already different. We'd just be preserving that difference between if check and alt check.

catamorphism commented 12 years ago

@graydon (or anyone) -- any comment on separating the implementation of datasort refinements from the existing typestate mechanism (so that they can be used to eliminate checks)?

nikomatsakis commented 12 years ago

Long story short: I would like to solve this problem. If it can be done well through typestate I'm interested. Otherwise, I think extensible enums seem promising. I think both refined types and extensible enums are a form of subtyping and are more-or-less equivalent in that regard.

I don't want to consider scala as our target-to-beat

I agree. I just am not above borrowing a good idea.

We have managed to avoid introducing general subtyping so far and I would like to continue trying to avoid it. Part of the purpose of the typestate system is to let off steam that would otherwise push us towards more-elaborate static reasoning and over-burdening the type system.

I don't understand how having subtyping only in refinements is not subtyping.

Inference and exactness can remain out of reach; the motivated user can add check expressions to re-assert membership.

Why is this any less true here? The inference may not always pick the right thing. Maybe you have to specify the type manually or write a "downcast" (let x = check x as tuple_like). I mean, a check predicate is a downcast too.

All that said, I am sympathetic to wanting to use typestate to solve this problem so that we don't have too many tools in the toolshed. I'd like to see a more specific proposal: I imagine we'd want to have special "per-variant" predicates and then some way to declare a predicate that says "it's one of these 5 variants". These variant predicates would not be opaque to the compiler (unlike other predicates). I haven't read up on the past work you cited; I will soon (bit busy right now). I could imagine it being useful to have the dataflow propagation.

@catamorphism I am not too worried about the soundness issue. If you write a bogus unchecked predicate and then you end up with an exception about "unhandled case" in an alt statement that is supposedly exhaustive, I guess that's your problem.

catamorphism commented 12 years ago

@nikomatsakis What I'm worried about with bogus predicates is the compiler taking advantage of knowledge about impossible cases, and generating code that results in a segfault if the predicates return misleading results.

nikomatsakis commented 12 years ago

@catamorphism Yes, I know. That is something to keep in mind for sure. I just mean that allowing us to prune down exhaustiveness checking will not, in and of itself, lead to seg faults. I can imagine that other optimizations might.

graydon commented 12 years ago

@catamorphism A separate pass is plausible, I guess? Predicates are all supposed to be pure though. Redundant check elimination ought equally well to apply to any predicate, if it's statically holding in the typestate of the check.

@nikomatsakis

Why is this any less true here? The inference may not always pick the right thing. Maybe you have to specify the type manually or write a "downcast" (let x = check x as tuple_like). I mean, a check predicate is a downcast too.

This gets a bit language-theoretical, but I'll try to state my beliefs as concisely as possible. Sorry if they're a bit unorthodox. Mainly I disagree with the broadest interpretation of the phrase "types are theorems". I think it is a bit of an oversimplification of language technology. Types certainly are theorems, but I think they're not just "any old theorems", and I think a "type checker" is not the right place to prove every theorem.

Specifically, the theorems I think "belong" in types are those that describe the correctness of actual machine-level instructions such as arithmetic (are we operating on the right kind of number) and load/store (are we following pointers into dead memory). Correctness at that level is usually plausible to build a type system for with very nice properties:

Decidable, fast
Gives comprehensible error messages
Mostly-inferable from terms, stays out of the way

Lots of type systems don't have those properties, but I think those languages are abusing their type systems and users often pay the price: by mixing "nice" and also very necessary machine-representation questions ("can I navigate the data structure") with higher-level program-correctness issues that have been "encoded" into types. Everyone has some favourite haskell, ML, scala or C++ error messages they can mention from such delightful "encoding" of deeper semantic checks in the type system.

I think the cost in program comprehensibility and compile time by mixing those issues into the type system is not the right balance, and it's better to add secondary layers of static analysis beyond the types. Even though in a strict Curry-Howard sense the secondary layers are just "more types", they're types that may be expressed through a completely different language-level UI (unordered predicate lists), inference and checking rules (flow-sensivity), set of manual overrides (runtime predicate-checks vs. reinterpret_cast) and error-reporting system.

IMO those differences matter.

dherman commented 12 years ago

We should try to avoid debates getting too abstract in bug threads. Let's discuss feature concerns and proposals in our group meetings.

marijnh commented 12 years ago

I strongly disagree with @marijnh. I want to know when my alts are not exhaustive (it can be a warning, as long as we have -Werror :). Further, I want a way to prove to the type system that my alt IS in fact exhaustive,

Neither of these conflict with my proposal, so maybe you just didn't read it very well.

nikomatsakis commented 12 years ago

Neither of these conflict with my proposal, so maybe you just didn't read it very well.

I read your proposal. I'm not necessarily opposed to having some sort of way to say that an alt is intentionally non-exhaustive. However, I think if we do our job right, it should be very rarely used. Therefore, I am not sure that I want to make it easy to write non-exhaustive alts, I'd rather encourage people to use the type system and prove that those cases are impossible.

p.s., @graydon, I somehow missed your post regarding alt check before. I guess you typed that while I was typing my post and I didn't refresh the github window. Interesting idea. I think I'd prefer it if alt and alt check behaved the same. That is, both have exhaustiveness checking, and both fail if there is no matching arm. That makes it more analogous to if check: it has the same behavior, it just says that you want the compiler to track the result. If it is so painful to write _ { fail } or _ {}, we could consider marijn's proposal.

nikomatsakis commented 12 years ago

Also, I don't see why we have to limit the forms of patterns that are allowed in an alt check, though it might make sense if the pattern guards had to be pure functions (or else allow if check in a pattern guard), so that they too could be added to the typestate. Basically, if we allowed arbitrary patterns, we could still add some predicate regarding the "top-level" pattern. That is:

alt check foo {
    some(some(bar)) {
        // adds a predicate foo | some(*)
    }
    some(none) {
        // also adds a predicate foo | some(*)
    }
    none { ... }
}

That said, I think @dherman is probably right that we ought to continue this in another forum---which I am not respecting, sorry about that---and anyhow I'd like to let the various ideas sit for a bit. So I'll stop writing now. I am mostly just turning over this idea about alt check in my head and wanted to get some thoughts down before I run out the door.

Anyway, I think the critical question is: "Besides datasort refinements, what else do we plan to use typestate for?" So far it hasn't seen much (productive) use, unfortunately. If we do find that we use it for a lot, it probably makes sense to use it for datasorts as well. If datasorts are all we use it for, I am not sure it's the simplest way to get these benefits.

bblum commented 12 years ago

I see that I am a little late to the party.

I won't comment on the various ways of dealing with nonexhaustiveness, but I think it is very important to have arbitrary refinements for subtyping enums, so the type system could let us match exhaustively when we want to match exhaustively on a subset of things.

@nikomatsakis wrote:

expr both { ... }
expr tuple_like < both { ... }
expr lvalue < both { ... }
expr expr < lvalue, tuple_like { ... }

I propose the following syntax, which allows you to write one definition per useful subset instead of one per intersection-of-subsets (2^n?). (said perhaps more intuitively: one per circle in a venn diagram, instead of one per area in a venn diagram.)

enum expr {
        ex_tup(...),
        ex_rec(...),
        ex_path(...),
        ex_index(...),
        ex_tuple_lvalue(...),
}
enum tuple_like : expr { ex_tup, ex_rec, ex_tuple_lvalue }
enum lvalue : expr { ex_path, ex_index, ex_tuple_lvalue }

All you have to write in the refinements are the arms - no need to repeat the stuff they carry (since they have to be the same). If @pcwalton adds his case-class-y idea of having fields shared in all arms, you wouldn't need to write duplicates of that in the refinements either for the same reason.

The : syntax is a nod to C++ class inheritance, and hints strongly at the subtyping (hopefully in a way that will be intuitive even to people unfamiliar with disjunctive types).

catamorphism commented 12 years ago

I note that my "once something of enum type is constructed, its tag is immutable" comment in the OP is now obsolete -- if we implement this, it's going to have to deal with mutable algebraic data somehow (requiring refinement-typed things to be immutable is a totally reasonable way of dealing with it).

bblum commented 12 years ago

For those who aren't convinced that the need for arbitrary-subset refinement comes up in practice, my motivation comes from a bug I had writing a C-like compiler in undergrad. Liveness analysis was done on an abstract assembly language, which looked like this (originally haskell, translated into rust).

enum Instruction {
    BinExp(...),
    Load(...),
    Store(...),
    Jump(...),
    Call(...),
    Return(...)
}   

fn usedVariables(inst: Instruction) -> ~[Variable] {
    alt inst {
        BinExp(_, t1, t2)   { ~[t1, t2] }
        Store(_, t)         { ~[t] }
        ...
        ... more similar cases follow ...
        ... 
        // All other instructions don't use any variables!
        _                   { ~[] }
    }   
}   

fn successorLines(inst: Instruction, linenr: uint) -> ~[uint] {
    alt inst {
        Jump(lbl)   { ~[lookupLabel(lbl)] }
        Return(_)   { ~[] }
        // All other instructions don't alter flow control.
        _           { ~[linenr + 1] }
    }   
}

It wasn't until late in the project that I added JumpCondition as another arm in the language. Unfortunately the liveness code was in another file, and I missed adding cases for it in these functions because the catch-all prevented there from being a compiler error.

What I really wanted was to get rid of the catch-all and replace it with arbitrary refinements, like this:

enum FlowControl : Instruction {
    Jump, Return, JumpCondition
}
enum VariableUse : Instruction {
    BinExp, Load, Store, Call, Return, JumpCondition
}

Although I will also note that I'd need some equivalent of instanceof to know whether to use the catch-all logic or not to begin with. On one hand, maybe i should've used an oo-like trait pattern. On another hand, maybe I could've written something like:

fn isFlowControl(instr: Instruction) -> option<FlowControl> {
    alt instr {
        x@Jump(*) | x@Return(*) | x@JumpCondition(*) { some(x) }
        _ { none }
    }
}

catamorphism commented 12 years ago

It already comes up all the time in rustc, IMO -- every single alt check expression is a case where a refinement should really be used instead.

nikomatsakis commented 12 years ago

@catamorphism I am not sure what you meant by this:

I note that my "once something of enum type is constructed, its tag is immutable" comment in the OP is now obsolete -- if we implement this, it's going to have to deal with mutable algebraic data somehow (requiring refinement-typed things to be immutable is a totally reasonable way of dealing with it).

I guess I am not sure what you mean by "this". If we use the case class-like approach, in particular, this kind of correctness seems to just fall out of the standard soundness of the type system.

catamorphism commented 12 years ago

@nikomatsakis By "this" I meant any sort of datasort refinements, but I guess I'm still not familiar with the case class proposal (has it been written up somewhere?), so I wasn't thinking of that approach in particular.

graydon commented 12 years ago

I'm more curious about how you would actually use arbitrary-subset forms, not declare them. As in: what would the type checker infer for the JumpCondition constructor when used as an expression, a pattern, etc.? I suppose you'd be doing something like "intersect all possible refinements for a given type variable and complain if there's more than one"; but how does this scale across multiple crates if JumpCondition can randomly be included in new ones? Does it become a new thing the coherence check has to restrict? I.e. do we need to restrict it to only permit refining an enum inside the declaring crate?

Alternatively, if it's really just about missed pattern cases, you could have possibly gotten by with other mechanisms:

Macros:

#macro([ #FlowControl(...), Jump(...) | Return(...) | JumpCondition(...) ])

alt x {
   ...
   #FlowControl(*) { ... }
}

Pattern guards:

fn is_flowControl(instr: instruction) -> bool {
    alt instr {
        Jump(*) | Return(*) | JumpCondition(*) { true }
        _ { false }
    }
}

alt x {
    ...
    v if is_flowControl(v) { ... }
    _ { ... }

The former gets you static exhaustiveness checking, the latter gets you turing-complete refinements.

As far as the suggestion that all our current alt check cases are subsumed by this: that is simply not true. The point of alt check was to indicate non-exhaustive alt in a way that involved less clutter than writing catchalls that do nothing more than fail, so that we could upgrade alt to be exhaustive by default. The non-exhaustiveness is not (from what I can tell) usually to do with the type not being specific enough, it's due to the expression's value being data-dependent, so not something we can pin-down fully into a type.

I agree that, in general, the reason 'effectively-non-exhaustive" alts exist in the first place is that the type language can't encode enough information to "push the check back further", but that line has to be drawn somewhere -- types are not turing complete and don't have access to the runtime environment for such minor details as "all the data the program's behavior depends on" -- so I'd suggest looking in detail at each of the alt check cases and trying to figure out how much "further back" a refinement would push the check. Hashtable contents? Results of a unification? Next character read from a stream? About the only ones I can see are the AST subset cases -- that is, the encodings for "this has to be an lval" and such -- and it's not clear to me that the partial support for refinements @pcwalton proposed on IRC, that I asked @bblum to write up and compare with "arbitrary refinements" here, isn't enough.

Since it didn't get written down above, here's what @pcwalton suggested on IRC:

Each enum constructor is, itself, a type. So a bunch of 1-member refinements all exist.
Each enum has a struct of common fields that are accessible in all values of the enum type, without alt-ing.

This combination lets you encode "tree-shaped" refinements, so to speak, by factoring your enum into a set of distinct sub-enums, sharing fields whenever possible. It means you have to specify 1 ctor for each level of the "refinement tree" when constructing or pattern-matching, but you can use each of those levels as a type as well, if you want to "push a check back" to outside a function boundary and only handle a subset of cases. So it's sort of a middle ground. The user has to do more work and can express less, but the semantics are less complicated.

I'm curious about an honest comparison between this and arbitrary-subsets. What's easy, what's hard. Try modeling cases that matter to you as each, see how they fare.

nikomatsakis commented 12 years ago

I think @graydon summarized the situation pretty well. I definitely agree that alt check (that is, non-exhaustive alts) will not be ruled out, though there will likely be fewer of them.

For what it's worth, in my PhD work I built a compiler in Scala which does support arbitrary subset refinements. They do this by using a case class system (in fact, they invented the term) combined with multiple inheritance (traits). When we were initially discussing this, I thought that having support for arbitrary subsets was crucial: but when I looked back at my code, it turned out that every place where I had made use of arbitrary subsets had ultimately been converted to a tree shape. This is not to claim that arbitrary subsets have no use, just that there are many more cases where things do form a hierarchy. Of course, this is just one data point, and it was "gradware" to boot. But perhaps if we want to get a better idea of what the expressive power and limitations are in real-world code, we should look to Scala projects and see what they do with it.

catamorphism commented 12 years ago

@graydon Neither the macro nor the pattern-guard approach actually allows the compiler to omit checks safely, so far as I can see. (That is, even if you know you're matching on a FlowControl thing, either you or the compiler will have to include the _ case anyhow.) So I think we could do better.

I'd like to see a code example for the "partial support for refinements" proposal before I pass judgment on it.

graydon commented 12 years ago

@catamorphism the macro one would, assuming you mentioned enough subsets to totally cover the cases. And if you didn't, the compiler would tell you. That's why I said it gives static exhaustiveness checking. Of course the if-guarded one doesn't.

catamorphism commented 12 years ago

@graydon Re: the macro one, I meant where you know #FlowControl describes something, so you want to leave out the other variants in the enum.

(I'll confess that I'm not in a huge rush to implement anything that's been discussed on this page, so it doesn't seem like a pressing question. Unless someone else wants to do it.)

catamorphism commented 11 years ago

I would still like to implement something along these lines, but not until after 1.0.

alexcrichton commented 11 years ago

I was taking a look at #2896, and then I ran across this bug which looked really similar. Most of this syntax/discussion far predates when I came into rust, but it looks to be mostly outdated now. It seems as if this issue boils down to what #2896 wants to do, and if so one of them should probably be closed.

catamorphism commented 11 years ago

No, refinement types are different from pattern checking. The first one requires the programmer to make their intentions clear by writing their types differently. The second one is more about using fancy static analysis to let the programmer write something that looks like it might fail with a non-exhaustive match error, but potentially doesn't. Both are useful, but both are wishlist items that aren't required for 1.0.

In other words, refinement types are about giving programmers the tools to express invariants and prove them to the compiler; whereas tools like Catch are about saying "I know it's safe and I want the compiler to do the work of proving that."

emberian commented 11 years ago

@catamorphism is this still relevant even though typestate is gone?

bblum commented 11 years ago

This is a different feature from typestate.

emberian commented 11 years ago

Ah, reading the full thread this becomes obvious; I had glanced through some of the earlier comments that heavily involved typestate.

nikomatsakis commented 11 years ago

Here is my latest proposal for this, for reference:

http://smallcultfollowing.com/babysteps/blog/2012/08/24/datasort-refinements/

Also, I've been thinking that this would be fairly straightforward to implement and may be worth the trouble. There are a few thorny special cases that would go away if we could give None::<~int> a different type from Some(~3). For example, it'd be straightforward to make things like [None::<~int>, ..2] compile without error, whereas today you cannot handle this situation.

One concern I have is that while the design I laid out above is quite straightforward, it lacks the ability to parameterize types over the refinement. In other words, imagine I wanted to make a newtype'd option struct MyOption<T>(Option<T>). I'd lose the ability to distinguish MyOption(None) from MyOption(Some(x)). However, I guess you can work around this by doing something like:

struct MyOption_<T>(T)
type MyOption<T> = MyOption_<Option<T>> // usual case
type MyNone<T> = MyOption_<Option<T> {None}>
type MySone<T> = MyOption_<Option<T> {Some}>

Inelegant but perhaps adequate in practice. Anyway, I wanted to get this (rather old at this point) blog post linked to the issue.

pnkfelix commented 10 years ago

A change like this needs to go through the (relatively new) RFC process:

https://github.com/rust-lang/rfcs/

(And it is actually possible that one of the existing RFC's may cover the use case(s) documented here.)

I am closing this ticket; if anyone wants to revive the request for this feature, please write up a proper RFC for it.

ftxqxd commented 9 years ago

I think this issue should be moved to the RFCs repo (as an issue). cc @nick29581

steveklabnik commented 9 years ago

@P1start it's already been closed. As @pnkfelix says:

if anyone wants to revive the request for this feature, please write up a proper RFC for it.

ftxqxd commented 9 years ago

@steveklabnik Yes, but it was closed before we started having issues in the RFCs repo. I am under the impression that issues that should have RFCs written for them (such as this one) should be open issues in the RFCs repo, not closed issues in rust-lang/rust.

steveklabnik commented 9 years ago

@P1start so, my thought process was basically "It's not like we're gonna dig through all the old issues and move any one that's relevant over, let's wait until someone says something," but that ignores that, uh, here, you're basically saying something. So you're right. Sorry about that!

steveklabnik commented 9 years ago

I'm pulling a massive triage effort to get us ready for 1.0. As part of this, I'm moving stuff that's wishlist-like to the RFCs repo, as that's where major new things should get discussed/prioritized.

This issue has been moved to the RFCs repo: rust-lang/rfcs#754

rust-lang / rust

Add datasort refinements to Rust #1679