JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

RFD(esign): Representations of proofs and proof obligations #49273

Closed by Keno 7 months ago

Keno commented 1 year ago

The problem statement

We're starting to see a bit of a proliferation of uses of @assume_effects. This isn't really a problem in general, but the semantics here are subtle and getting them wrong is undefined behavior. As such, it would behoove us to think about tooling to verify these assumptions wherever possible. Of course, you might say, "well, if the compiler could verify the effects, we wouldn't need to assume them". That is the correct position of course, but not all @assume_effects are due to limitations that we ever expect to lift. For example, the FMA consistency annotation:

https://github.com/JuliaLang/julia/blob/1bf65b99078777325b5997e270d729f3cf7cd6f3/base/math.jl#L49-L55

requires a fairly complete and symbolic model of IEEE floating point semantics that is beyond the scope of what I'd expect the compiler to ever have in the default compilation pipeline - it is more in the domain of proof assistants and model checkers. Ideally, we'd be able to have these kinds of tools as packages that can then be used to either find or check proofs of our assumptions (presumably on CI, since these kinds of checks can be expensive).

Prior art / the research question

I've taken a look at something like https://github.com/xldenis/creusot, which (as far as I understand) compiles Rust code to Why3's mlcfg representation (which can then generate proof obligations in a whole host of other tooling). It seems relatively straightforward to do the same thing for Julia by taking advantage of the usual IR-agnostic compilation trick that we do, but I think that's only half of the story, addressing the mechanism of how something like this might work, but not really the semantics, so that's kind of where this issue comes in.

I would like to pose the question to the larger julia/PL community: What - if anything - should the semantics of proofs and proof obligations be in the core Julia system? I have some initial thoughts, which I will post below, but I'm assuming there are a lot of standard approaches and tricks here that I'm just not familiar with, so I'd like to leave this fairly broad for suggestions.

My own thoughts

I think it would be prudent to have some sort of primitive representation of proofs within the core system. This doesn't mean that this system would actually be used for proof checking, but it seems necessary to at least have some object that represents true statements. That way, if you can construct such an object (or prove that the function that constructs it is :total and thus would construct the object if it were run), you can assert the truth of something. For example, could we have something like:

abstract type ProofObligation end

struct NoThrowObligation <: ProofObligation
    mi::MethodInstance
    world::UInt
end

struct Proof
    what::ProofObligation
    # Proof construction using Julia compiler builtin
    function NoThrowProof(obl::NoThrowObligation)
        Base.infer_effects(obl.mi, obl.world).nothrow || error()
        new(obl)
    end
    # Proof construction from composition
    function NoThrowImplies(obl::NoThrowObligation)
        # Compiler primitive that does `infer_effects`, but also generates a
        # `NoThrowObligation` for every call it couldn't prove `:nothrow` for
        obls = Base.infer_effect_obligations(obl.mi, obl.world).nothrow
        obl, proofs -> begin
            all(((p, o),) -> p.what === o, zip(proofs, obls)) || error()
            new(obl)
        end
    end
    # Other constructors monkey-patched by external packages, e.g. the why3 construction.
end

These objects could then be exchanged for checking/assumptions, etc.
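To make the idea of "exchanging" proof objects concrete: one way to read the proposal is to treat proofs as unforgeable capabilities, where the only way to obtain one is through a constructor that performs (or delegates) the verification, and consumers demand the object rather than a boolean flag. Here is a toy, self-contained sketch of just that shape - all names (`ProofToken`, `NoThrowOb`, `assume_nothrow`) are hypothetical and not actual Julia APIs:

```julia
# Toy model of "proof objects as capabilities" (all names hypothetical).
abstract type Obligation end

struct NoThrowOb <: Obligation
    name::Symbol          # stand-in for a real MethodInstance
end

struct ProofToken
    what::Obligation
    # Restricted constructor: only "verified" obligations yield a token.
    # In the real design, `verified` would come from the compiler or an
    # external prover rather than being passed in by the caller.
    function ProofToken(ob::NoThrowOb, verified::Bool)
        verified || error("cannot construct proof of unverified obligation")
        new(ob)
    end
end

# A consumer that demands evidence, not a flag:
assume_nothrow(p::ProofToken) =
    "compiler may now rely on :nothrow for $(p.what.name)"

tok = ProofToken(NoThrowOb(:two_mul), true)
println(assume_nothrow(tok))
```

The point of the sketch is only the shape: because construction is restricted, possession of the object *is* the assertion, which is the property the real design would need to preserve.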

This design is obviously bad, but that's because I don't really know what the good designs in this world are. I assume there's a bunch of standard tricks to lift primitive notions of equality and composition into the proof domain, so I'd love to get some input on the core principles of doing this kind of thing, keeping in mind that this would be an entirely optional part of the language.

Keno commented 1 year ago

I've gotten some requests to expand upon what I'm asking for people who may not be versed in the latest Julia features, so let me try to provide some background. If there's anything else people would like me to elaborate on, please let me know.

  1. What is @assume_effects?

The julia compiler needs, at various stages, to prove certain properties of the functions it is analyzing. For example, it can be helpful to know that a function does not throw, is guaranteed to terminate, and has no other observable side effects, because that means that if the result of the function is unused, we may immediately delete it without further analysis. This kind of thing is similarly achieved with attributes in LLVM, and it seems similar to the effect system in Verse, although I only saw a passing reference to that recently and have not dug in in detail. To achieve this, we interprocedurally converge the effects for a function along with regular type inference.

The full semantics of these effects are listed in the documentation: https://docs.julialang.org/en/v1/base/base/#Base.@assume_effects

They can be queried with infer_effects:

julia> Base.infer_effects(+, Tuple{Int, Int})
(+c,+e,+n,+t,+s,+m)

julia> Base.infer_effects(println, Tuple{String})
(!c,!e,!n,!t,!s,!m)′

(As you might imagine, this means that + is the most pure a function can be, println the least - again, see docs for specific semantics).
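For readers who want to poke at this themselves, here is a small sketch using the unexported effect-query helpers (available since Julia 1.8; `Base.infer_effects` and the `Core.Compiler.is_*` predicates are compiler internals and may change between versions):

```julia
# Query inferred effects and override them (compiler internals, Julia 1.8+).
pure_add(x, y) = x + y
clock() = time()                # reads the system clock

eff_add   = Base.infer_effects(pure_add, Tuple{Int,Int})
eff_clock = Base.infer_effects(clock, Tuple{})

@assert Core.Compiler.is_consistent(eff_add)     # +c: depends only on arguments
@assert !Core.Compiler.is_consistent(eff_clock)  # !c: depends on the environment

# The escape hatch: assert effects the compiler cannot infer.
# Getting this wrong is undefined behavior!
Base.@assume_effects :total forced() = 42
@assert Core.Compiler.is_foldable(Base.infer_effects(forced, Tuple{}))
```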

However, while the compiler is reasonably sophisticated at inferring these (based on regular abstract interpretation with relatively sophisticated transfer functions for our primitives), it is of course not perfect. As a result, there is @assume_effects, which allows overriding the compiler's judgement of the effects of a particular function. Consider, for example, the function I had above:

@assume_effects :consistent @inline function two_mul(x::Float64, y::Float64)
    if Core.Intrinsics.have_fma(Float64)
        xy = x*y
        return xy, fma(x, y, -xy)
    end
    return Base.twomul(x, y)
end

The :consistent effect requires, among other things, that the result be independent of the execution environment. Here, have_fma reads the processor capabilities, which is an execution environment query. Thus, this function is consistent if and only if ∀x,y (xy, fma(x, y, -xy)) === Base.twomul(x,y) (where === is egality, the strongest of our equality predicates, which in this case requires bitwise equality of the two floating point numbers). This is not something our compiler can determine itself, because it does not have a sufficiently strong formal floating point model. However, this is certainly within the capabilities of other tools (e.g. I'm sure Alive2 could prove it if we gave it the LLVM version of this), so it would be nice to have a mechanism/semantics for the compiler to keep track of its assumptions and offload them to external tools if possible (probably as a CI step).
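As a sanity check - emphatically not a proof of the ∀x,y statement, and closing that gap is exactly what external tools would be for - the error-free transformation underlying the FMA path can be spot-checked numerically: when fma is correctly rounded and no overflow/underflow occurs, fma(x, y, -xy) is exactly the rounding error of xy = x*y, so the pair sums to the true product:

```julia
# Spot-check (not a proof!) of the identity behind two_mul's FMA branch:
# hi = fl(x*y), err = fma(x, y, -hi) is the exact rounding error, so
# hi + err equals the true product (verified against BigFloat, whose
# default 256-bit precision exactly holds the <=106-bit product).
for _ in 1:1_000
    x, y = randn(), randn()
    hi  = x * y
    err = fma(x, y, -hi)
    @assert big(hi) + big(err) == big(x) * big(y)
end
```

A randomized check like this can refute a wrong annotation but never discharge it; a model checker or SMT-backed tool working over a formal floating point theory could.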

Getting the effects wrong (e.g. annotating something as :consistent when it actually isn't) is undefined behavior, and since we exploit these effects aggressively, it can cause quite a bit of havoc.

  2. Are you looking for a formalization of Julia?

Not really. It is certainly true that in general, you'd need to have some sort of formalization of julia semantics in order to prove e.g. the absence of the possibility of method errors. However, the Julia compiler already does a lot of this, and the goal of this effort is not to verify the julia compiler. The goal is to give users additional tooling to verify their own code so they can use things like @assume_effects more safely (because the semantics are subtle and very easy to get wrong, even for experienced developers). For example, the two_mul case above is entirely monomorphized by the julia compiler, and the (remaining) required proof is entirely in the floating point domain. Of course, if we build some interesting infrastructure here, formalizing more things about Julia could be an interesting future direction.

  3. Are you trying to add a static type system to julia?

Depends on what you mean, but not really. In particular, I'm not looking for syntax additions, etc. that might have semantic impact on what julia code means. I'm looking for ways to make it easier to apply formal methods to existing julia code. I think the lines get a bit blurry when you get to really sophisticated type systems like you find in theorem provers, where types are mostly written implicitly by automation, which is probably similar to what would happen here, but the key point is that I'm not proposing any changes to language semantics.

  4. Are you trying to build an interactive theorem prover?

Again, not really. I do think that in order to do this well, you need to have some basic primitives in the language itself that can then get translated out to the actual proving environments. That probably does include a basic proof checker, so you can do basic compositional stuff mechanically without having to go through the overhead of going out into another environment. However, turning Julia into a theorem prover is not my goal - I want to leverage what already exists. If somebody wanted to take whatever comes out of this as a base and build a full theorem proving environment that is more integrated with Julia, I'd welcome that, but that's not my goal.

Seelengrab commented 1 year ago

Love to see this! In a sense, any function that the compiler can statically infer the type for is already a proof in and of itself, our compiler "just" isn't smart enough to gather the invariants as part of the inferred type. For example, this function:

f(x::Int) = x - (x % 2)

Only infers as Int, when it could infer as (pseudo) EvenInt (this could be represented by metadata as well), with postcondition "The LSB is 0". This program is then a proof of the function Int -> EvenInt, since it typechecks for that signature. Conversely, this function:

function myfunc(x::Int)
    iseven(x) || throw(ArgumentError("x not even!"))
    x*2
end

Only requires Int and does a runtime check for its precondition that x should be an even number - in order for the function to terminate, the actual type of the input argument could be inferred as EvenInt (like a kind of reverse-inference - what kinds of properties must an incoming object have such that the function is guaranteed to terminate in a regular manner?). The function itself is then a proof with postcondition For all x from EvenInt: myfunc(x)::EvenInt (you could also write this as an entailment EvenInt => EvenInt, meaning IFF the input is an EvenInt instead of just an Int, you're guaranteed to get out another EvenInt. If it isn't, anything might happen).
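The claimed pre-/postconditions in the two examples above are easy to spot-check at runtime (again, these are checks, not proofs - the point of the proposal is to turn such properties into mechanically discharged obligations):

```julia
# Runtime checks of the pseudo-refinements discussed above.
f(x::Int) = x - (x % 2)
@assert all(iseven(f(x)) for x in -100:100)   # postcondition: result is "EvenInt"

function myfunc(x::Int)
    iseven(x) || throw(ArgumentError("x not even!"))
    x * 2
end
@assert myfunc(4) == 8 && iseven(myfunc(4))   # EvenInt => EvenInt
@assert try myfunc(3); false catch e; e isa ArgumentError end  # precondition enforced
```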

So - are you primarily asking about datastructure/algorithmic design and representing that on the frontend, or asking about prior art on how to do proof extraction (i.e., extracting pre-/postconditions and/or invariants) and checking them against a provided spec?

There's also the question of what kinds of proofs/specs ought to be representable. I think for a lot of things, abstract interpretation itself can help here, not by using it to find effects but to, for example, aggregate memory allocations to give a lower bound on how much memory is needed. This gets messy quickly, since the same kinds of things that make regular type inference hard are also the kinds of things that blow up any kind of bounded value inference, like the question about getting a lower bound on memory. Post-/preconditions that involve multiple execution environments/tasks are going to be tricky - this is more or less equivalent to safety and liveness.
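As a heavily simplified illustration of that kind of bounded value inference (a toy, not how the Julia compiler represents anything): suppose a straight-line program has been abstracted to a list of allocation events, some of which are only conditionally reached. Events that might not happen contribute nothing to the lower bound but their full size to the upper bound:

```julia
# Toy "allocation bound" analysis over an abstracted event list.
# (:alloc, n)       -> always allocates n bytes
# (:maybe_alloc, n) -> allocates n bytes on some paths only
events = [(:alloc, 64), (:maybe_alloc, 128), (:alloc, 32)]

lower = sum(size for (op, size) in events if op === :alloc)  # guaranteed
upper = sum(last, events)                                    # worst case

@assert lower == 96
@assert upper == 224
```

Even this toy shows where it gets messy: anything that makes the set of reachable events imprecise (dynamic dispatch, loops with data-dependent trip counts) widens the gap between the two bounds.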

You mention "proofs by composition", which is an active research question (the main issue is unintentionally violating internal invariants of other specs when composing). Hillel Wayne has lots of interesting publications about that and using TLA+ to model programming environments. Currently though, most formal modeling I know of takes place as an additional, extra thing next to the program you're modeling.

Finally - as far as I'm aware, only the model checking part is really somewhat well researched. Extracting a model of any kind from an existing language is untrodden ground, to say the least - the closest/most generic thing I can come up with off the top of my head is Hoare logic, which deals with pre-/postconditions (and seems to be what the interface in creusot is based off of), but most research about that and similar logics that I know of is more concerned with the model checking part than with the "how did we end up with this model in the first place" part. We already do kind of model this, with input argument types being preconditions and return types (and some effects) being postconditions, as alluded to above.

Keno commented 1 year ago

> So - are you primarily asking about datastructure/algorithmic design and representing that on the frontend, or asking about prior art on how to do proof extraction (i.e., extracting pre-/postconditions and/or invariants) and checking them against a provided spec?

Mostly the former, but I'm not sure "frontend" is the correct way to say it. As you said, the compiler already proves various things by making typing judgements over the extended inference+effects lattice, but there isn't really any first-class representation of this that could serve as the central interchange between the compiler and external tooling. The question is basically what that interchange point looks like. Is it something like what I wrote above, or one could imagine just directly saying something like:

struct Judgement
    mi::MethodInstance
    world::UInt
    typ # some type in the extended type lattice
end

taking advantage of the recent support we have for pluggable type systems in the abstract interpreter and, as in my proposal above, restricting construction of these things to only allow the construction of true judgements. I very much think of this as sort of a "bootstrapping" question, i.e. how do you turn the primitive notions of equality and code that we have into something that you can start to reason over? I have a very good sense of how to do that sort of thing for plain execution, but since I've never worked on proof system implementations, I just don't have any good design intuition here.

timholy commented 1 year ago

If the point is to check whether packages are using @assume_effects properly, can you explain in greater detail why this can't be in a package?

MasonProtter commented 1 year ago

@timholy I think the whole point here is to come up with an interface precisely so that this can be done in a package without being overly entangled with the nitty gritty compiler internals.

Seelengrab commented 1 year ago

I just remembered that this issue exists, so if you're still looking into this @Keno, the Dafny language is probably the closest to a complete description of the sorts of things & challenges that arise in this domain. They also have a, to my eyes at least, good interface.

As for whether this can be done in a package or not - it's possible, but it should (after A LOT of maturing) be transitioned to Base.

Keno commented 1 year ago

Yes, I'm quite familiar with Dafny and agree that it's a good design point for the interface side of things.

Keno commented 7 months ago

This issue served its purpose as having a place to point people to when asking questions. Closing in favor of more focused discussion on concrete proposals in the future.