rust-lang / rust


Tracking issue for const_eval_limit: configurable, per-item instruction limit for CTFE #67217

Closed. ecstatic-morse closed this issue 1 year ago.

ecstatic-morse commented 4 years ago

Currently, the const evaluator will continue forever if the program it is executing never terminates. There's a mechanism for detecting trivial infinite loops during const-eval, but this can't catch every non-terminating program.

We should have a configurable instruction limit for CTFE that's set to a relatively low initial value. Furthermore, disabling the instruction limit should also disable the infinite loop detector, since the user has indicated that they expect their program to execute for a long time.

The existing #![recursion_limit] and #![type_length_limit] attributes are somewhat related, but it would be ideal if the instruction limit could be applied per-item as well.
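For reference, the existing limits are crate-level attributes; the per-item form sketched below is purely hypothetical at this point (no attribute name or syntax had been chosen, and the values are illustrative):

```rust
// Existing crate-level limit attributes (values illustrative):
#![recursion_limit = "256"]
#![type_length_limit = "1048576"]

// Hypothetical per-item form this issue asks for (name and syntax undecided):
// #[const_eval_instruction_limit = "1000000"]
// static TABLE: [u8; 256] = { /* long-running const computation */ [0; 256] };

fn main() {}
```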

cc @rust-lang/wg-const-eval

As of 2022-01-31, the default const_eval_limit is 1_000_000: https://github.com/rust-lang/rust/blob/af46699f8104ba5257d0da56d3d817bf8fc751cf/compiler/rustc_middle/src/middle/limits.rs#L40

TheSamsa commented 4 years ago

I'd like to give it a go, although I will have to dig into this matter first to better understand it. I'd be grateful for any hints on where to start.

Wouldn't it help to make #![recursion_limit] applicable at the item level? Also, the instruction limit you speak of is for CTFE, not the compiled code, isn't it? I want to clear that up before it gets mixed up.

ecstatic-morse commented 4 years ago

Upon reflection, this is more like E-medium. I still think it's a good first issue because it doesn't require too much familiarity with the compiler.

https://github.com/rust-lang/rust/blob/a2333023594826dc5078ce7129fafb85471b7857/src/librustc_mir/const_eval.rs#L32

That is where the instruction threshold for the infinite-loop detector is hardcoded. Search for where this constant is used, and it will take you to the spot where the count is checked. Note that terminators, not statements, are being counted.
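In rough pseudocode, the mechanism being described is a counter bumped once per executed terminator and compared against that hardcoded threshold; the names below are illustrative, not the actual rustc internals:

```rust
/// Illustrative sketch only; the real code lives in the const-eval
/// interpreter and uses different names.
const STEPS_UNTIL_DETECTOR_ENABLED: usize = 1_000_000;

#[derive(Default)]
struct StepCounter {
    terminators_executed: usize,
}

impl StepCounter {
    /// Called once per executed MIR terminator (i.e. once per basic block),
    /// not once per statement.
    fn before_terminator(&mut self) -> bool {
        self.terminators_executed += 1;
        // Once the threshold is crossed, the infinite-loop detector kicks in
        // (or, under the proposal in this issue, evaluation would simply stop).
        self.terminators_executed > STEPS_UNTIL_DETECTOR_ENABLED
    }
}

fn main() {
    let mut counter = StepCounter::default();
    assert!(!counter.before_terminator());
}
```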

recursion_limit is pretty deeply woven into the compiler. It would not make sense per-item. It is not a recursion limit for CTFE.

Centril commented 4 years ago

Before implementing anything, I think there needs to be a discussion about what the language design should be here, at least if this is going to be user-facing.

TheSamsa commented 4 years ago

Of course!

RalfJung commented 4 years ago

We should have a configurable instruction limit for CTFE that's set to a relatively low initial value. Furthermore, disabling the instruction limit should also disable the infinite loop detector, since the user has indicated that they expect their program to execute for a long time.

If we have an instruction limit, why do we need a loop detector at all?

FWIW, I'd not count instructions but blocks -- each block is of finite length and this way we are somewhat less sensitive to long swaths of straight-line code.

ecstatic-morse commented 4 years ago

If we have an instruction limit, why do we need a loop detector at all?

We wouldn't. In the medium term, I would like to remove it. In the short-term, I think #67216 should be blocked on providing some recourse to the user when their code runs long enough to trigger the detector. It seems like an attribute in the style of recursion_limit (as opposed to a -Z flag) is the right solution, regardless of the future of the loop detector. The per-item part doesn't have to be part of the MVP.

RalfJung commented 4 years ago

We wouldn't. In the medium term, I would like to remove it. In the short-term, I think #67216 should be blocked on providing some recourse to the user when their code runs long enough to trigger the detector.

We have such a recourse right now, in the form of the current loop detector, right?

If you think that's not good enough, implementing a limit on the number of basic blocks should be fairly trivial -- and then removing the loop detector also isn't very hard. So I am not sure what big piece of work you are planning here that mandates a short-term strategy with further medium-term adjustments?

ecstatic-morse commented 4 years ago

I mean that, if I have a very expensive const initializer that I know will terminate, I want a way to run it to completion without traversing the heap every 2^8 basic blocks.

Whatever system of attributes we end up with will require an actual design, even if it's not hard (some might even say E-easy). Meanwhile, beta branches in about a week (I think), and I'd like to avoid another cfg(bootstrap) mess.

RalfJung commented 4 years ago

But it seems you want to land an instruction/block counter before the beta branches? At which point the loop detector becomes redundant. But you want to keep it anyway? I don't get it.^^

ecstatic-morse commented 4 years ago

I'd like to do something quickly that's better than a -Z flag. If we can reach consensus around removing the loop detector, let's do it, but that's not the blocking part.

RalfJung commented 4 years ago

Okay, so the plan of action is this?

I guess I don't see why step 3 can't be done together with step 1, but I also don't really care strongly enough to push for that.^^

ecstatic-morse commented 4 years ago

Step 1 is more about picking the syntax for the attribute, while step 2 would be to fully specify its semantics. If we can't settle on syntax in the next week, I'll just add a temporary -Z flag to #67216. If we can settle on syntax, I'll use the attribute to set the threshold for the loop detector/turn it off, and we can remove it later.

A strawman proposal is #![rustc_const_eval_instruction_limit = "100"] with "-1" as the sentinel that turns off the instruction limit/loop detector. This would mirror the existing limit attributes, but it doesn't take advantage of the fact that integer literals are now allowed in attributes.

I think we should use the word "instruction" (or something similar) in the attribute, even though we are actually counting basic blocks.

ecstatic-morse commented 4 years ago

By the way, I'm assuming that we will put the attribute itself behind a feature gate. We don't have to stabilize something in a week, just get something merged.

RalfJung commented 4 years ago

I think we should use the word "instruction" (or something similar) in the attribute, even though we are actually counting basic blocks.

Other suggestions: const_eval_step_limit, const_eval_limit, const_eval_exec[ution]_limit.

But yeah, this would be unstable for now anyway.

oli-obk commented 4 years ago

I don't see how we can add these backwards compatibly (unless we add them only if the code has a loop or branch anywhere, which seems like the only important case), because if a user has code that compiles right now but would trigger the limit once it is introduced, this breaks their code.

oli-obk commented 4 years ago

But yeah, please add the unstable attribute so it gets into beta, even if it doesn't do anything yet.

ecstatic-morse commented 4 years ago

Okay, this comment assumes that everyone is comfortable with adding some kind of attribute to limit the time spent in const-eval behind a feature flag and hashing out the exact semantics later. Please object if this is not the case.

@TheSamsa, my intention is still to mentor someone through implementing the attribute, although the time pressure is not ideal. Do you have some time to spare this week?

For anyone who wants to take this on, the first step would be to add a new feature gate at the end of the following list that uses a descriptive name. I would create a new tracking issue specifically for the attribute, since this issue is too broad.

https://github.com/rust-lang/rust/blob/90b957a17c1abba979aa41234ce0993a61030e67/src/librustc_feature/active.rs#L526-L531

You'll want to add a symbol for the feature gate, as well as the name of the attribute itself. I believe the name of the attribute must begin with rustc for backwards compatibility reasons. Pick one of Ralf's choices above.

https://github.com/rust-lang/rust/blob/90b957a17c1abba979aa41234ce0993a61030e67/src/libsyntax_pos/symbol.rs#L111-L117

You'll then need to actually parse and store the attribute. I don't have detailed instructions for this, but looking at the code that handles #![recursion_limit] would be a good place to start. Note that this will only work at the crate root, not per-item. This is fine for now. Make sure the attribute is only parsed if the feature gate is enabled.

Once this is complete, I can instruct you how to incorporate this information into the compile-time interpreter.
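For orientation, a hedged sketch of what the feature-gate addition looked like at the time; the feature name follows Ralf's const_eval_limit suggestion, and the version string, doc comment, and placement are illustrative (this is a fragment of compiler source, not standalone code):

```rust
// In librustc_feature/active.rs, at the end of the declare_features! list
// (version and issue placement are illustrative):

/// Allows limiting the number of steps the CTFE interpreter will execute.
(active, const_eval_limit, "1.43.0", Some(67217), None),
```

The matching entry in the symbols! list in libsyntax_pos/symbol.rs is, if I recall the layout correctly, just the bare identifier (const_eval_limit,) inserted in alphabetical order.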

ecstatic-morse commented 4 years ago

Also, instead of having a sentinel value, we could have a proper lint for long-running const-eval that defaults to warn. #![allow(long_running_const_eval)] would disable the loop detector/step counter.

TheSamsa commented 4 years ago

@ecstatic-morse I'd really, really like to tackle it. BUT I am not sure if I understand the parts well enough to wrap this up in a week. Besides, I guess mentoring could be harder due to timezones (UTC+1 here). So if anyone jumps in, that's fine!

Nonetheless, I'll try my own implementation of the first steps (for my own learning) over the next day or two.

Centril commented 4 years ago

For context, the RFC states that:

To ensure that the compiler never hangs indefinitely, we count the number of terminators processed and whenever we reach a fixed limit, we report a lint mentioning that we cannot guarantee that the evaluation will terminate and reset the counter to zero. This lint should recur in a non-annoying amount of time (e.g. at least 30 seconds between occurrences). This means that there's an internal deterministic counter (for the terminators) and a timestamp of the last (if any) loop warning emission. Both the counter needs to reach its limit and 30 seconds have to have passed since the last warning emission in order for a new warning to be emitted.
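Read literally, that RFC text combines a deterministic terminator counter with a wall-clock throttle on warning emission; a minimal sketch of that combination, with illustrative names (not the actual implementation):

```rust
use std::time::{Duration, Instant};

/// Illustrative sketch of the counter-plus-timestamp scheme quoted above;
/// not the actual interpreter code.
struct LoopWarnings {
    terminators: u64,
    limit: u64,
    last_warning: Option<Instant>,
}

impl LoopWarnings {
    const MIN_GAP: Duration = Duration::from_secs(30);

    /// Called once per executed terminator; returns true when a
    /// "this may not terminate" lint should be emitted.
    fn step(&mut self) -> bool {
        self.terminators += 1;
        if self.terminators < self.limit {
            return false;
        }
        // The counter reached its limit: reset it, then apply the 30-second
        // throttle before actually emitting another warning.
        self.terminators = 0;
        let now = Instant::now();
        let due = self
            .last_warning
            .map_or(true, |last| now.duration_since(last) >= Self::MIN_GAP);
        if due {
            self.last_warning = Some(now);
        }
        due
    }
}

fn main() {
    let mut w = LoopWarnings { terminators: 0, limit: 3, last_warning: None };
    assert!(!w.step());
    assert!(!w.step());
    assert!(w.step()); // third step hits the limit and emits the first warning
}
```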


The existing #![recursion_limit] and #![type_length_limit] attributes are somewhat related, but it would be ideal if the instruction limit could be applied per-item as well.

I would prefer to start out with a coarse per-VM granularity and consider per-item as needed. I also don't have an intuition for what per-item means in the face of mutually recursive functions (where e.g. both functions use #![const_limit = ...]).

FWIW, I'd not count instructions but blocks -- each block is of finite length and this way we are somewhat less sensitive to long swaths of straight-line code.

I agree with @RalfJung's idea here, and it seems consistent with what the RFC text quoted above states.

In the short-term, I think #67216 should be blocked on providing some recourse to the user when their code runs long enough to trigger the detector.

I don't understand why we need to block #67216 (especially for beta bootstrap) on having this attribute in nightly as well. I would prefer not to rush things. We can still block stabilization of const_loop on said attribute.

I think we should use the word "instruction" (or something similar) in the attribute, even though we are actually counting basic blocks.

I would prefer something vague and brief, e.g. @RalfJung's suggested const_eval_limit or perhaps better yet, #![const_limit].

I don't see how we can add these backwards compatibly (unless we add them only if the code has a loop or branch anywhere, which seems like the only important case), because if a user has code that compiles right now but would trigger the limit once it is introduced, this breaks their code.

@oli-obk It has to be a lint to be backwards compatible as @ecstatic-morse suggests:

Also, instead of having a sentinel value, we could have a proper lint for long-running const-eval that defaults to warn. #![allow(long_running_const_eval)] would disable the loop detector/step counter.

I see this as a two-step thing:

  1. You can configure the limit at which the lint triggers with #![const_limit = $lit]. The default limit without #![const_limit = ...] should be set to something that corresponds to about 30s on a reasonable machine (say, a laptop with an 8-core Intel CPU).
  2. You can get rid of the limit entirely with #![allow(...)].

Please object if this is not the case.

I don't object, but I've nominated this issue for language team discussion to ensure there's agreement about the general direction.

oli-obk commented 4 years ago

Denying a lint won't suddenly stop the loop detector from running. This will require some additional logic beyond that. We'll also need to take care to report the lint on the root of the evaluation and not e.g. somewhere in another crate if calling const fns from other crates.

RalfJung commented 4 years ago

Wait, I think the loop detector is going away once we count terminators instead? We could optimize things to not even count terminators when the lint is allowed (not denied?), but that seems like a micro-optimization.

nikomatsakis commented 4 years ago

I'm removing the nomination here because there doesn't seem to be a specific lang-team topic. If y'all would like some input, can somebody summarize the precise question? For example, should we evaluate #67260? If so, can someone summarize the design implemented there?

RalfJung commented 4 years ago

Status update: https://github.com/rust-lang/rust/pull/67260 landed, but from what I can see that is a global limit, not per-item as the issue here tracks. It's just the first step, then.

ecstatic-morse commented 4 years ago

I'm gonna nominate this issue for @rust-lang/lang discussion, since this is blocking #52000.

At the moment, we have a global const-eval limit: a fixed number of MIR basic blocks may execute before a const_err lint is emitted. The #![const_eval_limit] attribute (#67260) overrides this limit for a single crate (similar to how #![type_length_limit] works). #![const_eval_limit = 0] disables the limit entirely, allowing const-eval to run forever within that crate.
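For reference, a usage sketch of the attribute that landed in #67260, as it worked on nightly at the time (the value is illustrative):

```rust
// Crate-root usage of the nightly-only attribute from #67260:
#![feature(const_eval_limit)]
#![const_eval_limit = "2000000"]

// Per the comment above, a value of zero disables the limit entirely:
// #![const_eval_limit = 0]

fn main() {}
```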

In the issue title and summary, I mentioned a "per-item" limit. I had envisaged this working the same as the current "per-crate" limit, except that each const or static could be annotated with #[const_eval_limit], which would override the default limit as well as the per-crate one. It would have no effect on const fn, since those are only evaluated as part of some const or static initializer (the const-eval root).

@nikomatsakis has expressed the desire for this to work on const fn as well. That way, a publicly exported const fn that is expected to run for a long time could be called by downstream crates without them needing to set #[const_eval_limit] at all.

Which of these variations do people prefer? Personally, I'm still partial to the version that works on consts and statics but not const fn ("per const-eval root"?) due to ease of implementation. I would need to hear more specifics about a version that worked with const fn before changing my mind. Does the callee's limit override that of its caller or add to it? How would this work for simple or mutually recursive functions?

oli-obk commented 4 years ago

It should be easy to make this work on const fn, as we can just fetch the attribute at push_stack_frame time and add it to the current limit. If the value is chosen too big, this will just blow up the limit, though. It would also make shrinking the value a breaking change.

Maybe what would make more sense is to add it to the current limit and revert to the previous step limit after executing a function that has a limit override. This way, the function can make itself easier to execute, and the user can still bump the value further in case the function got the limit wrong.

ecstatic-morse commented 4 years ago

The problem with adding up limits is that it can result in no effective limit for recursive const fn that set a limit slightly higher than the number of basic blocks they would execute. This number is likely to be very small compared to the default global limit.
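For a concrete, made-up illustration: with a global budget of 1_000_000 blocks, a recursive const fn annotated with a limit of 600 whose activations each execute only 500 blocks would gain 100 blocks of budget per recursive call under the "add" semantics, so the effective limit never runs out.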

There's also the problem of explainability. Limits at the crate or const-eval root have a clear interpretation (an attribute at the crate root overrides the default; an attribute on a const or static overrides the crate root), but the ability to add or overwrite limits at each stack frame makes things difficult to reason about for me. For example, if I disable the const-eval limit in my crate with #![const_eval_limit = 0] but call a const fn with a limit in a downstream crate, which one wins? I suppose this is only a problem if callee limits overwrite those of their callers.

For the record, I think it would be possible to disallow #[const_eval_limit] on const fn for now to leave this design space open while allowing #52000 to move forward. That's not worth doing if we can settle on clear semantics but should be kept in mind.

RalfJung commented 4 years ago

One thing we could do is: when we enter a function that has a per-function limit and that function is not yet on the call stack (to handle recursion), we store the current limit... somewhere, and then we make that function's limit the new limit. When this function returns, we restore the old limit. That prevents changes to this limit, and indeed any changes to what this function does, from "leaking" to the callers.

The main issue I see with this scheme is higher-order functions: .map(|x| do_stuff_with(x)), if map has a limit, will run do_stuff_with(x) with that limit being active.
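A minimal sketch of that save/restore scheme, with made-up names (the real interpreter would presumably hook this into its call-stack handling, e.g. around push_stack_frame):

```rust
/// Illustrative sketch of the scheme described above; not the actual
/// interpreter code.
struct EffectiveLimit {
    current: u64,
    saved: Vec<u64>,
}

impl EffectiveLimit {
    /// Entering a function that carries its own per-function limit and is not
    /// already on the call stack: remember the caller's limit and switch to
    /// the callee's.
    fn enter_function_with_limit(&mut self, fn_limit: u64) {
        self.saved.push(self.current);
        self.current = fn_limit;
    }

    /// Returning from that function restores the caller's limit, so the
    /// callee's override never "leaks" outward.
    fn leave_function(&mut self) {
        if let Some(prev) = self.saved.pop() {
            self.current = prev;
        }
    }
}
```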

oli-obk commented 4 years ago

That prevents changes to this limit, and indeed any changes to what this function does, from "leaking" to the callers.

I considered that, but that would mean that library authors could set a low limit on the function, get it wrong, and then callers would have no control over setting the limit higher.

nikomatsakis commented 4 years ago

OK so there are a few considerations, I guess.

One of them is that the caller may supply some input that requires a lot of execution, and hence the callee may not know that an annotation would be required (because they only ever used the fn with small inputs).

The other is that, as I frequently suspect, the callee may know that the fn is very computationally heavy and thus needs a "high limit". However, it's true that it may be hard to determine an appropriate limit without knowing what inputs the caller is using.

My concern was initially about the latter point: I think the experience with macro_rules! is that people often define "complex macros" and then have to tell all consuming crates to add these esoteric annotations, which seems suboptimal. But the former point is very true.

This suggests, though, that something like what @RalfJung suggested -- where we remember the limit as we enter a function -- might be good, but perhaps with a "max" instead? I.e., we remember a max, and when we return from the function we pop back to the old max, but the limit never goes down?
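In terms of the save/restore sketch above, this "max" variant would only change the entry step, roughly like this (illustrative, standalone form):

```rust
// Entering a function that carries its own limit, under the "max" variant:
// the effective limit can grow but never shrink while the callee is on the
// stack, and pops back to the caller's value on return.
fn enter_function_with_limit(saved: &mut Vec<u64>, current: &mut u64, fn_limit: u64) {
    saved.push(*current);
    *current = (*current).max(fn_limit);
}

fn leave_function(saved: &mut Vec<u64>, current: &mut u64) {
    if let Some(prev) = saved.pop() {
        *current = prev;
    }
}
```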

ecstatic-morse commented 4 years ago

This was discussed at the @rust-lang/lang meeting.

One concern was that, by tying the step limit to MIR basic blocks, we make it possible for changes to MIR building or optimization to break code. One way to mitigate this is to count only function calls and jumps to already-executed basic blocks, as opposed to all basic-block terminators. My concern with this change was that there would be even less correlation between the wall-clock time spent in const-eval and the const_eval_limit. However, others pointed out that the general "UI responsiveness" concern (why is rustc taking so long?) can be handled separately, with a clock-based timer independent of the instruction limit. The instruction limit would be responsible exclusively for guaranteeing that some forward progress is being made.
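A rough sketch of that split, with made-up names: a deterministic counter that only counts call terminators and jumps back to already-executed blocks, plus a separate wall-clock check that only affects "this is taking a while" output and never fails evaluation:

```rust
use std::collections::HashSet;
use std::time::{Duration, Instant};

/// Illustrative sketch only; not the actual interpreter code.
struct Progress {
    seen_blocks: HashSet<usize>,
    counted_steps: u64,
    limit: u64,
    started: Instant,
}

impl Progress {
    /// `block` identifies the basic block about to be executed;
    /// `is_call` marks function-call terminators.
    fn on_terminator(&mut self, block: usize, is_call: bool) -> Result<(), &'static str> {
        // Only calls and back edges (jumps to already-executed blocks) count
        // toward the deterministic limit, so straight-line code and MIR
        // restructuring have less influence on it.
        let back_edge = !self.seen_blocks.insert(block);
        if is_call || back_edge {
            self.counted_steps += 1;
            if self.counted_steps > self.limit {
                return Err("const_eval_limit reached");
            }
        }
        // The "why is rustc taking so long?" concern is handled separately,
        // by wall-clock time, and never makes evaluation fail.
        if self.started.elapsed() > Duration::from_secs(30) {
            // emit a "long-running evaluation" warning here (rate-limited
            // in a real implementation)
        }
        Ok(())
    }
}

fn main() {
    let mut p = Progress {
        seen_blocks: HashSet::new(),
        counted_steps: 0,
        limit: 2,
        started: Instant::now(),
    };
    assert!(p.on_terminator(0, false).is_ok()); // first visit: not counted
    assert!(p.on_terminator(0, false).is_ok()); // back edge: counted (1)
    assert!(p.on_terminator(1, true).is_ok());  // call: counted (2)
    assert!(p.on_terminator(0, false).is_err()); // back edge: counted (3), over the limit
}
```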

Separately, there seems to be consensus that making this attribute work per-function is worth doing. The specifics were not decided, but something like Ralf's approach above seemed promising.

RalfJung commented 2 years ago

There is some relevant discussion happening at https://github.com/rust-lang/rust/issues/93481

Mark-Simulacrum commented 2 years ago

Discussed in the T-lang backlog bonanza -- tagging with design concerns; the discussion of the concerns raised in https://github.com/rust-lang/rust/issues/93481#issuecomment-1025972131 needs to be driven to a conclusion. One suggestion brought up in the bonanza was a timeout (for const fn or just compilation in general) at 10 minutes or 30 minutes or whatever.