agda / agda

Agda is a dependently typed programming language / interactive theorem prover.
https://wiki.portal.chalmers.se/agda/pmwiki.php
Other
2.4k stars 339 forks source link

Instantiate terms before traversing them in tcExtendContext #7231

Open ncfavier opened 1 month ago

ncfavier commented 1 month ago

Fixes https://github.com/agda/agda/issues/7227. I narrowed the issue down to the freeIn call in tcExtendContext sometimes hanging and consuming unbounded memory:

https://github.com/agda/agda/blob/00f4bb499e049201317f56d60e0a5660f7b9681a/src/full/Agda/TypeChecking/Unquote.hs#L850-L852

I don't know what exactly causes this, but @plt-amy suggested on IRC to instantiateFull the term, and indeed that solves the issue. Here's an example of the reduction in size (note that the log file is truncated, you need to click "view the full file").

The problematic extend-context call is here: the returned value v in this case consists of three lists.

ncfavier commented 1 month ago

The weird thing is that reverting https://github.com/agda/agda/pull/4898 also fixes this... I'd expect strengthen to also traverse the term and have roughly the same performance characteristics as freeIn.

nad commented 1 month ago

I think we should avoid adding more calls to instantiateFull. Do you have benchmarks that show that adding instantiateFull here is "nearly always" a good idea? I also wonder why the call is added in tcExtendContext, and not somewhere else. Is this change tailored specifically to the 1Lab code?

ncfavier commented 1 month ago

I don't have any benchmarks; this change is based on an experiment with a sample size of 1. I agree that it would be nice to figure out what exactly is going on, and whether this change makes things worse in some cases.

Minimising the 1lab bug seems very hard as putting the same macro invocation in a different file does not cause an OOM, and due to the complexity of the macro. IMO the next step would be to look into whether/why freeIn loops on the uninstantiated term while strengthen doesn't.

I also wonder why the call is added in tcExtendContext, and not somewhere else.

I'm not sure where else it would go: it can't be earlier since it affects the result of the unquoted action we just ran, and it can't be later since the problematic freeIn is a pure function.

plt-amy commented 1 month ago

Keep in mind that the results of freeIn on uninstantiated terms are.. dubious, at best, even when they aren't explosive: if we have a solved meta ?0 := λ x y → y then the term ?0 @0 @1 does not depend on the zeroth variable, but freeIn 0 thinks it does.

andreasabel commented 1 month ago

Yes, there is no point in scape-goating instantiateFull. Inlining a meta solution is sometimes bad (unsharing, making terms bigger) and sometimes good (making terms smaller, making free variable analysis more precise). We need to quantify these "sometimes" and have some smarter criteria when we instantiate and when not.

nad commented 1 month ago

I'm not sure where else it would go: it can't be earlier since it affects the result of the unquoted action we just ran, and it can't be later since the problematic freeIn is a pure function.

Are there other parts of the code that would benefit from the same treatment?

ncfavier commented 1 month ago

@plt-amy pointed out that we should really be calling normalise instead of instantiateFull, as otherwise the check is bogus:

const : ∀ {ℓ ℓ'} {A : Type ℓ} {B : Type ℓ'} → A → B → A
const a b = a

macro
  test : Term → TC ⊤
  test _ = do
    extend-context "a" (argN (quoteTerm Nat)) do
      x ← unquoteTC {A = Nat} (var 0 [])
      pure {A = Nat} (const 0 x)
    pure tt

_ = test

{-
Local variable 'a' escaping in result of extendContext: const 0 a
when checking that the expression unquote test has type _1824
-}

I am also wondering why we're doing the raise-strengthen dance for extendContext but not inContext. It seems like it should be possible to get a variable to escape from inContext, but I haven't been able to trigger an error other than Cannot unquote non-canonical … @0... which makes me wonder if there's any point to this check at all. @L-TChen can you comment on this since you opened https://github.com/agda/agda/issues/3831?

Also, does anyone know of an actual use case for inContext? I can think of a use case for reverting to the empty context (see my unbind macro), but not really for the new behaviour since https://github.com/agda/agda/pull/5858/commits/f0ca16ab2876ef0661d39e09df28bb76e84c704d, which is to revert to the original context that the macro was unquoted in. Maybe @WhatisRT has something to say about this?

L-TChen commented 1 month ago

I am also wondering why we're doing the raise-strengthen dance for extendContext but not inContext. It seems like it should be possible to get a variable to escape from inContext, but I haven't been able to trigger an error other than Cannot unquote non-canonical … @0... which makes me wonder if there's any point to this check at all. @L-TChen can you comment on this since you opened #3831?

I don't know any use of inContext or its expected behaviour, so I did not discuss or change the behaviour of inContext in the aforementioned issue.

WhatisRT commented 1 month ago

I don't really know of a use for inContext. In stdlib-meta we have a wrapper around TC with a bunch of bells and whistles, one of them being that the context is managed as a Reader. This makes things much more flexible, and you can always get back to the context of the call site by just emptying the context. This is pretty easy to implement on top of the regular TC, so the only thing inContext gives you is the convenience of not having to write such a wrapper. But I think inContext will wreck the output when printing ErrorParts, so there's another downside.

The use the old inContext to go to some smaller context looks pretty hacky to me, i.e. you have to know about Agda's internals to understand the unbind macro. I can't think of a non-hacky use for that either.