Request: Ltac2 profiler like Ltac profiler

JasonGross commented 7 years ago

Either the Ltac profiler should support Ltac2 (ideal case), or there should be separate profiling for Ltac2.

samuelgruetter commented 3 years ago

I was looking for this feature too!

JasonGross commented 3 years ago

I wonder if it would make more sense to embed something closer to OCamlProf rather than adapting the Ltac2 profiler, given that Ltac2 is closer to OCaml than to Ltac in some sense. On the other hand, it might be enough to add, roughly, a hook to the beginning and end of the interpretation of each top-level Ltac2 definition.

samuelgruetter commented 2 years ago

As a note for anyone (including my future self in a few months from now :wink:) whose search for an Ltac2 profiler might end here: As long as we don't have one, wrapping some potentially slow calls inside Control.time can give some insights into timing.

JasonGross commented 2 years ago

Presumably this could be implemented quite similarly to the Ltac1 profiler. @ppedrot Do you think it would work add a do_profile wrapper to https://github.com/coq/coq/blob/7f2e4bfc0ec0e2e6ff24a91e9060933e06c31d46/plugins/ltac2/tac2interp.ml#L38-L45 from https://github.com/coq/coq/blob/7f2e4bfc0ec0e2e6ff24a91e9060933e06c31d46/plugins/ltac/profile_ltac.ml#L352 a la https://github.com/coq/coq/blob/7f2e4bfc0ec0e2e6ff24a91e9060933e06c31d46/plugins/ltac/tacinterp.ml#L1128-L1132?

Presumably the LtacProf infrastructure could be generalized over the notion of call stacks being used, and could just take a printer for the elements of the call stack?

@jfehrle Do you have time and interest in implementing this feature?

jfehrle commented 2 years ago

I'm open to working on it, but not right away. There are other things I want to complete first. And I wanted to get back to finishing the debugger, e.g. adding Ltac2 support but the review process is so tedious I have mixed feelings about that. The same consideration applies to this request. I wonder if we can get someone to commit to reviewing the change promptly if I agree take it on.

JasonGross commented 2 years ago

Assuming nothing else comes up in my life, I'm happy to review this change promptly (and also be the assignee). I think we might want to get @ppedrot or someone else with vision for Ltac2 to agree to review the interface between the profiler and Ltac2, possibly.

JasonGross commented 2 years ago

A good test-case for this: Porting ~110 loc of Ltac to Ltac2 added 30s overhead in https://github.com/JasonGross/fiat-crypto/commit/2fa728620a50dd2b4448d23fb16e13c9451e006e (https://github.com/JasonGross/rewriter/compare/f24c094d04d725e0f9f0554b354921d921a10f2e...73177609fdf0d088c576e3074ea4dcda077b3841). (Build target make TIMED=1 SKIP_BEDROCK2=1 pre-standalone in fiat-crypto, fiat-crypto archived at softwareheritage, rewriter archived at softwareheritage) As I said on Zulip, my current guess is that it's the overhead introduced by the four extra evars involved in replacing

lazymatch ty with
| base_interp ?T => T
| @base.interp base base_interp (@base.type.type_base base ?T) => T
| @type.interp (base.type base) (@base.interp base base_interp) (@Compilers.type.base (base.type base) (@base.type.type_base base ?T)) => T

with

(* work around COQBUG(https://github.com/coq/coq/issues/13962) *)
lazy_match! '($base_interp, $base, $ty) with
| (?base_interp, ?base, ?base_interp ?t) => Some t
| (?base_interp, ?base, @base.interp ?base ?base_interp (@base.type.type_base ?base ?t)) => Some t
| (?base_interp, ?base, @type.interp (base.type ?base) (@base.interp ?base ?base_interp) (@Compilers.type.base (base.type ?base) (@base.type.type_base ?base ?t))) => Some t

but I'm not sure how to go about debugging these issues systematically without a profiler.

JasonGross commented 2 years ago

Another use-case: in https://github.com/mit-plv/fiat-crypto/pull/1358#issuecomment-1267374005 we have Ltac2 reification taking almost 2 hours, and I don't have a good way to debug what's slow. The code can be built by checking out https://github.com/JasonGross/fiat-crypto/tree/xxx-ltac2-slow-reification-for-coq-coq-10111 and running make TIMED=1 SKIP_BEDROCK2=1 src/PushButtonSynthesis/SolinasReductionReificationCache.vo

coq / coq

Request: Ltac2 profiler like Ltac profiler #10111