mattrobball opened 1 month ago
An environment extension is a no-go for parallelism. Why can't some external tool just sum up all the individual reports? This seems like something that could easily (at first) be implemented outside of core.
@Kha Thanks for the response. Could you elaborate? For `simp`, I didn't use the built-in diagnostics but duplicated each update to `Meta.Diagnostics` with another update to the environmental extension. I didn't notice any particular slowdown nor any changes in the counts per simp theorem in a few builds. What am I missing?
One thing I forgot to mention: there is one aspect that currently makes what I said infeasible for instance synthesis. The Lake version string interpolator `v!` uses `evalExpr` while doing synthesis.
> I didn't notice any particular slowdown nor any changes in the counts per simp theorem in a few builds.
That would be because parallelism hasn't landed yet :) . It's coming though, thus the caveat.
Good to know my sanity is still intact.
Do you know of any model(s), similar even in spirit, to the approach you suggest for the external tool?
@mattrobball The speedcenter sums up `--profile` outputs, does that count? https://github.com/leanprover-community/mathlib4/blob/master/scripts/bench/accumulate_profile.py
I think I need to capture `MetaM` state for this. (Wave me off if you think this is bad.) Is that stored anywhere?
I don't think I'm following; do you need more information than what is already printed by `-Ddiagnostics=true`, just accumulated across files?
Proposal
`Meta.Diagnostics` is a useful tool for assessing problems local to a single declaration, but there are currently no analogous tools for assessing global impact. We propose creating an environmental extension (or extensions) which tracks the applications and successes of simp theorems and typeclass instances. This could be expanded to include other aspects that are currently tracked in `Diagnostics`, like unfolding of semireducible terms. Eventually, heartbeat tracking would be very nice.

Why? Often, the overall impact of an issue can be quite substantial even while each individual impact is difficult to notice.
For example, a simple-minded version of this found an extensive list of simp theorems that are frequently tried across mathlib but never unify.
Knowing during the review process or shortly after whether a new simp theorem or instance is likely to cause expensive detours would help to improve design.
The simplest start would be to just write the data from `Diagnostics` to the environment. Hiding this behind a flag, as with diagnostics itself, would avoid a general performance penalty. These diagnostics need only run periodically, similar to debug flags in common build processes.

Almost everything needed for simp theorems is already available in diagnostics. Tracking which instances often side-track typeclass synthesis would need some new logic and possibly new state, but this need not be part of the initial release.
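To make the "simplest start" concrete, it could look like an ordinary environment extension guarded by an option. This is only an illustrative sketch: the extension name, the `SimpTheoremStats` structure, the `diagnostics.global` option, and `recordSimpAttempt` are all hypothetical, and (per the discussion above) a plain `EnvExtension` would need rethinking once parallel elaboration lands.

```lean
import Lean

open Lean

/-- Hypothetical per-theorem statistics: how often a simp theorem was
tried and how often it actually rewrote the goal. -/
structure SimpTheoremStats where
  tried     : Nat := 0
  succeeded : Nat := 0
  deriving Inhabited

/-- A plain environment extension holding the accumulated statistics,
keyed by simp theorem name. -/
initialize simpStatsExt : EnvExtension (NameMap SimpTheoremStats) ←
  registerEnvExtension (pure {})

/-- Record one application attempt of `thm`. The bookkeeping is off by
default, behind a (hypothetical) `diagnostics.global` option; a real
implementation would register it with `register_builtin_option`. -/
def recordSimpAttempt (thm : Name) (success : Bool) : CoreM Unit := do
  unless (← getOptions).getBool `diagnostics.global false do
    return
  modifyEnv fun env =>
    simpStatsExt.modifyState env fun m =>
      let s := m.find? thm |>.getD {}
      m.insert thm
        { tried     := s.tried + 1,
          succeeded := s.succeeded + (if success then 1 else 0) }
```

An external tool could then read the accumulated state per file and sum it across a library, in the spirit of `accumulate_profile.py`.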
User Experience: How does this feature improve the user experience? This provides the user with a new tool to understand their code and helps them adapt it to the existing infrastructure. Hiding it behind an option flag means users never interact with it without explicit effort.
Beneficiaries: Which Lean users and projects benefit most from this feature/change? All users would benefit, though perhaps the most benefit would accrue to library maintainers. Casual users can get feedback on their code that is not currently possible. Maintainers of large libraries can validate the quality of code more efficiently. Even developers of core would benefit from upstreamed feedback about anti-patterns. In the example linked above, one can see which keys are problematic.
Maintainability: Will this change streamline code maintenance or simplify its structure? A tool with hard data would make the performance dimension of maintainability less prone to speculation. I don't foresee a significant maintenance burden for the global diagnostics themselves. With global diagnostics promoting better design, it is likely that requests to refactor complex parts of core can be avoided more frequently.
Community Feedback
The desire for such global diagnostics first arose during a Mathlib maintainers meeting where it received strong support. It was then put to Zulip for further community discussion.
Impact
Add :+1: to issues you consider important. If others would benefit from the changes in this proposal, please ask them to add :+1: to it.