Advise `call-interactively`; add streaming logging; move profiles to `explain-pause-top`; add integration & regression tests

lastquestion commented 4 years ago

Feature complete, integration tests work, manually tested on 26 and 28, test ran on 26, 27, 28.

[x] Needs to handle debugger

Large PR to fix the measurement problem and at the same time separate the display abstractions. This thing needs:

[x] integration tests
[x] regression tests on existing bugs also
[x] test on emacs windowed
[x] test on 26, 27, 28

call-interactively

In #26, it turns out that hooking pre-command-hook is rife with troubles, because elisp can call it and we cannot be sure that the calls are matched. After a lot of thought, instead redesign the measurement engine to advise call-interactively instead. This gives much more accurate data for minibuffer, so we don't need the minibuffer hooks either. This work also fixes #18, because now call-inteactively time spent in native waits for keyboard input (from interactive specs) are excluded.

To do this, this PR introduces proper frame records and stack. I have some concern about the memory usage of this and how well the GC collects the unused branches of the tree.

Native frames are inserted whenever elisp is going to call into anything that might wait for keyboard or user input. They are also inserted in between timers and their callbacks, so that in all cases, native wait code has a parent frame to subtract time from. This resolves #31 as well.

streaming logging

This change is very large and rebuilds the measurement engine completely. In order not to regress existing bug fixes and also newly fixed bugs, a integration test suite that runs emacs in interactive mode is going to be needed. To enable this, a new feature is added. Logs of the command records entry/exit/profile/measurement can be streamed to arbitrary UNIX socket. The major intent of this is to enable diffing of measurement engine behavior in tests, but it will also allow for out of process visualization of emacs, which will be really useful when GC or blocking stuff happens; you can't see what's actually running right now because we're inside the same process, of course.

Move profiles to `explain-pause-top` and remove `explain-pause-log` and `explain-pause-profiles`

The log isn't actually super useful and the UX is not great for it anyway. Since the measurement engine rebuilds the command hook API, I took this chance to fix the display engines as well. Remove explain-pause-log and explain-pause-profiles' as well. Instead, profiles are displayed directly inexplain-pause-top. Now the limit of profiles applies per command. Whenexplain-pause-top` is first open, it backfills from the profiling statistics so even if you didn't have it open, it still knows about slow commands. Currrently, I don't have a good UX for native frames yet, so they are hidden from the top view. Fixes #39.

Also, while doing this, I had to move the click handlers, so add a new defcustom to allow customization of what actually happens when the profile button is clicked. Fixes #16.

explain-pause-top--table is extended even more to be even more generic and handle multiple extra lines of full width text after the column header line.

However, this drops the "parent" stack of each command, and combines them all together at the leaf. The previous UX allowed you to see separate parents if they came from timers. I removed this because I think I want to build a tree view for this that roots commands within the buffer that they were taken.

This removes the now unused logging defcustoms.

Testing

Add an integration test runner that runs emacs under tmux. Drive that emacs via another emacs that runs the test code. Drive all of that via Make. Add regression tests for all existing bugs that had repros that would not be covered by unit tests. There is still some areas that need testing, especially regarding the display layer which ought to be tested via batch emacs and reading a buffer, probably driven via buttercup.

The test suite can run against all the versions of Emacs supported, but it doesn't work for windowed emacs yet. Not sure I need to double the matrix there. If more bugs come at that layer I will consider adding it.

Random things

Also, rearrange deffaces into the right group.

lastquestion commented 4 years ago

Also fixes #19, partially. I think with some small fix it should fix it for both TTY menu bar kinds. Did not test on windowed yet.

lastquestion commented 4 years ago

1) I think I need to deal with file-watching now (see issue #3) because otherwise time is attributed to timers when it actually belongs to file notification updates from things like autorevert. This is way more noticeable in windowed mode for some reason. 2) I think I need to fix the timer callback stack trace problem (see issue #41) because when running very long emacs sessions (e.g. like everyone does), some timers end up with very deep stacks and since the parent pointer makes them completely live, GC can't free them.

lastquestion commented 4 years ago

Rebased to get the fix in #43 in master. Also, squashed the changes so far.

lastquestion commented 4 years ago

File notify events are sent as special key map entries in keyboard.c which ultimately become command_execute on file-notify keymap which binds to file-nofity-handle-event. So it will get measured correctly, except not correctly attributed to the actual callback, but the general file-notify code. However it should be easy to advice file-notify-add-watch similar to timers.

So the mismeasurement is somewhere else :-(

lastquestion commented 4 years ago

Above PR fixes #41. Also, I discovered that repeat timers often repeat a bunch of times on mac when AppNap is enabled (which it is by default on Mac). timer-max-repeats isn't bindable around a timer schedule, so move to rescheduling timer manually.

I made a comment about this in Bug#41865 https://lists.gnu.org/archive/html/bug-gnu-emacs/2020-06/msg00651.html in GNU emacs bug list.

lastquestion commented 4 years ago

Rebase to get fix for #44 and squash the history.

I have been working on reducing memory allocations for the refresh handler and cleaning up the top display code.

Going to add one more feature to show the times for the "slow executions" in a new line similar to profiles, clean up some more TODOs, fix some more bugs, and add the integration tests. This might be 2 more weeks at this rate...

lastquestion commented 4 years ago

The above commit fixes #46 in this branch and adds yet more tests. I'm probably not going to backport that fix.

lastquestion commented 4 years ago

debugger needs to be handled and then this PR is done.

lastquestion commented 4 years ago

🎉 let's merge. There's a few tests I'd like to add around displaying things but I will do that in followup PRs. This will be 0.2 for sure, the next job is to update README & make a screencast video.

lastquestion / explain-pause-mode