akinomyoga / ble.sh

Bash Line Editor―a line editor written in pure Bash with syntax highlighting, auto suggestions, vim modes, etc. for Bash interactive sessions.
BSD 3-Clause "New" or "Revised" License
2.32k stars 77 forks source link

Always one frame (16.7ms) delay between pressing enter and running command #447

Open pallaswept opened 1 month ago

pallaswept commented 1 month ago

ble version: 0.4.0-nightly+5676525 Bash version: 5.2.26(1)-release (x86_64-suse-linux)

GNU bash, version 5.2.26(1)-release (x86_64-suse-linux) [openSUSE Tumbleweed]
ble.sh, version 0.4.0-nightly+5676525 (noarch) [git 2.43.2, GNU Make 4.3, GNU Awk 5.1.0, API: 3.0 (GNU MPFR 4.1.0, GNU MP 6.2.1)]
bash-completion, version 2.12.0 (hash:480ffcc6a751e55621ec526eb5dea7a0d86d9e72, 17877 bytes) (noarch)
fzf key-bindings, (hash:a789b19de9ea0e6dd21dd3f35e36c4b9710d73a8, 5488 bytes) (noarch) (integration: on)
fzf completion, (hash:e01b3005c3aed07d6fd073993b9ae37ef97be005, 16103 bytes) (noarch) (integration: on)
starship, version 1.19.0 (rustc 1.77.2 (25ef9e3d8 2024-04-09) (built from a source tarball), 2024-05-16 03:38:19 +00:00)
zoxide, version 0.9.4 (/usr/bin/zoxide) (integration: on)
locale: LANG=en_GB.UTF-8 LC_ADDRESS=en_AU.UTF-8 LC_MEASUREMENT=C LC_MONETARY=en_AU.UTF-8 LC_NAME=en_AU.UTF-8 LC_NUMERIC=en_AU.UTF-8 LC_PAPER=en_AU.UTF-8 LC_TELEPHONE=en_AU.UTF-8 LC_TIME=C
terminal: TERM=xterm-256color wcwidth=15.0-west/15.1-2+ri, konsole:220380 (1;115;0)

I noticed a short delay whenever I ran a command. No matter what I do, I can never get it shorter than one 60Hz graphics 'frame' (~16ms). The time it takes from pressing enter, to running the command, is always the expected time, plus >16ms. It seems that the command is not executed until the display completes the update of the prompt.

I set up timestamping before and after the command, and it made this very visible:

[ 17:53:18.983 18/05 ]
pwd
/home/pallaswept
[ 17:53:19.001 + 0ms @ 100.0% CPU ]  pwd

~/                                           was 19ms                                                                                                                               
> 

The first time here ("...18.983ms 18/05...") is generated by ble.sh, using bleopt prompt_ps1_final:='\e[38;5;238m\n[ `eval $NOWDATE` ]\n' The second time here ("...19.001 + 0ms...") is generated by ble.sh, using blehook PREEXEC='bleopt exec_elapsed_mark:=$"\e[38;5;238m[ `eval $NOW` + \e[38;5;242m%s\e[38;5;238m @ \e[38;5;242m%s%%\e[38;5;238m CPU ] "' The last time here ("was 19ms") is generated by starship.

Note that the command's real execution time is 0ms. But it took 19ms. Any number less than 16.7 would have been great, because to my eyes, I wait one 'frame' any way. But the delay is always (real execution time)+(~16ms). So now, even the shortest command, takes 2 whole frames, or 33.333ms, to be visible. Now that I have my timestamping working, I can see that the delay is not during the execution of the command, but before the execution of the command.

To demonstrate, the delay is not caused by the eval commands:

[ 18:32:26.999 18/05 ]
time echo `eval $NOWDATE` `eval $NOW`
18:32:27.027 18/05 18:32:27.030

real    0m0.006s
user    0m0.004s
sys     0m0.003s
[ 18:32:27.018 + 6ms @ 100.0% CPU ]  time echo `eval $NOWDATE` `eval $NOW`

~/                                                                 was 28ms                                                                                                                                                                                                 
> 

Without the time and echo, it's 2-3ms for two eval date commands. So I am confident that is not the source of the delay. If I remove everything from the prompt and the pre/post-exec and all, It will always still take the 1 extra 'frame' of time.

I know these are very small numbers and I worry I am being too demanding, but then, a very short command takes 3300% of the normal time, so maybe it is not such a small number after all :D I hope it might be possible to begin command execution immediately upon pressing enter, and update the prompt asynchronously? Or maybe some other method, to avoid this delay before execution?

Thanks!

akinomyoga commented 1 month ago

In general, ble.sh needs to perform many things behind after receiving RET until it starts the execution of the command. Also, if you refer to the time Starship reports, it also contains the time of the command execution and subsequent other processing (including calling Starship to evaluate the next prompt). How much time ble.sh takes to process each of them depends on the machine and the detailed setup.

In addition to what ble.sh processes internally (such as the display of the last prompt, the command history management, and preparation of the context for the user's command), some external factors are the invocation of stty (which is needed to adjust the TTY state for the command execution) and the invocation of starship. Each may possibly take a longer time depending on the setup.

To measure the times, could you instead run the following set of commands? Please remove your hooks in prompt_ps1_final and PREEXEC.

$ rm -f measure.txt
$ blehook POSTEXEC='bleopt debug_xtrace='
$ bleopt debug_xtrace_ps4='+[$EPOCHREALTIME $BASH_SOURCE:$LINENO${FUNCNAME:+ (${#FUNCNAME[*]} $FUNCNAME)}] '
$ ble/widget/insert-string pwd; bleopt debug_xtrace=measure.txt    # <-- press [RET][RET] (i.e. twice [RET])
$ pwd    <-- This will be automatically shown. You don't have to input this.
$ sed -ni.bk '\:ble-decode/\.hook 13$:,$p' measure.txt

Then, can you attach the created file measure.txt?

pallaswept commented 1 month ago

Hi! Thanks for helping me with this!

I also thought it might be something to do with starship, or the other things I did with the prompt. For this test, I turned them all off.

measure.txt

That's a lot :laughing: I'm not 100% sure where the command really begins and ends. Would you be so kind as to help me decipher this?

akinomyoga commented 1 month ago

measure.txt

Thanks. The function-call tree in the reported measure.txt is as follows (where the calls with the time less than 0.5 msec are omitted). The first column shows the timestamp. The second column shows the time that the function call took. The third column shows the nest level. It should be noted that these time durations are different from the actual values when we don't log the timestamps. They contain extra time to save those timestamps.

In this setup, it takes 18.152 + 5.410 = 23.562 msec since ble.sh received RET until ble.sh is ready to execute the user's command. It should be noted that the time you got by bleopt prompt_final_ps1 is the time in the middle of the processing (which is processed between 1716108010.33{5995,6309} in the following case), so it doesn't include the time before bleopt prompt_final_ps1 is processed.

1716108010.330725  18.152ms  2  ble-decode/.hook 13 [(global):1]
1716108010.331328  16.625ms  3  +- ble/encoding:UTF-8/decode 13 [/usr/bin/blesh/ble.sh:63 (ble-decode/.hook)]
1716108010.331424  16.529ms  4  |  +- ble-decode-char 13 [/usr/bin/blesh/ble.sh:12 (ble/encoding:UTF-8/decode)]
1716108010.331872  15.965ms  5  |     +- ble/decode/send-unmodified-key 13 _13 [/usr/bin/blesh/ble.sh:73 (ble-decode-char)]
1716108010.332030  15.807ms  6  |        +- ble-decode-key 67108973 [/usr/bin/blesh/ble.sh:54 (ble/decode/send-unmodified-key)]
1716108010.332181  15.498ms  7  |           +- ble-decode/widget/.call-keyseq [/usr/bin/blesh/ble.sh:41 (ble-decode-key)]
1716108010.332814  14.618ms  8  |              +- builtin eval -- ble/widget/accept-line [/usr/bin/blesh/ble.sh:10 (ble-decode/widget/.call-keyseq)]
1716108010.332820  14.612ms  9  |                 +- ble/widget/accept-line [/usr/bin/blesh/ble.sh:1 (ble-decode/widget/.call-keyseq)]
1716108010.332829  14.603ms 10  |                    +- ble/decode/widget/keymap-dispatch [/usr/bin/blesh/ble.sh:2 (ble/widget/accept-line)]
1716108010.332885  14.547ms 11  |                       +- ble/widget/default/accept-line [/usr/bin/blesh/ble.sh:5 (ble/decode/widget/keymap-dispatch)]
1716108010.333035   1.251ms 12  |                          +- ble/edit/histexpand pwd [/usr/bin/blesh/ble.sh:51 (ble/widget/default/accept-line)]
1716108010.333055   1.218ms 13  |                          |  +- ble/util/assign ret ble/edit/histexpand/.impl [/usr/bin/blesh/ble.sh:6 (ble/edit/histexpand)]
1716108010.333100   1.004ms 14  |                          |     +- builtin eval -- ble/edit/histexpand/.impl [/usr/bin/blesh/ble.sh:3 (ble/util/assign)]
1716108010.333117   0.987ms 15  |                          |        +- ble/edit/histexpand/.impl [/usr/bin/blesh/ble.sh:1 (ble/util/assign)]
1716108010.333128   0.939ms 16  |                          |           +- ble/edit/histexpand/run [/usr/bin/blesh/ble.sh:2 (ble/edit/histexpand/.impl)]
1716108010.333159   0.886ms 17  |                          |              +- ble/builtin/history/option:p pwd [/usr/bin/blesh/ble.sh:4 (ble/edit/histexpand/run)]
1716108010.334471   1.084ms 12  |                          +- ble/history/add pwd [/usr/bin/blesh/ble.sh:86 (ble/widget/default/accept-line)]
1716108010.334579   0.976ms 13  |                          |  +- ble/history/.add-command-history pwd [/usr/bin/blesh/ble.sh:20 (ble/history/add)]
1716108010.334694   0.861ms 14  |                          |     +- ble/builtin/history/option:s pwd [/usr/bin/blesh/ble.sh:20 (ble/history/.add-command-history)]
1716108010.335573  11.797ms 12  |                          +- ble/widget/.newline [/usr/bin/blesh/ble.sh:89 (ble/widget/default/accept-line)]
1716108010.335608  10.999ms 13  |                             +- ble/widget/.insert-newline '' [/usr/bin/blesh/ble.sh:8 (ble/widget/.newline)]
1716108010.335929   9.805ms 14  |                             |  +- ble/textarea#render leave [/usr/bin/blesh/ble.sh:15 (ble/widget/.insert-newline)]
1716108010.335995   6.138ms 15  |                             |     +- ble/prompt/update check-dirty:leave [/usr/bin/blesh/ble.sh:6 (ble/textarea#render)]
1716108010.336309   5.693ms 16  |                             |     |  +- ble/prompt/unit#update _ble_prompt_ps1 [/usr/bin/blesh/ble.sh:59 (ble/prompt/update)]
1716108010.336491   5.389ms 17  |                             |     |     +- ble/prompt/unit:_ble_prompt_ps1/update _ble_prompt_ps1 [/usr/bin/blesh/ble.sh:18 (ble/prompt/unit#update)]
1716108010.336534   5.346ms 18  |                             |     |        +- 'ble/prompt/unit:{section}/update' _ble_prompt_ps1 '\[$(ppwd)\]\u@\h:\w> ' sh... [/usr/bin/blesh/ble.sh:3 (ble/prompt/unit:_ble_prompt_ps1/update)]
1716108010.336900   4.699ms 19  |                             |     |           +- ble/prompt/.instantiate '\[$(ppwd)\]\u@\h:\w> ' show-mode-in-prompt:measure-b... [/usr/bin/blesh/ble.sh:17 (ble/prompt/unit:{section}/update)]
1716108010.337183   3.643ms 20  |                             |     |              +- builtin eval 'expanded=${ps@P}' [/usr/bin/blesh/ble.sh:9 (ble/prompt/.instantiate)]
1716108010.337974   2.486ms 21  |                             |     |              |  +- ???
1716108010.337974   2.486ms 22  |                             |     |              |     +- ppwd [/usr/bin/blesh/ble.sh:9 (ble/prompt/.instantiate)]
1716108010.338340   1.424ms 23  |                             |     |              |        +- test -n '?7?]2;%s@%s:%s??]1;%s??8' [/etc/bash.bashrc:9 (ppwd)]
1716108010.339061   0.703ms 24  |                             |     |              |        |  +- dirs +0 [/etc/bash.bashrc:9 (ppwd)]
1716108010.339836   0.624ms 23  |                             |     |              |        +- printf '?7?]2;%s@%s:%s??]1;%s??8' pallaswept Pallas /tmp Pallas [/etc/bash.bashrc:9 (ppwd)]
1716108010.340840   0.556ms 20  |                             |     |              +- ble/util/rlvar#test show-mode-in-prompt [/usr/bin/blesh/ble.sh:31 (ble/prompt/.instantiate)]
1716108010.342964   1.784ms 15  |                             |     +- ble/textarea#update-text-buffer [/usr/bin/blesh/ble.sh:48 (ble/textarea#render)]
1716108010.343079   1.612ms 16  |                             |        +- ble/highlight/layer/update pwd '' -1 -1 -1 [/usr/bin/blesh/ble.sh:7 (ble/textarea#update-text-buffer)]
1716108010.343241   0.737ms 17  |                             |           +- ble/highlight/layer:syntax/update pwd plain [/usr/bin/blesh/ble.sh:11 (ble/highlight/layer/update)]
1716108010.346852   0.518ms 13  |                             +- ble/widget/.newline/clear-content [/usr/bin/blesh/ble.sh:12 (ble/widget/.newline)]
1716108010.348141   0.736ms  3  +- ble-decode/EPILOGUE [/usr/bin/blesh/ble.sh:69 (ble-decode/.hook)]
1716108010.348276   0.529ms  4     +- ble-edit/exec:gexec/process [/usr/bin/blesh/ble.sh:9 (ble-decode/EPILOGUE)]
1716108010.348284   0.516ms  5        +- ble-edit/exec:gexec/.setup [/usr/bin/blesh/ble.sh:2 (ble-edit/exec:gexec/process)]
1716108010.348877   5.410ms  2  builtin eval -- '_ble_decode_bind_hook= [(global):1]
1716108010.348888   4.310ms  3  +- ble-edit/exec:gexec/.begin [(global):2]
1716108010.348927   2.696ms  4  |  +- ble/term/leave [/usr/bin/blesh/ble.sh:6 (ble-edit/exec:gexec/.begin)]
1716108010.348940   1.653ms  5  |  |  +- ble/term/stty/leave [/usr/bin/blesh/ble.sh:3 (ble/term/leave)]
1716108010.348967   1.626ms  6  |  |  |  +- ble/bin/stty echo -nl icanon intr $'\003' quit $'\034' susp $'\032' [/usr/bin/blesh/ble.sh:7 (ble/term/stty/leave)]
1716108010.348977   1.616ms  7  |  |  |     +- /usr/bin/stty echo -nl icanon intr $'\003' quit $'\034' susp $'\032' [/usr/bin/blesh/ble.sh:1 (ble/bin/stty)]
1716108010.350636   0.962ms  5  |  |  +- ble/term/leave-for-widget [/usr/bin/blesh/ble.sh:5 (ble/term/leave)]
1716108010.350647   0.527ms  6  |  |     +- ble/term/visible-bell/erase [/usr/bin/blesh/ble.sh:2 (ble/term/leave-for-widget)]
1716108010.351773   0.587ms  4  |  +- ble/builtin/trap/install-hook INT [/usr/bin/blesh/ble.sh:9 (ble-edit/exec:gexec/.begin)]
1716108010.352360   0.754ms  4  |  +- blehook 'internal_INT!=ble-edit/exec:gexec/.TRAPINT' [/usr/bin/blesh/ble.sh:10 (ble-edit/exec:gexec/.begin)]
1716108010.352405   0.671ms  5  |     +- ble/base/.adjust-bash-options set shopt [/usr/bin/blesh/ble.sh:4 (blehook)]
1716108010.352449   0.591ms  6  |        +- ble/base/xtrace/adjust [/usr/bin/blesh/ble.sh:4 (ble/base/.adjust-bash-options)]
1716108010.352496   0.520ms  7  |           +- set +x [/usr/bin/blesh/ble.sh:8 (ble/base/xtrace/adjust)]
1716108010.353198   0.869ms  3  +- ble-edit/exec:gexec/.prologue pwd 6 [(global):3]

I notice that the evaluation of PS1 is performed after RET is received and before the command is executed. I think this is a regression by commit 44d9e104. The evaluation at this timing should only happen when PS1 contains a custom prompt segment that should be updated before the command execution. However, I confirmed that in the current master, PS1 is always updated when a command is added to the command history. This evaluation of PS1 before the command execution can be skipped. I'll take a look at this. However, this would reduce the total time by only about 5 ms by eliminating the following branch:

1716108010.336491   5.389ms 17  +- ble/prompt/unit:_ble_prompt_ps1/update _ble_prompt_ps1 [/usr/bin/blesh/ble.sh:18 (ble/prompt/unit#update)]

That's a lot 😆 I'm not 100% sure where the command really begins and ends.

These are the timing of the commands and functions internally processed between receiving RET and the command execution; the first line corresponds to the line executed by Bash to pass RET to ble.sh, and the last line is almost* at the end of the preparation of the command execution. After the last line in measure.txt, these lines and these lines are processed, and then the command is executed. Those remaining lines shouldn't take much time usually.


Would you be so kind as to help me decipher this?

You can check the relevant parts of Bash Reference Manual. ble.sh internally uses PS4 to set up the logging.

The number of + at the beginning shows the nest level of eval and subshells. The number 1716108010.xxxxxx shows the timestamp of the command. If you want to convert it to a human-friendly string, you can do

$ date -d '@1716108010.330998' +'%F %T.%N %Z'
2024-05-19 17:40:10.330998000 JST

The next file:lineno (funcname) shows the location of the command, and the rest part is the command.

akinomyoga commented 1 month ago

I know these are very small numbers and I worry I am being too demanding, but then, a very short command takes 3300% of the normal time, so maybe it is not such a small number after all :D

Those delays are related to updating the last prompt, registration of the command to the command history, and preparation of the command environment for the user's command. They are not something added to every call of a function or a command while executing the command line, and they are not something multiplied to the execution time of the command itself. Even if the total time is 3300% of a short command, it doesn't mean the delay becomes proportionally longer for a longer command. The delay is actually constant.

I hope it might be possible to begin command execution immediately upon pressing enter, and update the prompt asynchronously?

It's not possible unless we introduce a new model different from the conventional terminals, shells, and other commands. The whole ecosystem (terminals + shells) is required to be re-designed from scratch. The terminal system assumes that only one program is in front at the same time. There is only one cursor position in the terminal, so if two separate programs (ble.sh trying to update the prompt and the command you entered in the present case) try to write something to the terminal at the same time, the cursor position becomes unpredictable. Two outputs will be interleaved in the terminal display.

The only way is to sequence the updating of the prompt and the command execution in order.

Or one may implement a new type of terminal where shells and commands are run in separate pseudo-terminals, but it's a territory of the terminal but not of the shell.

Or maybe some other method, to avoid this delay before execution?

We may try to optimize the code, but I've already made much effort to make it short. As far as I can tell, there isn't much room to significantly improve the situation. Note: I don't have a plan to re-implement it in C or any other compiler language because that's incompatible with the original motivation of this project.

pallaswept commented 1 month ago

Thanks @akinomyoga you're very helpful :)

I kinda understand the bash side of this, but I understand enough to know that ble.sh is very complex and I think it would take me a long time to know it all.

The tree view you pasted, is very informative, though, helps me to understand the flow of execution, without having to understand the source code. Is there a way for me to generate this tree too? I think that, with that, I would be able to track down any potential sources for improvement, such as my prompt settings which take a long time.

The small regression you found might be very useful. ~5ms does not seem like a lot, but since we are at ~18ms, we only have to reduce by ~2ms to be less than one 60Hz frame (and so, practically invisible to the user). I suspect that without that issue, I may not have been able to even perceive any delay.

I feel like it is OK to close this issue now, as far as I am concerned there is no problem here and no real opportunity to optimise, and I am well informed about how to solve any problems I might create with my own settings. I will leave it open in case you want to track the regression you found, but please do feel free to close this whenever it is convenient for you. If I don't hear back from you, I will close it for you so that I don't leave a mess in your repo.

Thanks very much for this excellent project!

akinomyoga commented 1 month ago

The tree view you pasted, is very informative, though, helps me to understand the flow of execution, without having to understand the source code. Is there a way for me to generate this tree too?

Here's the AWK script I used to generate the tree in my previous reply. I'm not sure if it is actually useful for you because I wrote it on the fly and there might still be an edge case where it might fail. You can save the content as a file (named as e.g. ps4_tree.awk) with the executable permission and run ./ps4_tree.awk measure.txt to replicate the above tree. To change the threshold time for the commands, please adjust the value of THRESHOLD_DURATION. Setting it to 0.0 will include all the commands in the tree.

#!/usr/bin/gawk -f

BEGIN {
  # Only shows the levels that took more than this time.
  THRESHOLD_DURATION = 0.5; # [msec]
}

function s_trunc(str, len) {
  return length(str) > len ? substr(str, 1, len - 3) "..." : str;
}

function flush_cmd(level, now, _, dur_msec, prev_cmd, prev_source, prev_lineno, prev_func, line, child, i, n) {
  if (g_record[level] == "") return;

  start_time = g_record[level, "epoch"]
  end_time = now;
  dur_msec = (end_time - start_time) * 1000;

  prev_cmd = s_trunc(g_record[level], 80);
  gsub(/[\x00-\x1F]/ ,"?", prev_cmd);

  prev_source = g_record[level, "source"];
  prev_lineno = g_record[level, "lineno"];
  prev_func = g_record[level, "func"];
  if (prev_cmd == "???") {
    prev_source = "";
  } else if (prev_source == "" && prev_func == "") {
    prev_source = sprintf(" [(global):%d]", prev_lineno);
  } else {
    prev_source = sprintf(" [%s:%d (%s)]", prev_source, prev_lineno, prev_func);
  }

  n = 0 + g_record[level, "#child"];

  g_record[level] = "";
  g_record[level, "#child"] = 0;

  if (dur_msec < THRESHOLD_DURATION) return;

  line = sprintf("%17.6f %7.3fms %2d  __tree__\x1b[1m%s\x1b[;34m%s\x1b[m", start_time, dur_msec, level, prev_cmd, prev_source);
  for (i = 0; i < n; i++) {
    child = g_record[level, "child", i];
    gsub(/__tree__/, i < n - 1 ? "&|  " : "&   ", child);
    sub(/__tree__.../, "__tree__+- ", child);
    line = line "\n" child;
  }

  if (level > g_min_level) {
    if (g_record[level - 1] == "") {
      g_record[level - 1] = "???";
      g_record[level - 1, "epoch"] = start_time;
      g_record[level - 1, "#child"] = 0;
    }
    i = 0 + g_record[level - 1, "#child"];
    g_record[level - 1, "child", i] = line;
    g_record[level - 1, "#child"] = i + 1;
  } else {
    gsub(/__tree__/, "", line);
    print line;
  }
}

function flush_level(level, now) {
  for (; g_level >= level; g_level--)
    flush_cmd(g_level, now);
  g_level = level;
}

match($0, /^(\+*)\[([0-9.]+) ([^:[:space:]]*):([0-9]*)( \(([0-9]+) ([^()]*)\))?\] (.*)$/, m) {
  eval_level = m[1];
  epoch = 0.0 + m[2];
  source = m[3];
  lineno = m[4];
  func_level = 0 + m[6];
  func_name = m[7];
  cmd = m[8];

  level = length(eval_level) + func_level;
  flush_level(level, epoch);

  if (g_min_level == "" || g_min_level > level)
    g_min_level = level;

  g_record[level] = cmd;
  g_record[level, "epoch"] = epoch;
  g_record[level, "source"] = source;
  g_record[level, "lineno"] = lineno;
  g_record[level, "func"] = func_name;
}

END { flush_level(1, epoch); }
akinomyoga commented 1 month ago

The small regression you found might be very useful.

I'll anyway fix the regression regardless of whether it's useful for performance improvement.

~5ms does not seem like a lot, but since we are at ~18ms, we only have to reduce by ~2ms to be less than one 60Hz frame

I actually doubt that "18ms" is relevant to you. First, the time between the reception of RET and the command execution is about 23 msec but not ~18 msec (which you measured using prompt_ps1_final and blehook PREEXEC). Next, even that "23 msec" doesn't include the time in which the signal is transmitted from your keyboard to the operating system, the window manager, the terminal emulator (plus any SSH or terminal multiplexers sitting in-between), Bash, and finally to ble.sh.

Here's the diagram of the important events in the provided measure.txt plus additional processes before ble.sh receives RET and after the user command outputs a string to stdout.

*0 Your brain
-> *1 your finger -> keyboard    # should take at least ~50 ms; cf (reflex of top athletes ~ 120 ms)/2 = 60
-> *2 device driver -> window manager -> terminal -> TTY -> Bash/Readline    # not sure but probably < 10 msec?
-> *3 1716108010.330725 ble-decode/.hook
-> *4 1716108010.337183 builtin eval 'expanded=${ps@P}'         # 6.4 msec since *3
-> *5 1716108010.348324 ble/util/buffer.flush                   # 11.2 msec since *4
-> *6 1716108010.348877 builtin eval -- '_ble_decode_bind_hook= # 0.5 msec since *5
-> *7 1716108010.353870 blehook/invoke PREEXEC pwd              # 5.0 msec since *6
-> *8 1716108010.354287 set +x                                  # 0.4 msec since *5
-> *9 command execution                                         # hopefully < a few msec?
-> *10 stdout -> TTY -> terminal (-> window manager) -> video driver
-> *11 display -> light -> your eye -> your brain

What you've initially measured using bleopt prompt_ps1_final and blehook PREEXEC is the time between *4 and *7 (= 11.2 + 0.5 + 5.0 = 16.7 msec). The 18 msec seen in https://github.com/akinomyoga/ble.sh/issues/447#issuecomment-2119630340 is the time between *3 and *6 (6.4 + 11.2 + 0.5 = 18.1 msec). The time 5.410 msec in https://github.com/akinomyoga/ble.sh/issues/447#issuecomment-2119630340 is the time between *6 and *8 (5.0 + 0.4 = 5.4 msec). Even if 5.38 msec is reduced, the whole process is still at least 23.5 - 5.4 = 18.1 msec for *3..*8 plus additional time for *0..*3 and *8..*11.

If you are interested in the delay between the last prompt rendering and the command execution, *5 is the command that sends the prompt rendering data to the terminal. In this case, the time difference 5.0 + 0.4 + a few msec is already less than 10 msec so should fit in a frame at about half probability in 60 fps. If they are never rendered in the same frame, you also need to suspect the terminal's implementation for how your terminal renders the data received in a separate packet from the TTY.

I feel like it is OK to close this issue now, as far as I am concerned there is no problem here and no real opportunity to optimise, and I am well informed about how to solve any problems I might create with my own settings. I will leave it open in case you want to track the regression you found,

Let me keep it open until I fix the mentioned regression.

Thanks very much for this excellent project!

Thanks for trying ble.sh!

akinomyoga commented 1 month ago

Let me keep it open until I fix the mentioned regression.

I pushed a fix 911a405193880bb3c1b1e9a22acaaf295ce86499 to the master branch. Could you update ble.sh by running ble-update in a ble.sh session?

pallaswept commented 1 month ago

Thanks! It looks like this now:

measure.txt

Your awk script is very nice and even nicer when ble.sh does fancy automatic colours in my terminal :D

Edit: also, the timestamping I was using, is more accurate now - some simple examples:

[ 09:22:40.110 23/05 ]
pwd
/home/pallaswept
[ 09:22:40.130 + 0ms @ 100.0% CPU ]  pwd

~/                                            was 20ms                                                                                                                                  
[ 09:37:23.243 23/05 ]
ls

... (directory listing removed) ...

[ 09:37:23.263 + 7ms @ 136.3% CPU ]  ls

~/                                            was 27ms 

Nice!

akinomyoga commented 3 weeks ago

FYI, I integrated the script to generate the function-call tree in ble/debug/profiler in commit a453f37309c9b3784a9d9bc71efe9d789f06009b.

The profiler can be activated by

$ ble/debug/profiler/start NAME

and then terminated by

$ ble/debug/profiler/end

The results are saved in NAME.line.txt and NAME.func.txt. This was an existing feature.

This time, I added a new analysis of the function-call tree to ble/debug/profiler. When one includes the value tree in bleopt debug_profiler_opts, an additional analysis for the function-call tree is saved in NAME.tree.txt. The threshold time is specified by bleopt debug_profiler_tree_threshold.

pallaswept commented 3 weeks ago

That is just great. I liked this function so much, that I had saved a copy for future use - now I don't need that, because it's built-in :smile: