akinomyoga / ble.sh

Bash Line Editor―a line editor written in pure Bash with syntax highlighting, auto suggestions, vim modes, etc. for Bash interactive sessions.
BSD 3-Clause "New" or "Revised" License
2.47k stars 80 forks source link

macOS: no `cat -A` and infinite Loops #131

Open killermoehre opened 3 years ago

killermoehre commented 3 years ago

ble version: ble.sh (Bash Line Editor), version 0.4.0-devel3+a860769 Bash version: 5.1.8(1)-release (x86_64-apple-darwin20.3.0)

So, on macOS cat does not have the -A option. This is probably why I see processes going up to 100 % CPU. Or it's a red herring.

Maybe you find other problems. Nevertheless, ble.sh right now is super slow on macOS. I hope I can help you to fix this. I see in htop bash processes as sister to some gawk, running amok with some neverending loop.

$ LANG=en_US.UTF-8 . ./ble.sh --test
MACHTYPE: x86_64-apple-darwin20.3.0
BLE_VERSION: 0.4.0-devel3+a860769
BASH_VERSION: 5.1.8(1)-release
~/.local/share/blesh/./lib/test-main.sh:33:
    ble/util/readlink f.txt
    [[ $ret != /* ]] && ret=${PWD%/}/$ret
--- 49337.ret.expect    2021-08-03 16:12:18.000000000 +0200
+++ 49337.ret.result    2021-08-03 16:12:18.000000000 +0200
@@ -1 +1 @@
-/tmp/blesh/501/47495.test/49337.d/ab/cd/ef/file.txt
+/tmp/blesh/501/47495.test/49337.d/link.d/cd/ef/file.txt

[section] main: 15/16 (1 fail, 0 crash, 0 skip)
-bash: warning: setlocale: LC_ALL: cannot change locale (en_US.utf8): No such file or directory
-bash: warning: setlocale: LC_ALL: cannot change locale (en_US.utf8): No such file or directory
~/.local/share/blesh/./lib/test-util.sh:1186: [[ $BASHPID == $ppid ]]
--- 49459.exit.expect   2021-08-03 16:12:22.000000000 +0200
+++ 49459.exit.result   2021-08-03 16:12:22.000000000 +0200
@@ -1 +1 @@
-0
+1

~/.local/share/blesh/./lib/test-util.sh:1193: [[ $BASHPID == $ppid ]]
--- 49464.exit.expect   2021-08-03 16:12:22.000000000 +0200
+++ 49464.exit.result   2021-08-03 16:12:22.000000000 +0200
@@ -1 +1 @@
-0
+1

~/.local/share/blesh/./lib/test-util.sh:1311: printf a | ble/util/cat | cat -A
--- 49533.stdout.expect 2021-08-03 16:12:23.000000000 +0200
+++ 49533.stdout.result 2021-08-03 16:12:23.000000000 +0200
@@ -1 +1 @@
-a
+
<STDERR>
cat: illegal option -- A
usage: cat [-benstuv] [file ...]
</STDERR>

~/.local/share/blesh/./lib/test-util.sh:1312: printf '\0' | ble/util/cat | cat -A
--- 49533.stdout.expect 2021-08-03 16:12:23.000000000 +0200
+++ 49533.stdout.result 2021-08-03 16:12:23.000000000 +0200
@@ -1 +1 @@
-^@
+
<STDERR>
cat: illegal option -- A
usage: cat [-benstuv] [file ...]
</STDERR>

~/.local/share/blesh/./lib/test-util.sh:1313: printf 'hello\nworld\n'| ble/util/cat | cat -A
--- 49533.stdout.expect 2021-08-03 16:12:23.000000000 +0200
+++ 49533.stdout.result 2021-08-03 16:12:23.000000000 +0200
@@ -1,2 +1 @@
-hello$
-world$
+
<STDERR>
cat: illegal option -- A
usage: cat [-benstuv] [file ...]
</STDERR>

~/.local/share/blesh/./lib/test-util.sh:1314: printf 'hello\nworld'| ble/util/cat | cat -A
--- 49533.stdout.expect 2021-08-03 16:12:23.000000000 +0200
+++ 49533.stdout.result 2021-08-03 16:12:23.000000000 +0200
@@ -1,2 +1 @@
-hello$
-world
+
<STDERR>
cat: illegal option -- A
usage: cat [-benstuv] [file ...]
</STDERR>

~/.local/share/blesh/./lib/test-util.sh:1315: printf 'hello\0world\0'| ble/util/cat | cat -A
--- 49533.stdout.expect 2021-08-03 16:12:23.000000000 +0200
+++ 49533.stdout.result 2021-08-03 16:12:23.000000000 +0200
@@ -1 +1 @@
-hello^@world^@
+
<STDERR>
cat: illegal option -- A
usage: cat [-benstuv] [file ...]
</STDERR>

~/.local/share/blesh/./lib/test-util.sh:1316: printf 'hello\0world'| ble/util/cat | cat -A
--- 49533.stdout.expect 2021-08-03 16:12:23.000000000 +0200
+++ 49533.stdout.result 2021-08-03 16:12:23.000000000 +0200
@@ -1 +1 @@
-hello^@world
+
<STDERR>
cat: illegal option -- A
usage: cat [-benstuv] [file ...]
</STDERR>

[section] util: 1184/1192 (8 fail, 0 crash, 0 skip)
[section] ble/canvas/trace (relative:confine:measure-bbox): 5/5 (0 fail, 0 crash, 12 skip)
[section] ble/canvas/trace (cfuncs): 17/17 (0 fail, 0 crash, 1 skip)
[section] ble/canvas/trace (justify): 2/2 (0 fail, 0 crash, 22 skip)
[section] ble/canvas/trace-text: 11/11 (0 fail, 0 crash, 0 skip)
killermoehre commented 3 years ago

Oh, and on macOS, the locale is called en_US.UTF-8, not en_US.utf8.

akinomyoga commented 3 years ago

Thanks for the report. Do you experience infinite loops in the tests or outside the ble tests? cat -A is only used in tests so infinite loops or 100% CPUs outside the tests shouldn't be related to cat -A. Actually, tests are primarily for the development in my environment and not carefully designed to work in every environment.

Oh, and on macOS, the locale is called en_US.UTF-8, not en_US.utf8.

Hmm, I don't remember well but there has been an environment where en_US.UTF-8 doesn't work, which was the reason why en_US.utf8 is used instead of en_US.UTF-8. Also, in GNU/Linux people use en_US.UTF-8 but en_US.utf8 is usually printed as the default locale name by locale -a.

Now maybe I need to introduce the environment-dependent tests.

killermoehre commented 3 years ago

I understand that cat -A is only used for test. Is there some way to profile the called scripts to check why multiple cores get used 100% by bash processes?

Also, the

~/.local/share/blesh/./lib/test-util.sh:1186: [[ $BASHPID == $ppid ]]
--- 49459.exit.expect   2021-08-03 16:12:22.000000000 +0200
+++ 49459.exit.result   2021-08-03 16:12:22.000000000 +0200
@@ -1 +1 @@
-0
+1

~/.local/share/blesh/./lib/test-util.sh:1193: [[ $BASHPID == $ppid ]]
--- 49464.exit.expect   2021-08-03 16:12:22.000000000 +0200
+++ 49464.exit.result   2021-08-03 16:12:22.000000000 +0200
@@ -1 +1 @@
-0
+1

looks wrong. Maybe by not finding the proper parent PID some processes run into a loop?

akinomyoga commented 3 years ago

Hmm, these tests also need to be fixed, but this shouldn't be related to the infinite loop in your environment because these are the tests for the shell function ble/util/getpid which is only used in Bash 3.x. Actually, ble/util/getpid is used for the polyfill for the shell special variable BASHPID. It doesn't do anything in Bash 4+ because Bash 4+ already has the support of BASHPID.

killermoehre commented 3 years ago

OK. Any way to profile/debug such loops?

akinomyoga commented 3 years ago

I see in htop bash processes as sister to some gawk, running amok with some neverending loop.

Sorry for the long blank. Can you get the command-line argument of the gawk process? It is useful information to identify the related section in the codebase.

killermoehre commented 3 years ago

Sorry for the delay. Here is some pstree.

 |-+- 28529 killermoehre -bash
 | \-+- 28530 killermoehre -bash
 |   |--- 28531 killermoehre -bash
 |   \-+- 28533 killermoehre -bash
 |     \--- 28535 killermoehre /usr/local/bin/gawk -v AWKTYPE=gawk -v apos=' -v _ble_bash=50108 \012      BEGIN {\012        q = apos;\012        Q = apos "\\" apos apos;\012        reason = ENVIRON["reason"];\012        is_resolve = reason == "resolve";\012        TMPBASE = ENVIRON["TMPBASE"];\012        filename_source = TMPBASE ".part";\012        if (is_resolve)\012          print "builtin history -c" > filename_source\012        entry_nline = 0;\012        entry_text = "";\012        entry_time = "";\012        if (_ble_bash >= 40400)\012          entry_time = ENVIRON["epoch"];\012        command_count = 0;\012        multiline_count = 0;\012        modification_count = 0;\012        read_section_count = 0;\012      }\012      function write_flush(_, i, filename_section, t, c) {\012        if (command_count == 0) return;\012        if (command_count >= 2 || entry_time) {\012          filename_section = TMPBASE "." read_section_count++ ".part";\012          for (i = 0; i < command_count; i++) {\012            t = command_time[i];\012            c = command_text[i];\012            if (t) print "#" t > filename_section;\012            print c > filename_section;\012          }\012          print "HISTTIMEFORMAT=%s builtin history -r " filename_section > filename_source;\012        } else {\012          for (i = 0; i < command_count; i++) {\012            c = command_text[i];\012            gsub(/'/, Q, c);\012            print "builtin history -s -- " q c q > filename_source;\012          }\012        }\012        command_count = 0;\012      }\012      function write_complex(value) {\012        write_flush();\012        print "builtin history -s -- " value > filename_source;\012      }\012      function register_command(cmd) {\012        command_time[command_count] = entry_time;\012        command_text[command_count] = cmd;\012        command_count++;\012      }\012      function is_escaped_command(cmd) {\012        return cmd ~ /^eval -- \$'([^'\\]|\\[\\'nt])*'$/;\012      }\012      function unescape_command(cmd) {\012        cmd = substr(cmd, 11, length(cmd) - 11);\012        gsub(/\\\\/, "\\q", cmd);\012        gsub(/\\n/, "\n", cmd);\012        gsub(/\\t/, "\t", cmd);\012        gsub(/\\'/, "'", cmd);\012        gsub(/\\q/, "\\", cmd);\012        return cmd;\012      }\012      function register_escaped_command(cmd) {\012        multiline_count++;\012        modification_count++;\012        if (_ble_bash >= 40400) {\012          register_command(unescape_command(cmd));\012        } else {\012          write_complex(substr(cmd, 9));\012        }\012      }\012      function register_multiline_command(cmd) {\012        multiline_count++;\012        if (_ble_bash >= 40040) {\012          register_command(cmd);\012        } else {\012          gsub(/'/, Q, cmd);\012          write_complex(q cmd q);\012        }\012      }\012      function flush_entry() {\012        if (entry_nline < 1) return;\012        if (is_escaped_command(entry_text)) {\012          register_escaped_command(entry_text)\012        } else if (entry_nline > 1) {\012          register_multiline_command(entry_text);\012        } else {\012          register_command(entry_text);\012        }\012        entry_nline = 0;\012        entry_text = "";\012      }\012      function save_timestamp(line) {\012        if (is_resolve) {\012          if (line ~ /^ *[0-9]+\*? +__ble_time_[0-9]+__/) {\012            sub(/^ *[0-9]+\*? +__ble_time_/, "", line);\012            sub(/__.*$/, "", line);\012            entry_time = line;\012          }\012        } else {\012          if (line ~ /^#[0-9]/) {\012            sub(/^#/, "", line);\012            sub(/[^0-9].*$/, "", line);\012            entry_time = line;\012          }\012        }\012      }\012      {\012        if (is_resolve) {\012          save_timestamp($0);\012          if (sub(/^ *[0-9]+\*? +(__ble_time_[0-9]+__|\?\?)/, "", $0))\012            flush_entry();\012          entry_text = ++entry_nline == 1 ? $0 : entry_text "\n" $0;\012        } else {\012          if ($0 ~ /^#[0-9]/) {\012            save_timestamp($0);\012            next;\012          } else {\012            flush_entry();\012            entry_text = $0;\012            entry_nline = 1;\012          }\012        }\012      }\012      END {\012        flush_entry();\012        write_flush();\012        if (is_resolve)\012          print "builtin history -a /dev/null" > filename_source\012        print "multiline_count=" multiline_count;\012        print "modification_count=" modification_count;\012      }\012
 \-+- 28534 killermoehre -bash
   |--- 28536 killermoehre -bash
   \-+- 28537 killermoehre -bash
     \--- 28538 killermoehre /usr/local/bin/gawk -v AWKTYPE=gawk -v apos=' -v arg_offset=0 -v _ble_bash=50108 \012      function s2i_initialize() {\012        for (i = 0; i < 16; i++)\012          xdigit2int[sprintf("%x", i)] = i;\012        for (i = 10; i < 16; i++)\012          xdigit2int[sprintf("%X", i)] = i;\012      }\012      function s2i(s, base, _, i, n, r) {\012        if (!base) base = 10;\012        r = 0;\012        n = length(s);\012        for (i = 1; i <= n; i++)\012          r = r * base + xdigit2int[substr(s, i, 1)];\012        return r;\012      }\012      function c2s_initialize(_, i) {\012        if (sprintf("%c", 945) == "α") {\012          C2S_UNICODE_PRINTF_C = 1;\012        } else {\012          C2S_UNICODE_PRINTF_C = 0;\012          for (i = 1; i <= 255; i++)\012            c2s_byte2char[i] = sprintf("%c", i);\012        }\012      }\012      function c2s(code, _, leadbyte_mark, leadbyte_sup, tail) {\012        if (C2S_UNICODE_PRINTF_C)\012          return sprintf("%c", code);\012        leadbyte_sup = 128; # 0x80\012        leadbyte_mark = 0;\012        tail = "";\012        while (leadbyte_sup && code >= leadbyte_sup) {\012          leadbyte_sup /= 2;\012          leadbyte_mark = leadbyte_mark ? leadbyte_mark / 2 : 65472; # 0xFFC0\012          tail = c2s_byte2char[128 + int(code % 64)] tail;\012          code = int(code / 64);\012        }\012        return c2s_byte2char[(leadbyte_mark + code) % 256] tail;\012      }\012      function es_initialize(_, c) {\012        es_control_chars["a"] = "\a";\012        es_control_chars["b"] = "\b";\012        es_control_chars["t"] = "\t";\012        es_control_chars["n"] = "\n";\012        es_control_chars["v"] = "\v";\012        es_control_chars["f"] = "\f";\012        es_control_chars["r"] = "\r";\012        es_control_chars["e"] = "\033";\012        es_control_chars["E"] = "\033";\012        es_control_chars["?"] = "?";\012        es_control_chars[apos] = apos;\012        es_control_chars["\""] = "\"";\012        es_control_chars["\\"] = "\\";\012        for (c = 32; c < 127; c++)\012          es_s2c[sprintf("%c", c)] = c;\012      }\012      function es_unescape(s, _, head, c) {\012        head = "";\012        while (match(s, /^[^\\]*\\/)) {\012          head = head substr(s, 1, RLENGTH - 1);\012          s = substr(s, RLENGTH + 1);\012          if ((c = es_control_chars[substr(s, 1, 1)])) {\012            head = head c;\012            s = substr(s, 2);\012          } else if (match(s, /^[0-9]([0-9][0-9]?)?/)) {\012            head = head c2s(s2i(substr(s, 1, RLENGTH), 8) % 256);\012            s = substr(s, RLENGTH + 1);\012          } else if (match(s, /^x[0-9a-fA-F][0-9a-fA-F]?/)) {\012            head = head c2s(s2i(substr(s, 2, RLENGTH - 1), 16));\012            s = substr(s, RLENGTH + 1);\012          } else if (match(s, /^U[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]([0-9a-fA-F]([0-9a-fA-F][0-9a-fA-F]?)?)?/)) {\012            head = head c2s(s2i(substr(s, 2, RLENGTH - 1), 16));\012            s = substr(s, RLENGTH + 1);\012          } else if (match(s, /^[uU][0-9a-fA-F]([0-9a-fA-F]([0-9a-fA-F][0-9a-fA-F]?)?)?/)) {\012            head = head c2s(s2i(substr(s, 2, RLENGTH - 1), 16));\012            s = substr(s, RLENGTH + 1);\012          } else if (match(s, /^c[ -~]/)) {\012            c = es_s2c[substr(s, 2, 1)];\012            head = head c2s(_ble_bash >= 40400 && c == 63 ? 127 : c % 32);\012            s = substr(s, 3);\012          } else {\012            head = head "\\";\012          }\012        }\012        return head s;\012      }\012      BEGIN {\012        s2i_initialize();\012        c2s_initialize();\012        es_initialize();\012        INDEX_FILE = ENVIRON["INDEX_FILE"];\012        opt_null = ENVIRON["opt_null"];\012        opt_source = ENVIRON["opt_source"];\012        if (!opt_null && !opt_source)\012          printf("") > INDEX_FILE; # create file\012        n = 0;\012        hindex = arg_offset;\012      }\012      function flush_line() {\012        if (n < 1) return;\012        if (opt_null) {\012          if (t ~ /^eval -- \$'([^'\\]|\\.)*'$/)\012            t = es_unescape(substr(t, 11, length(t) - 11));\012          printf("%s%c", t, 0);\012        } else if (opt_source) {\012          if (t ~ /^eval -- \$'([^'\\]|\\.)*'$/)\012            t = es_unescape(substr(t, 11, length(t) - 11));\012          gsub(/'/, "'\\''", t);\012          print "_ble_history[" hindex "]=" apos t apos;\012        } else {\012          if (n == 1) {\012            if (t ~ /^eval -- \$'([^'\\]|\\.)*'$/)\012              print hindex > INDEX_FILE;\012          } else {\012            gsub(/['\\]/, "\\\\&", t);\012            gsub(/\n/, "\\n", t);\012            print hindex > INDEX_FILE;\012            t = "eval -- $" apos t apos;\012          }\012          print t;\012        }\012        hindex++;\012        n = 0;\012        t = "";\012      }\012      {\012        if (sub(/^ *[0-9]+\*? +(__ble_ext__|\?\?)/, "", $0))\012          flush_line();\012        t = ++n == 1 ? $0 : t "\n" $0;\012      }\012      END { flush_line(); }\012
akinomyoga commented 3 years ago

@killermoehre Thank you for the information! This is very useful hint to narrow down the problem! The first gawk is called from the shell function ble/history:bash/resolve-multiline/.awk (defined in src/history.sh) and the second gawk is called from the shell function ble/history:bash/load/.background-initialize (also defined in src/history.sh). Both are related to the initialization of the command history. I have additional questions:

$ wc ~/.bash_history   # Note: the number of lines in your command-history file
$ builtin history | wc
$ time ble/util/msleep 100
killermoehre commented 3 years ago
  • Q1: In the pstree result that you have provided, there are two gawk instances. Which one is related to the Bash of 100% CPU usage? Or both gawks have accompanying Bash processes of 100% CPU usage? You have wrote

    I see in htop bash processes as sister to some gawk, running amok with some neverending loop.

Both gawk processes have a 100% CPU process associated.

Can you tell me which bash in the above pstree result uses 100% CPU? For example, are they PID 28531 and 28536?

28531 and 28536 are the 100% CPU bash processes.

  • Q2: These processes are supposed to be performed when ble.sh session starts. Does the 100% CPU usage happens only when the ble.sh session started? Or the CPU usage is always 100% even after a long time since the ble.sh session has started? If the CPU usage goes down after a while, how long does it take?

The 100% happens if I try to get the last entry from the history after I start a new shell.

  • Q3: Does this 100% CPU usage always happens when ble.sh session has started? Or does it occur at a certain probability?

Always.

  • Q4: While Bash occupies 100% CPU, does the corresponding interactive shell respond? If it responds, how quick is the response? Is it normal or slower than usual?

It does not response. It hangs at loading history ....

  • Q5: When Bash occupies 100% CPU, how is the CPU usage of gawk processes? Do they also occupy a large fraction of CPU resources?

No. Just around 5%.

  • Q6: What is the result of the following command?
$ wc ~/.bash_history   # Note: the number of lines in your command-history file
$ builtin history | wc
$ wc ~/.bash_history
  549711 2631751 25820094 /Users/c5219583/.bash_history
$ builtin history | wc
  549712 3181467 30117812
  • Q7: What is the result of the following command?
$ time ble/util/msleep 100
$ time ble/util/msleep 100

real    0m0,104s
user    0m0,000s
sys 0m0,000s

Additional, after entering a command and hitting Return, it takes ages (several seconds) until the command is actually executed. Seems like something with the history loading? In the process description of the terminal title there is some gawk process as well.

akinomyoga commented 3 years ago

Thank you! Hmm, OK. Maybe that's because your history size is large. But I still see some behavior that I don't have an idea why it behaves in such a way.

It does not response. It hangs at loading history ....

$ cd
$ cp .bash_history .bash_history.backup20210825 # Please make a backup! Please restore it after testing.
$ tail -10000 .bash_history.backup20210825 > .bash_history  # Smaller size of history with 10k entries
$ bash   # new session

Additional, after entering a command and hitting Return, it takes ages (several seconds) until the command is actually executed.

I guess this is another bottleneck that is related to the large history.

killermoehre commented 3 years ago
  • Q8: When it hangs, is the message loading history... always shown?

No, only when I try to access the history by using arrow_up to get the last command. If I execute a command from the start of my history, it just hangs.

  • Q9: If you reduces the size of the history, does the hanging time reduce as well? For example,
$ cd
$ cp .bash_history .bash_history.backup20210825 # Please make a backup! Please restore it after testing.
$ tail -10000 .bash_history.backup20210825 > .bash_history  # Smaller size of history with 10k entries
$ bash   # new session

Yes, it is fast.

  • Q10: Do you have erasedups in your $HISTCONTROL? If it contains erasedups, can you check if the above delay before the command execution goes away when you removed erasedups from HISTCONTROL?
$ echo $HISTCONTROL
ignoreboth:erasedups

If I remove erasedups everything is swooping and fast with the small history.

Here is the speed matrix:

large history small history
 w/ erasedups slow slow
w/o erasedups slow fast

For the time being I will set HISTSIZE and HISTCONTROL accordingly.

Here is the content of the file setting the HISTSIZE

declare -x -i HISTSIZE=$((1024 * 1024 * 5))
declare -x -i HISTFILESIZE="$HISTSIZE"
akinomyoga commented 3 years ago

Thank you for the information.

No, only when I try to access the history by using arrow_up to get the last command. If I execute a command from the start of my history, it just hangs.

I'm confused. What does "hangs" exactly mean? Does it mean it never responded no matter how long you waited (as far as you have tested)? Or does it just mean it takes some long time but you have observed the end of the processing?

I also have tested with a large history in my Linux. With the history size ~ 5M, it becomes slow but still doesn't hang. When I increased the size to about 10M, Bash run out the memory.

Can you tell me which bash in the above pstree result uses 100% CPU? For example, are they PID 28531 and 28536?

28531 and 28536 are the 100% CPU bash processes.

I checked them in my Linux, and the Bash processes in that position of the process tree actually just execute the command builtin history. If the above information is correct, it means that builtin history takes somehow long time.

$ time builtin history >/dev/null
$ time builtin history > tmpfile
$ time builtin history | ble/bin/awk '{}'

What is the result in the plain Bash (without ble.sh loaded) (with the large history)?

$ bash --norc # plain Bash
$ time history > /dev/null
$ time history > tmpfile
$ func() { gawk '{}'; }; time history | func
$ builtin history | ble/bin/sleep 60

Here is the speed matrix:

large history small history
 w/ erasedups slow slow
w/o erasedups slow fast

What are these slow and fast measure? I thought only the large history with erasedups should be slow, but it says slow even with small history with erasedups. Since you have reported that it is several seconds with large history and erasedups ("it takes ages (several seconds) until the command is actually executed."), I expect it will become less than 10/50 sec = 0.2 sec with small history of the 1/50 size. What do these slows actually mean?