akinomyoga / ble.sh

Bash Line Editor―a line editor written in pure Bash with syntax highlighting, auto suggestions, vim modes, etc. for Bash interactive sessions.
BSD 3-Clause "New" or "Revised" License

Memory footprint investigation #214

Open ghost opened 2 years ago

ghost commented 2 years ago

Not a bug / just a memory-requirement investigation, for a single process:

bash --norc              0.4M
bash                     1.4M
zsh + syntax-highlight   1.5M

bash 5.1.16 + ble 0.3.3  15M
Konsole (GUI terminal)   23M
bash 5.1.16 + ble 0.4.0  24M

ble 0.3.3 thus requires 10x more memory than zsh (which is a complete shell), and ble 0.4.0 requires 16x more than zsh. I just wonder whether this is nominal, or whether some hidden activity (debugging or otherwise) could be disabled to get closer to zsh in terms of memory requirements.
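
For reference, per-process figures like these can be obtained on Linux from the resident set size (a hypothetical sketch; the exact measurement method is not stated above):

# Print the resident set size of the current shell in megabytes
# (procps ps reports RSS in KiB).
ps -o rss= -p "$$" | awk '{printf "%.1fM\n", $1 / 1024}'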

akinomyoga commented 2 years ago

Yes, actually this is one of the recent big problems. Currently I don't have any idea how to reduce the memory use. What I obtained in my previous investigation is this:

bash-3.2..4.2 + ble.sh ~ 16MB
bash-4.3..5.0 + ble.sh ~ 26MB
bash-5.1      + ble.sh ~ 33MB

From these observations, my naive guess at that time was that the binary representation of functions in the Bash process consumes the memory, but I haven't further tracked down the actual cause.

Edit: In any case, I don't think it is possible to make the footprint the same order as zsh and plain bash as long as ble.sh is implemented in a shell script.
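
That guess can be probed with a rough experiment (hypothetical code, not from this investigation; it assumes Linux's /proc): defining many small functions measurably grows the shell's resident set size.

# Define 10000 tiny functions and compare the shell's RSS before/after.
rss() { awk '/^VmRSS/ {print $2, $3}' "/proc/$$/status"; }
echo "before: $(rss)"
for ((i = 0; i < 10000; i++)); do
  eval "fn$i() { local x=\$1; echo \"\$x\"; }"
done
echo "after:  $(rss)"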

ghost commented 2 years ago

Anyway, after reading Don’t Ever Use Zsh, I am not motivated to switch back to zsh.

As it is quite difficult to track which functionalities have progressively required so much more memory over time, maybe the relevant question is now: when stabilized, could ble.sh one day be rewritten in C, like a bash-completion module, and offer features to be loaded as modules (like zsh), from a bare-minimum syntax highlighter (my very limited requirement) up to more complex features? I really don't know / just thinking out loud.

Not an issue for me because, in the meantime, due to kate/dolphin konsole issues (ble.sh getting stuck in multiline mode when starting / with { cd ... ; clear } commands), I only start ble.sh for konsole with this workaround in .bashrc, and then use only one ble.sh process:

# Attach ble.sh only when this konsole session was not launched from kate/dolphin
[[ ${BLE_VERSION-} && ! $(ps ax | tail | grep '[b]in/kate\|[b]in/dolphin') ]] && ble-attach

akinomyoga commented 2 years ago

when stabilized, could ble.sh one day be rewritten in C,

Of course, if you are willing to work on that, I won't stop you, but at least I will not work on it. If I were to abandon ble.sh and write a line editor in C someday, I would write a line editor that has a completely different feature set than ble.sh (because the current feature set is strongly affected by the language itself), but that is unlikely to happen because I'm satisfied and comfortable with the current feature set of ble.sh.

like a bash-completion module,

bash-completion has never been translated into C. It's been written in Bash from the beginning and is still written in Bash now.

ghost commented 2 years ago

bash-completion has never been translated into C

You are right; I confused it with the bash source. I think this thread can then be closed, unless you decide otherwise.

akinomyoga commented 2 years ago

Let me keep this open for a while. Maybe later I'll investigate why the memory use is increasing in more recent versions of Bash.

ghost commented 2 years ago

Please also keep in mind that in my first message, the same bash 5.1.16 was used to compare ble.sh 0.3.3 and 0.4.0, which showed a big memory increase. I currently use 0.3.3+7153250.

akinomyoga commented 2 years ago

I know that. It's just because the feature set of 0.4 is significantly larger than that of 0.3.

ghost commented 1 year ago

Afaik my basic understanding tells me that loops can be the main cause. They're the ones that always hold back execution timings. Do you have any loop statements btw? If so, try alternative data methods such as implementing a binary tree structure, etc. I'm not a programmer btw, just asking whether my analogy makes sense...

akinomyoga commented 1 year ago

Afaik my basic understanding tells me that loops can be the main cause. They're the ones that always hold back execution timings.

"The main cause of what" are you talking about? This issue is for the large memory footprint and is unrelated to the execution times.

Do you have any loop statements btw?

Yes, definitely.

If so, try alternative data methods such as implementing a binary tree structure, etc.

I'm still not sure what you are talking about. Binary tree structures can be used, e.g., to reduce the time complexity of searching, random insertion, etc., but they worsen the problem discussed here, i.e., they increase the memory footprint. Also, modern CPU and memory architectures rely on locality of reference for caching, so binary-tree structures typically cannot compete with flat arrays at realistic data sizes, even though a binary tree theoretically scales better than a flat array for very large data. In particular, the speed becomes unendurably slow if the binary trees are implemented in Bash script instead of natively at the C level.
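
To make that concrete, here is a hypothetical sketch (not ble.sh code) of a binary search tree emulated in pure Bash: each node needs entries in up to three parallel arrays, so it strictly costs more memory than keeping the same keys in one flat array, on top of being slow.

# Hypothetical sketch: a binary search tree via parallel indexed arrays.
declare -a key=() left=() right=()
bst_insert() {  # usage: bst_insert INT
  local -i v=$1 i=0 n=${#key[@]}
  if ((n == 0)); then key[0]=$v; return; fi
  while :; do
    if ((v < key[i])); then
      if [[ ${left[i]-} ]]; then i=${left[i]}; else left[i]=$n; key[n]=$v; return; fi
    else
      if [[ ${right[i]-} ]]; then i=${right[i]}; else right[i]=$n; key[n]=$v; return; fi
    fi
  done
}
for x in 42 17 99 3; do bst_insert "$x"; done  # builds the tree in the arrays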

Maybe you are talking about loop vs. recursion. In that case, I must say that a loop is generally more efficient and free from the tight limitation imposed by call-stack sizes, though recursion over a recursive data structure can be implemented easily in a clean and plain way. In any case, loops and recursion are logically identical and are just different representations of the same logic, i.e., each can be converted to the other without changing the time complexity. Rewriting loops as recursion does not solve any problem; it just introduces the call-stack size limitation.
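
For illustration (hypothetical code, not from ble.sh), here is the same summation written as recursion and as a loop; both are O(N), but only the recursive form is bounded by the call-stack depth (and by FUNCNEST, if set):

sum_rec() {   # sum_rec N; result in the global variable "ret"
  local -i n=$1
  if ((n == 0)); then
    ret=0
  else
    sum_rec $((n - 1))   # one stack frame per step
    ((ret += n))
  fi
}
sum_loop() {  # same logic as a loop; no stack growth
  local -i i
  ret=0
  for ((i = 1; i <= $1; i++)); do ((ret += i)); done
}
sum_rec 1000 && echo "$ret"       # fine
sum_loop 1000000 && echo "$ret"   # fine; the recursive form would go very deep here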

I'm not a programmer btw, just asking whether my analogy makes sense...

Currently, it doesn't seem to make sense to me.

ghost commented 1 year ago

Understood. Tq for the explanation. I'm just asking based on guesswork, so don't mind me anyway. I was confused by the issue heading btw; I thought memory scaling & time complexity were similar. But I gotta say, ble.sh also feels kinda slow in a lot of interactive moments (which is another issue).

akinomyoga commented 1 year ago

Time complexity and space complexity are typically anti-correlated. One can improve the time complexity by sacrificing the space complexity, or vice versa.
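
A classic illustration of that trade-off (a hypothetical sketch; it assumes bash >= 4.0 for associative arrays) is memoization, which spends memory on a cache to avoid recomputation:

# Memoized Fibonacci: the cache costs memory but makes lookups O(1).
declare -A fib_cache=()
fib() {  # fib N; result in the global variable "ret"
  local -i n=$1
  if [[ ${fib_cache[$n]-} ]]; then
    ret=${fib_cache[$n]}   # cache hit: no recomputation
    return
  fi
  if ((n < 2)); then
    ret=$n
  else
    local -i a b
    fib $((n - 1)); a=$ret
    fib $((n - 2)); b=$ret
    ret=$((a + b))
  fi
  fib_cache[$n]=$ret       # spend space to save time on later calls
}
fib 40 && echo "$ret"      # fast with the cache; exponential time without it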

But I gotta say, ble.sh also feels kinda slow in a lot of interactive moments (which is another issue).

This is just because it is implemented in a Bash script. Bash programs are simply slow compared to programs written in C and compiled into machine code.

Also, it seems to depend on the operating system. I often receive performance reports from macOS users and WSL users, but I have never received performance complaints from Linux users except in unusual cases, such as a directory with tens of thousands of files. Also, I mainly use Linux and haven't felt that ble.sh is too slow, though I admit that ble.sh is not so fast. I have never tried macOS because I don't have it, but my naive guess is that the performance may be related to the default filesystem of the operating system.

akinomyoga commented 2 months ago

I've identified a bug in Bash that causes the larger memory footprint in its recent versions. I submitted a patch:

https://lists.gnu.org/archive/html/bug-bash/2024-06/msg00000.html