petobens / trueline

Fast and extensible bash powerline prompt with true color and fancy icon support
MIT License

Not actually so fast #17

Closed by p4vook 4 years ago

p4vook commented 4 years ago

The prompt renders very slowly after each command, so I can usually type a letter or two while it is rendering.

petobens commented 4 years ago

If you provide exact steps to reproduce, I can try to find the bottleneck. It should be way faster than python powerline.

p4vook commented 4 years ago

Thank you very much for such a fast reply! I just cloned the git repo and added source ~/Git/trueline/trueline.sh to my .bashrc. My bashrc is fairly clean:

[[ $- != *i* ]] && return

export PATH=~/Apps/:$PATH
export EDITOR=vim
alias ls='exa -a' $*
source ~/Git/trueline/trueline.sh

This is my environment:

                  -`                    paulk@pavelthebest 
                  .o+`                   ------------------ 
                 `ooo/                   OS: Arch Linux x86_64 
                `+oooo:                  Host: HP Pavilion Laptop 14-bf1xx 
               `+oooooo:                 Kernel: 5.4.4 
               -+oooooo+:                Uptime: 1 hour, 1 min 
             `/:-:++oooo+:               Packages: 1317 (pacman) 
            `/++++/+++++++:              Shell: bash 5.0.11 
           `/++++++++++++++:             Resolution: 1920x1080 
          `/+++ooooooooooooo/`           DE: Plasma 
         ./ooosssso++osssssso+`          WM: KWin 
        .oossssso-````/ossssss+`         Theme: Breeze-Dark [GTK2/3] 
       -osssssso.      :ssssssso.        Icons: breeze-dark [GTK2/3] 
      :osssssss/        osssso+++.       Terminal: konsole 
     /ossssssss/        +ssssooo/-       Terminal Font: Cascadia Code PL 13 
   `/ossssso+/:-        -:/+osssso+-     CPU: Intel i7-8550U (8) @ 1.800GHz 
  `+sso+:-`                 `.-/+oso:    GPU: Intel UHD Graphics 620 
 `++:.                           `-/+/   GPU: NVIDIA GeForce 940MX 
 .`                                 `/   Memory: 5321MiB / 15929MiB

I measured it: rendering takes about 100 ms, while powerlevel10k (for zsh) takes about 9 ms.

petobens commented 4 years ago

Powerlevel10k is indeed faster because it has tons of optimizations which trueline doesn't (I could probably include the git optimizations here, but I need to think about it because I want to keep this as simple as possible). How are you measuring? I'm using hyperfine and getting:

hyperfine --warmup 5 -r 30 "/usr/bin/bash --rcfile ~/Desktop/min_bashrc2.sh -i -c 'echo -n'"
Benchmark #1: /usr/bin/bash --rcfile ~/Desktop/min_bashrc2.sh -i -c 'echo -n'
  Time (mean ± σ):      46.3 ms ±  14.6 ms    [User: 42.1 ms, System: 3.8 ms]
  Range (min … max):    14.3 ms …  80.1 ms    30 runs

where the min_bashrc2.sh file simply loads trueline.
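For readers without hyperfine, the same kind of startup measurement can be roughly approximated in plain shell. This is only a sketch: `mean_startup_ms` is a hypothetical helper, not part of trueline or hyperfine, it assumes GNU `date` (for the `%N` nanosecond format), and it does no warmup or outlier handling, so its numbers will be noisier than hyperfine's.

```shell
# Hypothetical helper (not part of trueline): mean wall-clock time, in
# milliseconds, to start an interactive bash that loads the given rc file.
# Assumes GNU date, whose %N format prints nanoseconds.
mean_startup_ms() {
    local rcfile="$1"
    local runs="${2:-30}"
    local i=0
    local start end
    start=$(date +%s%N)
    while [ "$i" -lt "$runs" ]; do
        # Same command shape as the hyperfine benchmark above.
        bash --rcfile "$rcfile" -i -c 'echo -n' > /dev/null 2>&1 < /dev/null
        i=$((i + 1))
    done
    end=$(date +%s%N)
    # Nanoseconds -> milliseconds, averaged over all runs.
    echo $(( (end - start) / runs / 1000000 ))
}
```

Running `mean_startup_ms /dev/null` gives a baseline with an empty rc file, and `mean_startup_ms ~/Git/trueline/trueline.sh` gives the figure with trueline loaded; the difference between the two is the part attributable to sourcing trueline.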

p4vook commented 4 years ago

I use this benchmarking command:

perf stat -ddd bash -i -c "echo -n"

 Performance counter stats for 'bash -i -c echo -n':

              7.21 msec task-clock:u              #    0.951 CPUs utilized          
                 0      context-switches:u        #    0.000 K/sec                  
                 0      cpu-migrations:u          #    0.000 K/sec                  
               273      page-faults:u             #    0.038 M/sec                  
         2,735,482      cycles:u                  #    0.380 GHz                      (2.00%)
        19,340,278      instructions:u            #    7.07  insn per cycle           (43.67%)
         5,585,796      branches:u                #  775.175 M/sec                    (85.42%)
           116,571      branch-misses:u           #    2.09% of all branches        
         6,424,983      L1-dcache-loads:u         #  891.634 M/sec                  
           203,327      L1-dcache-load-misses:u   #    3.16% of all L1-dcache hits  
             5,431      LLC-loads:u               #    0.754 M/sec                    (14.58%)
     <not counted>      LLC-load-misses:u                                             (0.00%)
   <not supported>      L1-icache-loads:u                                           
     <not counted>      L1-icache-load-misses:u                                       (0.00%)
     <not counted>      dTLB-loads:u                                                  (0.00%)
     <not counted>      dTLB-load-misses:u                                            (0.00%)
     <not counted>      iTLB-loads:u                                                  (0.00%)
     <not counted>      iTLB-load-misses:u                                            (0.00%)
   <not supported>      L1-dcache-prefetches:u                                      
   <not supported>      L1-dcache-prefetch-misses:u                                   

       0.007579162 seconds time elapsed

       0.007510000 seconds user
       0.000000000 seconds sys

UPD: I get even worse results when using the rcfile straight from the repository:

perf stat -ddd bash --rcfile ~/Git/trueline/trueline.sh -i -c "echo -n"

 Performance counter stats for 'bash --rcfile /home/paulk/Git/trueline/trueline.sh -i -c echo -n':

             13.63 msec task-clock:u              #    0.960 CPUs utilized          
                 0      context-switches:u        #    0.000 K/sec                  
                 0      cpu-migrations:u          #    0.000 K/sec                  
               270      page-faults:u             #    0.020 M/sec                  
         7,777,657      cycles:u                  #    0.570 GHz                      (14.31%)
        16,689,186      instructions:u            #    2.15  insn per cycle           (36.29%)
         3,556,214      branches:u                #  260.825 M/sec                    (58.26%)
           115,589      branch-misses:u           #    3.25% of all branches          (79.94%)
         6,338,851      L1-dcache-loads:u         #  464.914 M/sec                  
           202,392      L1-dcache-load-misses:u   #    3.19% of all L1-dcache hits  
             4,968      LLC-loads:u               #    0.364 M/sec                    (41.74%)
             1,246      LLC-load-misses:u         #   25.08% of all LL-cache hits     (20.06%)
   <not supported>      L1-icache-loads:u                                           
     <not counted>      L1-icache-load-misses:u                                       (0.00%)
     <not counted>      dTLB-loads:u                                                  (0.00%)
     <not counted>      dTLB-load-misses:u                                            (0.00%)
     <not counted>      iTLB-loads:u                                                  (0.00%)
     <not counted>      iTLB-load-misses:u                                            (0.00%)
   <not supported>      L1-dcache-prefetches:u                                      
   <not supported>      L1-dcache-prefetch-misses:u                                   

       0.014203336 seconds time elapsed

       0.010616000 seconds user
       0.003411000 seconds sys

petobens commented 4 years ago

So you are getting 7.5ms with a clean bashrc file and 106ms with trueline?

p4vook commented 4 years ago

Well, I cannot count zeroes... :) 14 ms with trueline is not that bad. I actually got the 100 ms result by timing how long it takes to render a full window of prompts (45 lines in my case), which took about 5 seconds. I don't actually know why the benchmark results differ...
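One possible explanation for the gap, offered as an assumption rather than something confirmed in this thread: `bash -i -c '...'` sources the rc file but never displays a prompt, so those benchmarks time shell startup and not prompt rendering. The sketch below times the prompt hook itself, mirroring the "45 prompts in about 5 seconds" estimate (5 s / 45 is roughly 111 ms per prompt). `time_prompt_hook` is a hypothetical helper, not part of trueline; it assumes GNU `date` (for `%N`) and that the prompt is built by a command stored in `PROMPT_COMMAND`.

```shell
# Hypothetical helper (not part of trueline): average wall-clock cost, in
# microseconds, of running a prompt hook a number of times.
# Assumes GNU date, whose %N format prints nanoseconds.
time_prompt_hook() {
    # Underscore-prefixed names to reduce the chance of clashing with
    # variables that the evaluated hook sets.
    local _hook="$1"
    local _runs="${2:-45}"
    local _i=0
    local _start _end
    _start=$(date +%s%N)
    while [ "$_i" -lt "$_runs" ]; do
        eval "$_hook" > /dev/null 2>&1
        _i=$((_i + 1))
    done
    _end=$(date +%s%N)
    # Nanoseconds -> microseconds, averaged over all runs.
    echo $(( (_end - _start) / _runs / 1000 ))
}
```

From an interactive shell that has already sourced trueline.sh, something like `time_prompt_hook "$PROMPT_COMMAND" 45` would print the average microseconds per prompt. It does not include the time the terminal emulator needs to draw the result.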

petobens commented 4 years ago

Hahah I also got confused. Anyway, try running tests with hyperfine (it's pretty neat and tends to give consistent results)... We can, however, leave this issue open to report further bottlenecks (particularly with git repositories, where we can definitely try to borrow some tricks from powerlevel10k).

p4vook commented 4 years ago

I get about 3 ms using hyperfine, which is kind of confusing. I also realized that I can type a letter during rendering with powerlevel as well. The actual rendering time is 14 ms, and that's very good. The tests I performed at first were wrong. I even think that borrowing some tricks from powerlevel10k could make the prompt faster than powerlevel. Thank you for developing such an amazing prompt!

petobens commented 4 years ago

Ok. Let's close this one then. Feel free to open another issue to see if we can steal some fancy stuff from powerlevel10k (if needed). Happy new year.