[Discussion] Per-segment performance analysis

Powerlevel9k was a tool for building a beautiful and highly functional CLI, customized for you. P9k had a substantial impact on CLI UX, and its legacy is now continued by P10k.

https://github.com/romkatv/powerlevel10k

MIT License

[Discussion] Per-segment performance analysis #1093

Open Maxattax97 opened 5 years ago

Maxattax97 commented 5 years ago

I've been interested in making my ZSH a little snappier, so I did some performance analysis on how each segment affects the total time to display a fresh prompt. You can see a summary here on the wiki. These are rough numbers, but I think they give a good starting point.

One of the things I find interesting here that I'd like to improve, if feasible, is the amount of time it takes to display the current directory, as it's the 3rd most expensive segment I analysed. Curious what the rest of the community thinks about this, other segments, additional statistics, plan of attack, etc.

dritter commented 5 years ago

This is awesome. Thanks, @Maxattax97. I recently thought about this as well (https://github.com/bhilburn/powerlevel9k/issues/374#issuecomment-438872132). But first, some questions about your method:

1. You measured the whole prompt rendering, right? Did you have all segments configured?
2. The 100 percent distribution covers all configured segments, so the point of this measurement is to say that segment A is more expensive than segment B, regardless of how fast segment A renders in absolute terms, right?
3. Since you identified the dir segment as one of the most expensive: did you use a truncation strategy?

I thought about a slightly different approach: we could measure the timing of each segment on its own by calling the segment function directly (like I did here). So why not trigger this on every build, or as a cron job (Travis FTW)? The results would not be super scientific either, because Travis builds are executed on VMs, so IOPS would be on the floor. But we would get an indication of whether the performance of segment A got worse or better. To be able to compare the numbers, we would need to calculate a baseline anyway. Other downsides would be:

We cannot test all segments (the VMs won't have a battery..)
We won't measure other P9K crucial code (e.g. multiline output of segments)
Some segments depend on external things (like the performance of the VCS depends on the size of the repo)
Some segments are especially slow on certain OSes (like on Windows with WSL)

But I think this would be worth it. Having a grip on how the individual segments perform is a plus. And creating a nice graph on a website is another.
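
A rough sketch of what "calling the segment function directly" could look like as a timing helper. `time_segment` is a hypothetical name, not part of P9k, and the sketch assumes a `date` that supports `%N` (GNU coreutils; the stock macOS `date` does not). The figure includes loop overhead, so it is only useful for comparing runs against each other, not as an absolute cost.

```shell
# Hypothetical helper (not part of P9k): run a command N times and print
# the mean wall-clock time per call in milliseconds.
# Assumes `date +%s%N` yields nanoseconds (GNU date).
time_segment() {
  local runs="$1" i=0 t0 t1
  shift
  t0=$(date +%s%N)
  while [ "$i" -lt "$runs" ]; do
    "$@" >/dev/null 2>&1      # the segment function (or any command) under test
    i=$((i + 1))
  done
  t1=$(date +%s%N)
  # Integer nanoseconds per call, formatted as milliseconds.
  awk -v ns="$(( (t1 - t0) / runs ))" 'BEGIN { printf "%.3f\n", ns / 1e6 }'
}
```

Usage would be something like `time_segment 100 my_segment_function`, where `my_segment_function` stands in for whichever segment function is being measured. Timing a no-op such as `true` the same way gives a crude baseline for the loop overhead.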

// cc @robobenklein and @Erowlin who helped on performance testing the VCS segment
// cc @ijoseph for his awesome graphs in #374 (see here)

robobenklein commented 5 years ago

Addressing specifically the current directory segment, I actually use my own custom shortening function that I updated.

I don't call it 'truncating' though, I use 'reverse tabbing' (the tab-expanded result is unique to that dir): https://github.com/robobenklein/configs/blob/master/zsh/plunks/rtabfunc.zsh

@Maxattax97 Could you see if that function is either worse or better than the current implementation? If better, I could make a PR for it. It only uses one subshell call for the path exploration step, and would need to be changed from zstyle config to P9K vars.

As for the VCS segment, I gave up on improving the vcs_info zsh autoload any more and just ended up making that asynchronous. (I wanted to make anything involving an external call async.)

As for performance testing in Travis, I would recommend using the zsh profiler builtin. I use it often enough that I have a var ZSHRC_ZPROF that I can use to profile my personal configs.

    (( ${+_ZSHRC_ZPROF} )) && zmodload zsh/zprof
    # do stuff
    zprof | head -30

Maxattax97 commented 5 years ago

To your 1, 2, and 3 my answer is yes, yes, and yes. Again, not as scientific as I'd like it to be :sweat_smile:.

I like everything @dritter just said, and I'm willing to contribute code. Those are some sexy graphs, and I've recently got my hands dirty with matplotlib and some of the statistical Python libraries (I'm currently studying data science).

Some thoughts:

If we use Z-scores for our performance analysis on Travis, the results will generalize a little better (minus the areas you've mentioned, such as battery or VCS)
For the segments like battery on Travis, maybe we can shim? Maybe this is not wise
I don't know if we can make GitHub pull the latest data and/or graphs from Travis builds, but worst case we can have Travis dump the results, copy them out, generate the graphs manually, and then update the Wiki or some other doc
Comparing performance across ZSH versions would also be intriguing; I see that's already set up for testing in the Travis matrix
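
For reference, the Z-score idea could be sketched as below. This assumes one timing per input line and uses the population standard deviation; `zscores` is a hypothetical helper name, and whether population or sample deviation is the better choice here is left open.

```shell
# Hypothetical helper: read one timing per line on stdin and print each
# value with its Z-score, (x - mean) / sd, using the population sd.
zscores() {
  awk '{ x[NR] = $1; sum += $1 }
       END {
         if (NR == 0) exit
         mean = sum / NR
         for (i = 1; i <= NR; i++) ss += (x[i] - mean) ^ 2
         sd = sqrt(ss / NR)
         for (i = 1; i <= NR; i++)
           printf "%s %.2f\n", x[i], (sd > 0 ? (x[i] - mean) / sd : 0)
       }'
}
```

For example, `printf '%s\n' 2 4 4 4 5 5 7 9 | zscores` has mean 5 and standard deviation 2, so the first line is `2 -1.50` and the last is `9 2.00`.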

@robobenklein's recommendation for the ZSH profiler sounds wise. I haven't explored it extensively, but I thought it could only be used for loading ZSH. Knowing now that it can do much more is great news.

I like the shell scripts @dritter wrote for testing VCS performance; I can see this becoming very simple very fast with a couple of shell functions.

dritter commented 5 years ago

My requirements for the graphs are:

They should be easy to read (like a simple bar graph, showing performance gets "better" or "worse", whatever that means in real time; maybe a "+2 seconds compared to last datapoint").
They should be triggered for every build/branch. That should be doable in Travis. After generating the charts, it could update the wiki (probably with the help of a GitHub bot).
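
The "compared to last datapoint" part could be sketched as follows. It assumes each build writes a plain text file with one `segment milliseconds` pair per line; that file format and the name `compare_timings` are made up for illustration.

```shell
# Hypothetical helper: print the signed change per segment between two
# timing files (previous, current), each with "segment ms" per line.
# Segments missing from the previous file are skipped.
compare_timings() {
  awk 'NR == FNR { prev[$1] = $2; next }
       ($1 in prev) { printf "%s %+.2f ms\n", $1, $2 - prev[$1] }' "$1" "$2"
}
```

With a previous file containing `dir 3.50` and `vcs 2.00`, and a current file containing `dir 2.25` and `vcs 2.50`, this prints `dir -1.25 ms` and `vcs +0.50 ms`.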

Shimming some of the functions is a trade-off. It could be worth knowing how long our own code runs, but it makes the comparison a bit like apples to bananas: a statement like "Segment A is 4 times slower than Segment B" is not meaningful when one segment is shimmed and the other is not. That is why I think we can only compare segment timings across different runs/branches/ZSH versions.
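
One possible shape for such a shim is a stand-in executable placed ahead of the real one on `PATH`, so the segment code runs unchanged while the external call returns canned output. This is only a sketch: `make_shim` and `battery_tool` are hypothetical names, and it does not reflect how the battery segment actually obtains its data.

```shell
# Hypothetical helper: create an executable NAME in DIR that prints a fixed
# OUTPUT, so it can stand in for an external tool during a timing run.
make_shim() {
  local shim_dir="$1" name="$2" output="$3"
  mkdir -p "$shim_dir" || return 1
  # The outer here-document expands $output once; the generated script
  # prints it verbatim through its own quoted here-document.
  cat > "$shim_dir/$name" <<EOF
#!/bin/sh
cat <<'SHIM_OUTPUT'
$output
SHIM_OUTPUT
EOF
  chmod +x "$shim_dir/$name"
}
```

After `make_shim /tmp/shims battery_tool "Battery 0: Discharging, 42%"`, running with `PATH="/tmp/shims:$PATH"` makes `battery_tool` resolve to the shim. The timing then covers only the segment's own code, which is exactly the trade-off described above.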

Not sure if we need to fire up the whole zprof. On the one hand, it would be interesting to see exactly which function call takes up the most time; on the other hand, all we need to generate a simple, easily readable graph is the timing of each segment as a whole.

@Maxattax97 It would be awesome if you could go ahead and produce some code!

Maxattax97 commented 5 years ago

It might be a little while before I can push code, since finals week is coming up here at university, but I'm intrigued by this, especially since it's a tool I use daily.

I've never used GitHub bots, but if we could get at least a rough idea on paper of how to move data/graphs from Travis to the Wiki, I think that'd help in planning out some of the code.

Maxattax97 commented 5 years ago

@robobenklein I just ran some tests on rtab while I had a bit of spare time, and it seems to have reduced its mean delta time to 1.48% (or 2.14% better than truncation), which makes it now the 6th most expensive (VCS is now 3rd at 2.06%), and the variance has decreased significantly as well. While rtab doesn't have the exact same functionality as truncation (I like having each folder collapsed to 3 characters), it is very close and IMO worthy of being pulled into the repository.

I should note that I wasn't able to get it to work exactly like in your own .zshrc, so I made some slight modifications to get it working. This may affect performance.

    if (( ${+functions[rtab]} )); then
        echo "RTAB detected, swapping out default directory display with RTAB ..."
        # POWERLEVEL9K_CUSTOM_RTAB_DIR="echo \${RTAB_PWD}"
        POWERLEVEL9K_CUSTOM_RTAB_DIR="echo \$(rtab -l -t)"
        POWERLEVEL9K_CUSTOM_RTAB_DIR_FOREGROUND="${POWERLEVEL9K_DIR_DEFAULT_FOREGROUND}"
        POWERLEVEL9K_CUSTOM_RTAB_DIR_BACKGROUND="${POWERLEVEL9K_DIR_DEFAULT_BACKGROUND}"
        # TODO: Make this dynamically replace dir with custom_rtab_dir
        # POWERLEVEL9K_LEFT_PROMPT_ELEMENTS=(time vcs newline os_icon ssh custom_rtab_dir dir_writable)
        POWERLEVEL9K_LEFT_PROMPT_ELEMENTS=(custom_rtab_dir)
        # typeset -a chpwd_functions
        # chpwd_functions+=(_rtab_pwd_update)
        # function _rtab_pwd_update() {
            # export RTAB_PWD=$(rtab -l -t)
        # }
        # _rtab_pwd_update
    fi

    # typeset -gA p10k_opts
    # p10k_opts=(
        # p10ks_cwd ';;;;rtab;-t;-l'
    # )

I sourced your script earlier in my .zshrc:

    if [[ -e "$HOME/src/robobenklein-config/zsh/plunks/rtabfunc.zsh" ]]; then
        source "$HOME/src/robobenklein-config/zsh/plunks/rtabfunc.zsh"
    fi
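
The TODO in the snippet above (swapping `dir` for `custom_rtab_dir` without hardcoding the whole list) might be handled with a small helper like this. `replace_segment` is a hypothetical name. It matches whole element names only, so `dir_writable` is left untouched.

```shell
# Hypothetical helper: print the given elements one per line, with every
# element exactly equal to OLD replaced by NEW.
replace_segment() {
  local old="$1" new="$2" el
  shift 2
  for el in "$@"; do
    if [ "$el" = "$old" ]; then
      printf '%s\n' "$new"
    else
      printf '%s\n' "$el"
    fi
  done
}
```

It could then be applied as `POWERLEVEL9K_LEFT_PROMPT_ELEMENTS=($(replace_segment dir custom_rtab_dir "${POWERLEVEL9K_LEFT_PROMPT_ELEMENTS[@]}"))`, which relies on segment names containing no whitespace.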