cdown / clipmenu

Clipboard management using dmenu
MIT License
1.1k stars 90 forks source link

Speed enhancement #180

Closed nick87720z closed 4 months ago

nick87720z commented 2 years ago

test-perf

clipmenu

clipmenud

Also clipmenu and clipctl are changed for posix shell compatibility, which also means higher speed if using dash (told to be fastest unix shell, in my earlier tests reached near 4x difference, though depends on usage).

Though doing same for daemon has little sence due to this bug.

nick87720z commented 2 years ago

What do you think about such pipeline for clipmenu:

LC_ALL=C sort -ruk 2 < "$cache_file" | LC_ALL=C sort -rnk 1 | sed 's/^[^ ]* //'

When I tested it using test-perf (which was not my first way for perf test), it was 10 times slower, than variant, using awk for deduplication. What's interesting, all slowdown was due to second sort, getting complete mess from first sort (in terms or ordering), while original input is already sorted, just needs to be reversion (curious, that tac is even slightly slower than sort).

However, in previous attempt I used a bit smaller line_cache file - just took existing file, created by regular use rather than perf test, and appended to itself multiple times - 10 or 100 (into separate file, of course). There are my preserved results:

# Performance test:
#time for i in $(seq 1000); do { LC_ALL=C sort -rnk 1 | cut -d' ' -f2- | awk '!seen[$0]++'; } < ./_line_cache_test; done >/dev/null
#time for i in $(seq 1000); do { LC_ALL=C sort -ruk 2 | LC_ALL=C sort -rnk 1 | cut -d' ' -f2-; } < ./_line_cache_test; done >/dev/null
# Run with line_cache size (wc output): 9999  50904 530048 _line_cache_test
# Dry run (feeding empty string instead of file): 5.1 -> 4.632 
# qalc in:  10.189 / 19.589, 4.632 / 5.1, (10.189−4.632) / (19.589−5.1), 10.189−4.632, 19.589−5.1
# qalc out: [0.5201, 0.9082, 0.3835, 5.557, 14.49]
# I.e., overhead old to new multiplier - 0.9082 and for text processing - 0.3835.

Can't understand - could it be if second sort was just in better situation, when input cache has just same relatively small chunk, repearing 10 to 100 times, rather than everything truly random?

Anyway, dry run showes, this variant should be faster to start than that, with awk dedup in the end.

nick87720z commented 2 years ago

3 more ways for awk dedup:

LC_ALL=C sort -rnk 1 < "$cache_file" | awk -v FS='^[^ ]* ' '!seen[$2]++ {print $2}'
LC_ALL=C sort -rnk 1 < "$cache_file" | sed 's/^[^ ]* //' | awk -v FPAT='^.*$' '!seen[$0]++'
LC_ALL=C sort -rnk 1 < "$cache_file" | sed 's/^[^ ]* //' | awk -v FS='\n' '!seen[$0]++'

Must be some difference, but it's just invisible. Tested with 20000 lines cache.

nick87720z commented 2 years ago

The reason for variant with two sorts to be slow - sort is not so fast with stdin. May be solved by using temporary file, however it will eat space. At cache sizes, when it really matters, it could exceed free disk space - test file for 40000 lines ate 280 Mb, leaving no free space in XDG_RUNTIME_DIR, which for me is just 394 Mb.

This snippet has ≈7/5 times higher performance:

LC_ALL=C sort -ruk 2 < "$cache_file" > "$cache_file-a"
LC_ALL=C sort -rnk 1 < "$cache_file-a" | sed 's/^[^ ]* //'
cdown commented 2 years ago

Hi! Sorry to be late getting back.

In general, we're not going to move away from bash with the convenience provided by bash arrays and other features, and readability is preferred over small performance improvements.

Maybe you can conslidate on a few changes which are particularly pertinent for review? As it is, these are a lot of changes, but performance hasn't been a large concern brought up about clipmenu in the past.

cdown commented 4 months ago

clipmenu is now in C, so closing since this applies to the old bash version. Thanks for your work on this, though!