xlucn / fontpreview-ueberzug

preview fonts in fzf
MIT License
163 stars 11 forks source link

Use awk to speed the script up #15

Closed varagnac closed 3 years ago

varagnac commented 3 years ago

Use awk instead of while loop to speed the script up.

xlucn commented 3 years ago

Thanks for the PR. That seems to speed it up from 0.045 seconds down to 0.025 on my machine, but there is one thing: this way you are not excluding the font families that are not supported by convert.

So to be fair about the comparison, if I remove line 77 and 80 down here, I get really close speed as awk version, with 0.028 seconds (you can time it by remove the ending fzf command and loop the script for 10 - 100 times) https://github.com/OliverLew/fontpreview-ueberzug/blob/cceefdf2fa216cfc791ffb3364d89a973fb1d0c0/fontpreview-ueberzug#L77-L84

Although I only timed it on my machine, the data might not be representative.

varagnac commented 3 years ago

It is rather strange because the original code runs extremely slow on my machine:

fc-list -f "%{family}:%{style}:%{file}\n"  0.20s user 0.05s system 117% cpu 0.209 total
sort  0.03s user 0.00s system 0% cpu 10.130 total
uniq  0.17s user 0.21s system 3% cpu 11.298 total

while:

fc-list -f "%{family}:%{style}:%{file}\n"  0.20s user 0.03s system 111% cpu 0.205 total
sort  0.03s user 0.01s system 12% cpu 0.276 total
uniq  0.00s user 0.01s system 2% cpu 0.282 total
awk -F':'   0.06s user 0.01s system 25% cpu 0.287 total

This is the motivation for my pull request.

OK, this is way too slow. The problem is surely with zsh. So now with bash:

time fc-list -f "%{family}:%{style}:%{file}\n" | sort | uniq |  while IFS=: read -r family style file; do      [ "${families%${family%,*}*}" = "$families" ] && continue;      printf "%s%s\t%s\n" "${family%,*}" "${style:+:style=${style%,*}}" "$file";  done
real    0m1.428s
user    0m1.175s
sys 0m0.897s

But there is no output printed at all, even without time. And for the awk code:

real    0m0.335s
user    0m0.384s
sys 0m0.034s

With all the output printed.

I got 7522 fonts installed, by the way.

xlucn commented 3 years ago

I didn't think of other shells, those may be much slower than what I am using, which is dash. Also the power of cpu, speed of disk and number of fonts all could affect the execution time. However, I'm still shocked with 1.428s for just running the script once :)

So I am intrigued to change to awk. But can you filter the font families to only show those in convert -list font, as I pointed out before? Without this, some users are seeing errors #4 with a lot of unsupported fonts.

xlucn commented 3 years ago

I got 7522 fonts installed, by the way.

That explains all, I only have 200+ fonts in my system. I am not sure if that number is common for most users XD

I guess even 1000 fonts could cost an acceptable time even with the current version.

Maybe it now depends on how fast it will be if you added the font support check, that could cost some time for that large amount of fonts I think.

varagnac commented 3 years ago

In fact I'm really not sure what that convert line is doing, since it outputs about 1500 fonts here but at the end of the day the number of fonts are around 150 lesser than the code without the convert line (Oh, I see. Now I understand what that line is trying to do). Some fonts that are supported by convert are excluded, for example ZCOOL xiaowei, BM HANNA_TTF, Gaegu(style=light). The differences can be shown via

fc-list -f "%{family}:%{style}:%{file}\n" | sort | uniq |awk -F':' '{split ($1,a,",")} {split ($2,b,",")} {print a[1]":style="b[1] "\t" $3}' | awk -F':' '{print $1}' | sort | uniq -c > a.txt
families=$(convert -list font | sed -n 's/ *family: //p' | sort | uniq)
    fc-list -f "%{family}:%{style}:%{file}\n" | sort | uniq |
    while IFS=: read -r family style file; do
        [ "${families%${family%,*}*}" = "$families" ] && continue
        printf "%s%s\t%s\n" "${family%,*}" "${style:+:style=${style%,*}}" "$file"
    done | awk -F':' '{print $1}' | sort | uniq -c> b.txt
diff b.txt a.txt
xlucn commented 3 years ago

In fact I'm really not sure what that convert line is doing

It's not a perfect solution but I used that to exclude most of the fonts (.pcf fonts and alike) that would return error in the other convert command which generates the preview image, see the issue I linked above, where that user reported 75% of the fonts are not previewable. And, convert should support (most of) the fonts from convert -list font, obviously.

If I (or with other's help) can preview any font from fc-list, then that is not need anymore.

Maybe this PR could wait until better font support is here, if it's not compatible with the current approach.

varagnac commented 3 years ago

It seems that not all fonts listed by convert -list fonts can actually be read. For example, .otb formatted Terminus fonts are listed, but convert cannot read them. Maybe just pipe the output to grep -E 'ttc|otf|ttf' will be good enough?

xlucn commented 3 years ago

Maybe just pipe the output to grep -E 'ttc|otf|ttf' will be good enough?

That's brilliant actually, it's much simpler. We can do this instead of comparing with convert -list font.

It seems that not all fonts listed by convert -list fonts can actually be read.

Yes. It seems it's more about font formats.

So if you could add that into the pipeline and it would turn out to do the job, I'm quite happy to merge this PR :) The code format should be improved though, i.e., put awk and fzf into new lines.

xlucn commented 3 years ago

I merged it and will add the grep filter. Thanks for the PR and the idea!

xlucn commented 3 years ago

Just found fontconfig's cli commands has built-in format feature (FcPatternFormat) to do this, which is quite interesting. So even awk is not needed now.