zimfw / completion

Enables and configures smart and extensive tab completion.
MIT License
Locale change causes .zcompdump* to be deleted #13

Closed cattyhouse closed 1 year ago

cattyhouse commented 1 year ago

If the locale changes, e.g. from en_US.UTF-8 to C.UTF-8 (for example, ssh into a machine from Arch Linux and then ssh into the same machine from Alpine), then https://github.com/zimfw/completion/blob/8e20c7b81f4b20cd1907d475716d7093be95d3ef/init.zsh#L24 returns 1 and the .zcompdump* files get deleted. After that, https://github.com/zimfw/completion/blob/8e20c7b81f4b20cd1907d475716d7093be95d3ef/init.zsh#L31 runs, which is slow.

After https://github.com/zimfw/completion/blob/8e20c7b81f4b20cd1907d475716d7093be95d3ef/init.zsh#L23, the debug output (set -x) shows:

# log for [[ ${zold_dat} == ${znew_dat} ]]

[[ $'5.9\C-@/home/user/.zim/modules/zsh-completions/src/_afew\C-@... == 5.9/home/user/.zim/modules/zsh-completions/src/_afew... ]]

# the log is huge, so i use ... to replace the rest of all

As you can see, the difference is the \C-@ in there...

Note: why does the locale sometimes change via ssh? Because 1) ssh is set to send the locale and sshd is set to accept it, and 2) Alpine uses musl, which uses C.UTF-8, while other OSes use something other than C.UTF-8.

cattyhouse commented 1 year ago

steps to reproduce:

let's say machine A:

The above is a very normal configuration nowadays.

now, we ssh from another machine to A:

  1. LANG="en_US.UTF-8" ssh A
  2. ls --full-time .zcompdump*
  3. exit
  4. LANG="C.UTF-8" ssh A
  5. ls --full-time .zcompdump*

The two runs of ls --full-time .zcompdump* will show different timestamps.

ericbn commented 1 year ago

Hi. Thanks for reporting this.

I've tried exporting LANG with different values on my machine and in an archlinux docker container, and the .dat file didn't change. Can you try applying the patch below?

diff --git a/init.zsh b/init.zsh
index d7eb682..3ae58ae 100644
--- a/init.zsh
+++ b/init.zsh
@@ -20,15 +20,15 @@
   local -r znew_dat=${ZSH_VERSION}$'\0'${(pj:\0:)zcomps}$'\0'${(pj:\0:)zstats}
   if [[ -e ${zdumpfile}.dat ]]; then
     zmodload -F zsh/system b:sysread
-    sysread -s ${#znew_dat} zold_dat <${zdumpfile}.dat
     [[ ${zold_dat} == ${znew_dat} ]]; zdump_dat=${?}
+    LC_CTYPE=C sysread -s ${#znew_dat} zold_dat <${zdumpfile}.dat
   fi
   if (( zdump_dat )) command rm -f ${zdumpfile}(|.dat|.zwc(|.old))(N)

   autoload -Uz compinit && compinit -C -d ${zdumpfile}

   if [[ ! ${zdumpfile}.dat -nt ${zdumpfile} ]]; then
-    >! ${zdumpfile}.dat <<<${znew_dat}
+    zmodload -F zsh/system b:syswrite
+    LC_CTYPE=C syswrite ${znew_dat} >! ${zdumpfile}.dat
   fi
   # Compile the completion dumpfile; significant speedup
   if [[ ! ${zdumpfile}.zwc -nt ${zdumpfile} ]] zcompile ${zdumpfile}

I'm not even sure these commands would recognize the LC_CTYPE=C prefix. :-)

EDIT: Using syswrite to write the .dat file.

cattyhouse commented 1 year ago

The patch causes .zcompdump* to be regenerated on every ssh login, so every ssh session is slow to start.

> I've tried exporting LANG to different values in my machine and in an archlinux docker container and the .dat file didn't change

--> Going from en_US.UTF-8 to en_GB.UTF-8 is fine, but going to C.UTF-8 or zh_CN.UTF-8 is NOT ok. You probably did not notice the change because your /etc/locale.gen does not enable the locales mentioned. (For me, ssh into Alpine never has this issue, because Alpine does not have any locale settings: it is always C.UTF-8, no matter what the ssh client's locale is.) To reproduce, these conditions need to be met:

1) the client's ssh_config has `SendEnv LANG LC_*`
2) machine A's sshd_config has `AcceptEnv LANG LC_*`
3) the locales mentioned, e.g. `C.UTF-8` and `en_US.UTF-8`, are enabled in machine A's /etc/locale.gen and the command locale-gen has been run.

and then go to https://github.com/zimfw/completion/issues/13#issuecomment-1458051667
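
The first two conditions can be checked from a shell. The sketch below is only illustrative: `forwards_lang` is a hypothetical helper name, /etc/ssh/sshd_config is a common default path rather than something stated in this thread, and the helper does not follow `Include` directives or per-user config files.

```shell
# Hypothetical helper: succeeds if FILE contains an uncommented DIRECTIVE line
# (e.g. SendEnv or AcceptEnv) that mentions LANG.
# Usage: forwards_lang FILE DIRECTIVE
forwards_lang() {
  grep -Eiq "^[[:space:]]*$2[[:space:]=].*LANG" "$1" 2>/dev/null
}

# Assumed default location of the server config; adjust for your system.
if forwards_lang /etc/ssh/sshd_config AcceptEnv; then
  echo "sshd accepts LANG from clients"
else
  echo "no uncommented AcceptEnv line mentioning LANG found (or file not readable)"
fi
```

For the third condition, `locale -a` lists the locales available on the machine.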

ericbn commented 1 year ago

I've updated the patch above. Writing was not working. Can you please try again with the updated patch?

cattyhouse commented 1 year ago

patching file init.zsh patch: **** malformed patch at line 23: if [[ ! ${zdumpfile}.zwc -nt ${zdumpfile} ]] zcompile ${zdumpfile}

cattyhouse commented 1 year ago

Anyway, I hand-edited the file according to your new patch, and the result is the same: the patch causes .zcompdump* to be regenerated on every ssh login, so every ssh session is slow to start.

cattyhouse commented 1 year ago

So, I did something experimental on machine A (without your patch):

1) Edit /etc/ssh/sshd_config and comment out `AcceptEnv LANG LC_*` (disable it). This makes sshd stick with its own locale instead of adopting the client's.
2) Restart the sshd daemon.

Now, no matter where I ssh into A from, and with whatever client, the ls --full-time .zcompdump* timestamps do not change, and ssh is fast.

So I am pretty sure this issue is caused by the locale, and the key question is: where does this \C-@ come from when the locale changes?

# log for [[ ${zold_dat} == ${znew_dat} ]]

[[ $'5.9\C-@/home/user/.zim/modules/zsh-completions/src/_afew\C-@... == 5.9/home/user/.zim/modules/zsh-completions/src/_afew... ]]

# the log is huge, so i use ... to replace the rest of all

cattyhouse commented 1 year ago

I guess sysread or syswrite does not honor an environment variable prefix. I did another experiment:


# Check if dumpfile is up-to-date by comparing the full path and
  # last modification time of all the completion functions in fpath.
  local _LANG=$LANG # store current LANG
  LANG=C # force LANG to C
.....

# Compile the completion dumpfile; significant speedup
  if [[ ! ${zdumpfile}.zwc -nt ${zdumpfile} ]] zcompile ${zdumpfile}
  LANG=$_LANG # restore LANG

That is to say: store the current $LANG in _LANG and set LANG to C at the beginning of the code, then restore LANG after the code.

this solved the problem.

but i think you can figure out a better solution.
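
One possible way to tidy up this workaround, sketched under assumptions rather than taken from the module: scope the override with `local` inside a function so there is nothing to restore by hand, and override LC_ALL rather than LANG, since LC_ALL takes precedence over both LANG and the individual LC_* variables. `list_c_sorted` and its directory argument are hypothetical names for illustration. `local` is not POSIX, but zsh, bash, and dash all support it.

```shell
# Hypothetical helper: print the names matching DIR/_* with the glob
# expanded under the C locale.
# Usage: list_c_sorted DIR
list_c_sorted() {
  # The override lasts only until the function returns.
  local LC_ALL=C
  # Expand the glob while the C locale is in effect. The results replace the
  # function's own positional parameters, not the caller's.
  set -- "$1"/_*
  printf '%s\n' "$@"
}
```

Called as `list_c_sorted /home/user/.zim/modules/zsh-completions/src`, this should print the same list whatever locale the caller is in, assuming the glob result is what differed between locales.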

ericbn commented 1 year ago

The \C-@ is equivalent to the $'\0' (null) character:

% xxd .zcompdump.dat
00000000: 352e 3900 2f68 6f6d 652f 7573 6572 2f2e  5.9./home/user/.
00000010: 7a69 6d2f 6d6f 6475 6c65 732f 7a73 682d  zim/modules/zsh-
00000020: 636f 6d70 6c65 7469 6f6e 732f 7372 632f  completions/src/
00000030: 5f61 6665 7700                           _afew.

I haven't set up an ssh machine yet to try to reproduce the issue. Great to know you found a workaround and are very close to the solution. I wonder if the issue is when the file is read, when it's written, something else, or both...
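
The same byte layout as in the hexdump can be reproduced without the real dump file. This is a minimal sketch with a shortened, made-up record (`5.9`, a NUL, `_afew`, a NUL), using `od` instead of `xxd`; `record_hex` is a name chosen here for illustration.

```shell
# Emit two NUL-terminated fields and dump them as hex bytes.
# tr strips the spacing od adds, leaving one continuous hex string.
record_hex=$(printf '5.9\0_afew\0' | od -An -tx1 | tr -d ' \n')
printf '%s\n' "$record_hex"
# prints 352e39005f6166657700
```

The `00` bytes after `352e39` and at the end are the separators that set -x displays as \C-@.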

cattyhouse commented 1 year ago

i've updated the steps to reproduce: https://github.com/zimfw/completion/issues/13#issuecomment-1458051667

cattyhouse commented 1 year ago

So I moved the LANG=C and LANG=en_US.UTF-8 pair around inside the code, and found that the issue is this line:

local -r zcomps=(${^fpath}/^([^_]*|*~|*.zwc)(N))

I've submitted a pull request.

cattyhouse commented 1 year ago

Test:

LC_ALL=C zcomps_C=(${^fpath}/^([^_]*|*~|*.zwc)(N))

LC_ALL=en_US.UTF-8 zcomps_US=(${^fpath}/^([^_]*|*~|*.zwc)(N))

[[ $zcomps_C == $zcomps_US ]]

echo $?

returns 1
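
The thread does not say why this glob differs between locales. One plausible explanation, stated here as an assumption, is ordering: zsh sorts glob results by name, and name ordering follows the locale's collation rules. If the order of zcomps changes, the string compared against the .dat file changes too, and the dump is treated as stale. The sketch below uses the external `sort` command (not zsh globbing) and two made-up file names to show how the C locale orders names by byte value; under a locale such as en_US.UTF-8 the same names may come out in a different order.

```shell
# Two made-up completion file names that differ in case after the underscore.
names='_afew
_Zeta'

# In the C locale, sorting compares raw byte values,
# so 'Z' (0x5a) comes before 'a' (0x61).
sorted_c=$(printf '%s\n' "$names" | LC_ALL=C sort | paste -sd ' ' -)
printf 'C locale order: %s\n' "$sorted_c"
# prints: C locale order: _Zeta _afew

# Same input under whatever locale the current session uses;
# the result depends on that locale's collation rules.
sorted_env=$(printf '%s\n' "$names" | sort | paste -sd ' ' -)
printf 'session locale order: %s\n' "$sorted_env"
```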