Closed antonioetv closed 1 month ago
What are the results of the following commands (you can run them inside a Bash session without ble.sh)?
$ bash /path/to/ble.sh --version # <-- please replace "/path/to/ble.sh" with the path to the script file "ble.sh"
$ declare -p XDG_RUNTIME_DIR
$ id -u
$ ls -la /run/user/1000
$ mkdir -p /run/user/1000/blesh
$ ls -la /run/user/1000
$ echo t >| "/run/user/1000/blesh/$$.test.txt"
$ ls -la "/run/user/1000/blesh/$$".*
/home/antonio>bash .local/share/blesh/ble.sh --version
ble.sh (Bash Line Editor), version 0.4.0-devel4+b6344b3
/home/antonio>declare -p XDG_RUNTIME_DIR
declare -x XDG_RUNTIME_DIR="/run/user/1000/"
/home/antonio>id -u
1000
/home/antonio>ls -la /run/user/1000
total 0
drwx------ 5 antonio antonio 160 Mar 28 19:54 .
drwxr-xr-x 3 root root 60 Mar 28 19:54 ..
srw-rw-rw- 1 antonio antonio 0 Mar 28 19:54 bus
drwx------ 2 antonio antonio 160 Mar 28 19:54 gnupg
drwxr-xr-x 2 antonio antonio 60 Mar 28 19:54 p11-kit
srw-rw-rw- 1 antonio antonio 0 Mar 28 19:54 pipewire-0
srw-rw-rw- 1 antonio antonio 0 Mar 28 19:54 pipewire-0-manager
drwxr-xr-x 5 antonio antonio 140 Mar 28 19:54 systemd
/home/antonio>mkdir -p /run/user/1000/blesh
/home/antonio>ls -la /run/user/1000
total 0
drwx------ 6 antonio antonio 180 Mar 28 19:55 .
drwxr-xr-x 3 root root 60 Mar 28 19:54 ..
drwxr-xr-x 2 antonio antonio 40 Mar 28 19:55 blesh
srw-rw-rw- 1 antonio antonio 0 Mar 28 19:54 bus
drwx------ 2 antonio antonio 160 Mar 28 19:54 gnupg
drwxr-xr-x 2 antonio antonio 60 Mar 28 19:54 p11-kit
srw-rw-rw- 1 antonio antonio 0 Mar 28 19:54 pipewire-0
srw-rw-rw- 1 antonio antonio 0 Mar 28 19:54 pipewire-0-manager
drwxr-xr-x 5 antonio antonio 140 Mar 28 19:54 systemd
/home/antonio>echo t >| "/run/user/1000/blesh/$$.test.txt"
/home/antonio>ls -la "/run/user/1000/blesh/$$".*
-rw-r--r-- 1 antonio antonio 2 Mar 28 19:55 /run/user/1000/blesh/127.test.txt
After this, I uncommented the source command in .bashrc, and it worked for that session.
/home/antonio>nano .bashrc
/home/antonio>bash
/home/antonio>echo "$BASH_VERSION ($MACHTYPE)"
5.2.26(1)-release (x86_64-pc-linux-gnu)
/home/antonio>echo "$BLE_VERSION"
0.4.0-devel4+b6344b3
But once I executed wsl --shutdown and restarted the VM, I got the exact same errors.
I did this twice. It seems the blesh folder within /run/user/1000 gets deleted after every restart for some reason.
Thank you for the results!
I did this twice. It seems the blesh folder within /run/user/1000 gets deleted after every restart for some reason.
Yeah, but it should be fine. ble.sh is supposed to create the blesh directory on its startup if it is not present. The real problem seems to be that ble.sh fails to create the blesh directory on startup and, in addition, fails to detect the failure.
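For readers following along, the "create the directory and detect failure" step can be sketched roughly as below. This is only an illustration under stated assumptions, not ble.sh's actual code: the function name create_user_dir is hypothetical, and the checks (no symlink at the target, owner-only umask, ownership and access tests) are simply one reasonable way to confirm that the directory really exists and is usable after mkdir returns.

```shell
# Hypothetical helper (not ble.sh's real implementation):
# create a private per-user directory and verify the result.
create_user_dir() {
  local dir="$1"
  # Refuse a symbolic link (dangling or not) at the target path.
  if [ -h "$dir" ]; then
    echo "create_user_dir: '$dir' is a symbolic link" >&2
    return 1
  fi
  # Create with owner-only permissions; the subshell keeps the umask local.
  (umask 077 && mkdir -p "$dir") || return 1
  # Do not trust mkdir's exit status alone: confirm the directory exists,
  # is owned by the current user, and is readable, writable, and searchable.
  [ -d "$dir" ] && [ -O "$dir" ] && [ -r "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]
}
```

A caller would then report the failure explicitly, for example `create_user_dir /run/user/1000/blesh || echo "failed to initialize runtime directory" >&2`.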
I see... is there anything else I could do on my end to provide more information that would be helpful?
I'm now trying to look inside the code. I currently have no idea why it fails.
I see... is there anything else I could do on my end to provide more information that would be helpful?
Thanks. Could you try modifying the generated ble.sh in the following way and see what is output on startup? (Note: this modification is only for debugging, so you can revert it after checking the result.)
--- ble.sh~^I2024-03-29 12:47:12.991139405 +0900
+++ ble.sh^I2024-03-29 12:51:04.838510981 +0900
@@ -1198,12 +1198,14 @@
fi
ble/base/.create-user-directory _ble_base_run "$tmp_dir/${USER:-$UID}@$HOSTNAME"
}
+set -x
if ! ble/base/initialize-runtime-directory; then
ble/util/print "ble.sh: failed to initialize \$_ble_base_run." 1>&2
ble/base/clear-version-variables
ble/init/clean-up 2>/dev/null # set -x 対策 #D0930
return 1
fi
+set +x
: >| "$_ble_base_run/$$.load"
function ble/base/clean-up-runtime-directory {
local opts=$1 failglob= noglob=
(Edit: Sorry, I initially pasted the diff of the wrong part (for ble/base/initialize-cache-directory). I have now fixed it to the diff of the part for ble/base/initialize-runtime-directory.)
Okay, so upon modifying the ble.sh file and then executing a bash session without restarting the VM, I get this.
/home/antonio>bash
++ ble/base/initialize-runtime-directory
++ ble/base/initialize-runtime-directory/.xdg
++ local runtime_dir=
++ [[ -n /run/user/1000/ ]]
++ [[ ! -d /run/user/1000/ ]]
++ [[ -O /run/user/1000/ ]]
++ runtime_dir=/run/user/1000/
++ [[ ! -n /run/user/1000/ ]]
++ [[ -r /run/user/1000/ ]]
++ [[ -w /run/user/1000/ ]]
++ [[ -x /run/user/1000/ ]]
++ ble/base/.create-user-directory _ble_base_run /run/user/1000//blesh
++ local var=_ble_base_run dir=/run/user/1000//blesh
++ [[ ! -d /run/user/1000//blesh ]]
++ [[ ! -e /run/user/1000//blesh ]]
++ [[ -h /run/user/1000//blesh ]]
++ [[ -e /run/user/1000//blesh ]]
++ [[ -h /run/user/1000//blesh ]]
++ umask 077
++ ble/bin/mkdir -p /run/user/1000//blesh
++ command mkdir -p /run/user/1000//blesh
++ [[ -O /run/user/1000//blesh ]]
++ builtin eval '_ble_base_run=$dir'
+++ _ble_base_run=/run/user/1000//blesh
++ return 0
++ set +x
/home/antonio>
Once I restart the VM, this is what I get:
I'm actually not that knowledgeable about programming so I hope despite my naivete this is useful anyway.
Thanks. As far as I can see from those results, the directory seems to be created. How about this change in the generated ble.sh? What is the output at the start of the problematic session?
--- ble.sh~^I2024-03-29 12:47:12.991139405 +0900
+++ ble.sh^I2024-03-29 14:40:15.750141003 +0900
@@ -1313,12 +1313,16 @@
ble/base/migrate-cache-directory/.move "$_ble_base_cache/man" "$_ble_base_cache/complete.mandb"
[[ $failglob ]] && shopt -s failglob
}
+ls -la /var/run/1000
if ! ble/base/initialize-cache-directory; then
ble/util/print "ble.sh: failed to initialize \$_ble_base_cache." 1>&2
ble/base/clear-version-variables
ble/init/clean-up 2>/dev/null # set -x 対策 #D0930
return 1
fi
+ls -la /var/run/1000
+(umask 077; ble/bin/mkdir -p /var/run/1000 && [[ -O /var/run/1000 ]]; echo "$?"; ls -la /var/run/1000)
+ls -la /var/run/1000
ble/base/migrate-cache-directory
function ble/base/initialize-state-directory/.xdg {
local state_dir=${XDG_STATE_HOME:-$HOME/.local/state}
Sorry for the delay in responding. Here's the results:
Thank you. Ah, there were typos in my previous reply. Also, the positions of the changes were wrong.
Could you try the following changes? Also, please revert the changes made by https://github.com/akinomyoga/ble.sh/issues/426#issuecomment-2026591860 and https://github.com/akinomyoga/ble.sh/issues/426#issuecomment-2026693326.
--- ble.sh~^I2024-03-29 14:40:15.750141003 +0900
+++ ble.sh^I2024-03-30 09:29:50.763755906 +0900
@@ -1198,12 +1198,16 @@
fi
ble/base/.create-user-directory _ble_base_run "$tmp_dir/${USER:-$UID}@$HOSTNAME"
}
+ls -la /run/user/1000
if ! ble/base/initialize-runtime-directory; then
ble/util/print "ble.sh: failed to initialize \$_ble_base_run." 1>&2
ble/base/clear-version-variables
ble/init/clean-up 2>/dev/null # set -x 対策 #D0930
return 1
fi
+ls -la /run/user/1000
+(umask 077; ble/bin/mkdir -p /run/user/1000 && [[ -O /run/user/1000 ]]; echo "$?"; ls -la /run/user/1000)
+ls -la /run/user/1000
: >| "$_ble_base_run/$$.load"
function ble/base/clean-up-runtime-directory {
local opts=$1 failglob= noglob=
This time, at the end I received no prompt like I usually do, but I couldn't type anything regardless.
Thank you. The directory blesh seems to be correctly created on the startup of ble.sh. This means that some other program or the system removes the directory at some point.
What is the result when you put the following lines at the end of ~/.bashrc? (Please remove the temporary change to ble.sh made by https://github.com/akinomyoga/ble.sh/issues/426#issuecomment-2027844915.)
# end of bashrc
declare -p BASH_SOURCE >/dev/tty
ls -la /run/user/1000 >/dev/tty
Here you go!
@antonioetv Thank you for your patience.
Also thank you for the results! This is very interesting... So far, what we know is:
- Creating /run/user/1000/blesh in a user command makes it work (as you reported in (comment)).
- ble.sh successfully creates /run/user/1000/blesh within its initialization, but that doesn't make it work.
- /run/user/1000/blesh still exists.
Now my suspicion turns to some invisible characters contained in the path.
What would be the result when you put the following lines at the end of ~/.bashrc?
# end of bashrc
ls -la /run/user/1000 | cat -v >/dev/tty
declare -p _ble_base_run | cat -v >/dev/tty
ls -la /run/user/1000/blesh | cat -v >/dev/tty
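As a small illustration of why the output is piped through cat -v: it renders control characters visibly, so a stray character in a path shows up instead of being silently swallowed by the terminal. The path value below is made up for the example (a trailing carriage return, such as a CRLF-edited file might introduce); nothing in this thread confirms that such a character was actually present.

```shell
# Hypothetical value: a path with an invisible trailing carriage return.
path=$(printf '/run/user/1000/\r')

# Printed directly, the carriage return is invisible on most terminals.
printf '%s\n' "$path"

# Through cat -v, the carriage return is rendered as ^M:
printf '%s\n' "$path" | cat -v   # prints /run/user/1000/^M
```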
This is what comes out:
This also happens for me on WSL2 with Ubuntu 22.04.
When I start a new bash session, I first get:
ble.sh: XDG_RUNTIME_DIR='/run/user/1000/' is not a directory.
I can then still type into the prompt.
Then after a number of minutes, I get the same behaviour as this issue.
Thank you for the information!
ble.sh: XDG_RUNTIME_DIR='/run/user/1000/' is not a directory.
In this case, ble.sh is supposed to use another directory (which is not /run/user/1000).
@geoffreyvanwyk What is the result of the following command in a ble.sh session after the above message is shown?
$ echo "$_ble_base_run"
Then after a number of minutes, I get the same behaviour as this issue.
@geoffreyvanwyk What is the precise message? Is it still /run/user/1000/...? Or is it a different path?
I think it is probably a different directory. If that's the case, the problem is not specific to /run/user/1000 but can happen on any path.
What is the result of the following command in a ble.sh session after the above message is shown?
$ echo "$_ble_base_run"
It is:
/tmp/blesh/1000
I have to wait again to get the other precise message, but it looks similar to @antonioetv's output.
Thanks! While waiting for another occurrence of the error, could you also check the result of the following command?
$ df -Th
Filesystem Type Size Used Avail Use% Mounted on
none tmpfs 3.9G 4.0K 3.9G 1% /mnt/wsl
none 9p 451G 291G 161G 65% /usr/lib/wsl/drivers
none tmpfs 3.9G 0 3.9G 0% /usr/lib/modules
none overlay 3.9G 0 3.9G 0% /usr/lib/modules/5.15.146.1-microsoft-standard-WSL2
/dev/sdc ext4 1007G 11G 946G 2% /
none tmpfs 3.9G 104K 3.9G 1% /mnt/wslg
none overlay 3.9G 0 3.9G 0% /usr/lib/wsl/lib
rootfs rootfs 3.9G 1.9M 3.9G 1% /init
none tmpfs 3.9G 892K 3.9G 1% /run
none tmpfs 3.9G 0 3.9G 0% /run/lock
none tmpfs 3.9G 0 3.9G 0% /run/shm
tmpfs tmpfs 4.0M 0 4.0M 0% /sys/fs/cgroup
none overlay 3.9G 76K 3.9G 1% /mnt/wslg/versions.txt
none overlay 3.9G 76K 3.9G 1% /mnt/wslg/doc
C:\ 9p 451G 291G 161G 65% /mnt/c
Q:\ 9p 14G 12G 2.4G 84% /mnt/q
snapfuse fuse.snapfuse 73M 73M 0 100% /snap/core22/607
snapfuse fuse.snapfuse 75M 75M 0 100% /snap/core22/1122
snapfuse fuse.snapfuse 128K 128K 0 100% /snap/bare/5
snapfuse fuse.snapfuse 132M 132M 0 100% /snap/ubuntu-desktop-installer/1286
snapfuse fuse.snapfuse 40M 40M 0 100% /snap/snapd/21184
snapfuse fuse.snapfuse 92M 92M 0 100% /snap/gtk-common-themes/1535
snapfuse fuse.snapfuse 151M 151M 0 100% /snap/ubuntu-desktop-installer/939
snapfuse fuse.snapfuse 54M 54M 0 100% /snap/snapd/18933
Thank you. Even though I guessed it might be related to tmpfs, /tmp doesn't seem to be an independent tmpfs filesystem in your system.
/dev/sdc ext4 1007G 11G 946G 2% /
[...]
none tmpfs 3.9G 892K 3.9G 1% /run
I think these lines are relevant.
So basically the tmpfs file system becomes unreachable at some point.
I have been setting up my WSL distro with Ansible. I think maybe the error of not being able to type in the prompt happened after I replaced the ~/.bashrc with Ansible. I am starting with a clean Ubuntu 22.04 image now.
Thanks for your help @akinomyoga
Thanks.
So basically the tmpfs file system becomes unreachable at some point.
I guessed so initially, but your replies seem to say that it's unrelated to tmpfs. Your reply https://github.com/akinomyoga/ble.sh/issues/426#issuecomment-2047446079 implies that /tmp is related, but another reply https://github.com/akinomyoga/ble.sh/issues/426#issuecomment-2047484576 implies that it is not tmpfs.
I'd still like to get the precise error message if you happen to face the error message. I'm waiting for it.
It looks like it is a bug in WSL https://github.com/microsoft/WSL/issues/9689
The error message I get sometimes when I cannot type into the prompt is:
-bash: /run/user/1000//blesh/259653.stderr: No such file or directory
It is repeated every time I type a character.
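One plausible reading of this symptom (an assumption, not something confirmed in the thread) is that a redirection to a per-process file under the runtime directory is attempted on each keystroke, and every attempt fails once the directory is gone. The sketch below reproduces only that general effect with a temporary directory; the names run_dir and log are hypothetical.

```shell
# Assumed scenario: a per-process log file lives in a runtime
# directory that later disappears.
run_dir=$(mktemp -d)/blesh
mkdir -p "$run_dir"
log="$run_dir/$$.stderr"

# While the directory exists, the redirection can be set up.
echo "first" 2>>"$log" && echo "redirect ok"

# Simulate the directory being removed behind the shell's back.
rm -rf "$run_dir"
rmdir "${run_dir%/*}"

# The redirection is opened before the command runs, so every attempt
# now fails with "No such file or directory".
if ! (echo "second" 2>>"$log") 2>/dev/null; then
  echo "redirect failed: $log is gone"
fi
```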
It looks like it is a bug in WSL microsoft/WSL#9689
Thanks for the information.
However, it's "shadowed" by the bind mount to /mnt/wslg/runtime-dir, despite the latter being owned by another user. As such, the current user's run directory becomes inaccessible:
$ ls -ldn /mnt/wslg/runtime-dir "$XDG_RUNTIME_DIR"
drwx------ 4 1000 1000 120 Feb 24 01:03 /mnt/wslg/runtime-dir
drwx------ 4 1000 1000 120 Feb 24 01:03 /run/user/1234
If, however, the latter mount is removed with sudo umount "$XDG_RUNTIME_DIR", then the "real" directory becomes "visible" and accessible, and apps that require access to e.g. d-bus or other sockets can now succeed:
$ ls -ldn /mnt/wslg/runtime-dir "$XDG_RUNTIME_DIR"
drwx------ 4 1000 1000 120 Feb 24 01:03 /mnt/wslg/runtime-dir
drwx------ 9 1234 1234 280 Feb 24 01:03 /run/user/1234
If we assume that this mounting happened temporarily in the initialization phase of ble.sh, this is consistent with the mysterious behavior @antonioetv has reported. It's also consistent if we assume the mounting happened at some point in your system.
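A quick way to check for the situation described in the quoted report is to test whether the runtime directory is actually owned by, and accessible to, the current user. The helper below is a hypothetical diagnostic written for this thread, not part of ble.sh or WSL; it uses only standard test operators and id.

```shell
# Hypothetical diagnostic helper (not part of ble.sh or WSL).
check_runtime_dir() {
  local dir="${1:-${XDG_RUNTIME_DIR:-}}"
  if [ ! -d "$dir" ]; then
    echo "missing: $dir"
    return 1
  fi
  # -O is true only when the directory is owned by the effective UID,
  # which is what a shadowing mount owned by another user would break.
  if [ ! -O "$dir" ]; then
    echo "not owned by UID $(id -u): $dir"
    return 1
  fi
  # A directory we own can still be unusable if its mode bits deny access.
  if [ ! -r "$dir" ] || [ ! -w "$dir" ] || [ ! -x "$dir" ]; then
    echo "not accessible: $dir"
    return 1
  fi
  echo "ok: $dir"
}
```

Running `check_runtime_dir "$XDG_RUNTIME_DIR"` at the end of ~/.bashrc and again once the error appears would show whether the directory's status changes during the session.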
The error message I get sometimes when I cannot type into the prompt is:
-bash: /run/user/1000//blesh/259653.stderr: No such file or directory
OK. So it's not /tmp/blesh/1000 but /run/user/1000, which is consistent with the above observation.
Maybe I'll have to think about adding a workaround for WSL to avoid using this specific directory /run/user/*. Or maybe this is an issue with UID and/or the XDG_RUNTIME_DIR.
@geoffreyvanwyk What are the results of the following commands?
$ echo "$UID"
$ echo "$EUID"
$ id -u
$ id -un
@geoffreyvanwyk What are the results of the following commands?
$ echo "$UID"
1000
$ echo "$EUID"
1000
$ id -u
1000
$ id -un
werker
@geoffreyvanwyk Thanks. In your case, the user ID is 1000, so the situation seems slightly different from the one in https://github.com/microsoft/WSL/issues/9689, where the user had a different user ID 1234. Anyway, WSL filesystems related to /run/user seem to be broken. I'll consider adding a workaround for WSL.
It looks to me as if the missing /run/user/1000 directory is only an issue on distros I created based on the WSL image I downloaded directly from Ubuntu. I think the WSL distros found in the Microsoft app store might not have this problem. I will check.
Thank you for checking the details. That is an interesting observation. After your report, I have been searching for related issues. There seem to be other reports caused by the problem with /run/user/. They were reported around Sep. 2023 to Nov. 2023. I guess some changes in the WSL system around Sep. 2023 caused the problem.
- mkdir /run/user/1000/: permission denied after WSL 2.0.0 · Issue #10498 · microsoft/WSL
- server_start:164 error in WSL (SOLVED: permissions issue) · Issue #26058 · neovim/neovim
If the problem doesn't arise in the latest version of WSL, possibly only the WSL images based on a problematic version of WSL released in Sep. 2023 have the problem.
@antonioetv @geoffreyvanwyk I have pushed a workaround for WSL in the master branch (commit fb826ab6). If the pushed code correctly detects WSL, ble.sh now falls back to /tmp/blesh/1000 if /tmp is available, or otherwise to <path to ble.sh>/tmp/1000 (where 1000 is the UID of the current user).
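The fallback order described above can be sketched as follows. This is a rough illustration only, not the code from commit fb826ab6: the function names is_wsl and pick_runtime_dir are hypothetical, and the WSL check (the WSL_DISTRO_NAME variable or "microsoft" in /proc/version) is an assumed heuristic that may differ from what ble.sh really does. The second argument stands in for the "<path to ble.sh>" placeholder.

```shell
# Hypothetical WSL check; ble.sh's real detection may differ.
is_wsl() {
  [ -n "${WSL_DISTRO_NAME:-}" ] && return 0
  grep -qi microsoft /proc/version 2>/dev/null
}

# Print the runtime directory to use for the given UID.
# $1: UID, $2: installation directory of ble.sh (assumed parameter).
pick_runtime_dir() {
  local uid="$1" base="$2"
  if ! is_wsl && [ -n "${XDG_RUNTIME_DIR:-}" ] &&
     [ -d "$XDG_RUNTIME_DIR" ] && [ -O "$XDG_RUNTIME_DIR" ]; then
    # Outside WSL, a valid XDG_RUNTIME_DIR is still preferred.
    echo "${XDG_RUNTIME_DIR%/}/blesh"
  elif [ -d /tmp ] && [ -w /tmp ]; then
    # On WSL (or without a usable XDG_RUNTIME_DIR), fall back to /tmp.
    echo "/tmp/blesh/$uid"
  else
    # Last resort: a directory next to ble.sh itself.
    echo "$base/tmp/$uid"
  fi
}
```

For example, `pick_runtime_dir "$(id -u)" ~/.local/share/blesh` would print the directory such a scheme selects on the current system.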
@geoffreyvanwyk If you would like to continue to test the behavior of WSL's /run/user/1000 in different WSL images, you can use a commit before fb826ab6.
Thanks @akinomyoga.
I imported another distribution yesterday based on a Ubuntu 22.04.4 LTS image I backed-up last year in April, then installed ble.sh. I have not had any issues since.
Before I did that, I noticed that PsySH (an alternative PHP REPL) also has issues with the /run/user/1000 directory on the newer Ubuntu 22.04 images (https://cloud-images.ubuntu.com/wsl) when running psysh --color.
I will test the /run/user/1000 behaviour on the Ubuntu 24.04 images.
@akinomyoga Sorry for the delay. I finally got around to testing ble.sh on the latest Ubuntu 24.04 image. I first tried the Devel 0.4.0-devel3 release, but I got the same error. Then I switched to cloning the master branch (--recursive --depth 1), making and installing. The error is gone now. I will see over the next few days.
Thank you!
I have not experienced this error again after more than a week of usage.
Thanks for the information! Probably the problem was fixed in the latest versions of WSL images.
Probably the problem was fixed in the latest versions of WSL images.
No, it is because of the workaround you implemented. I first tested without the workaround, without success.
OK, I thought you had been testing the issue of the WSL image itself by using the commit before my workaround. Thanks for testing ble.sh.
Since ble.sh now has a workaround and there doesn't seem to be any movement in the upstream repository toward solving the root problem, let me close the issue. The upstream issue https://github.com/microsoft/WSL/issues/9689 was automatically closed after some time without any reaction from the developers. A similar issue https://github.com/microsoft/WSL/issues/11542 now seems to be open, along with older ones https://github.com/microsoft/WSL/issues/10473 and https://github.com/microsoft/WSL/issues/10498.
ble version: Latest as of 03/28/2024 (Cannot obtain exact version)
Bash version: 5.2.26(1)-release (x86_64-pc-linux-gnu)
Hey there. I've been using ble.sh without any issues on Arch Linux (through WSL) for a while now, but as of a few days ago, whenever I start the WSL VM, I get this from the console:
After this, I get the usual bash prompt, but I can't type anything at all. It just won't let me. The only way to be able to type into the console again is to go to my .bashrc through the Windows File Explorer and comment out source ~/.local/share/blesh/ble.sh to disable it entirely.
I've reinstalled ble.sh a few times already but the problem seems to persist. Any idea what could be causing it?