adrelanos opened 11 months ago
For Arch updates, I worked around the problem with a shell script that ran `curl -C -` in an infinite loop, so that downloads succeeded eventually. This is not possible with other package managers, as they do not allow providing an arbitrary script to use for downloads.
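A minimal sketch of such a retry loop (the helper name is hypothetical, not the exact script used; it assumes only `curl` is available):

```shell
#!/bin/sh
# retry_download: keep resuming an interrupted transfer with `curl -C -`
# until it completes. Hypothetical helper illustrating the workaround.
retry_download() {
    url="$1"
    out="$2"
    # --continue-at - (same as -C -) makes curl inspect the partial output
    # file and resume from where the previous attempt stopped.
    until curl --fail --silent --show-error --continue-at - --output "$out" "$url"; do
        echo "download interrupted, resuming..." >&2
        sleep 1
    done
}
```

Usage would be along the lines of `retry_download "$pkg_url" "$pkg_file"` for each package the manager wants to fetch.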
I wasn't able to reproduce this on a real (non-Qubes) Debian bookworm yet.
Does that mean that Qubes Debian bookworm is any less real :laughing:?
More seriously, have you tried a Debian bookworm HVM without any Qubes packages installed? This should behave like non-Qubes Debian bookworm.
> For Arch updates, I worked around the problem with a shell script that ran `curl -C -` in an infinite loop, so that downloads succeeded eventually.
Meaning you've been able to reproduce this bug?
> > I wasn't able to reproduce this on a real (non-Qubes) Debian bookworm yet.
>
> Does that mean that Qubes Debian bookworm is any less real 😆?
Ha, indeed. Created
for it.
> > For Arch updates, I worked around the problem with a shell script that ran `curl -C -` in an infinite loop, so that downloads succeeded eventually.
>
> Meaning you've been able to reproduce this bug?
Yes, but only intermittently. Sometimes it works.
> More seriously, have you tried a Debian bookworm HVM without any Qubes packages installed?
Tried now. Not reproducible in HVM.
Reproducible in PVH (Qubes default) but not HVM.
A user in Whonix forums reported this being reproducible also in a Debian 12 (bookworm) KVM VM (without Qubes involved).
In summary:

Reproducible here:

- real Debian 12 KVM
- Qubes Debian 12 based PVH App Qube
- Whonix (Debian 12 based) in VirtualBox (Non-Qubes-Whonix)

Not reproducible here:

- real Debian 12 on hardware
Affected virtualizers are Qubes PVH, (non-Qubes) KVM, (non-Qubes) VirtualBox. Not affected is real hardware (outside of any VMs).
What is the common factor (shared code base) in the affected virtualizers?
Xen PVH and KVM share essentially no code, but they do share some behaviors:
One might blame it on Tor / vanguards, but that seems wrong: their software is functional on real hardware. If it's broken in VMs, it seems there is something wrong with the VMs. It was useful to report against Tor anyhow, because the Tor developers might have insights into how this issue is triggered and might be able to provide workarounds.
I highly doubt that this is a virtualizer problem.
Can you try in a (Podman/LXC/systemd-nspawn/etc) container? I suspect Tor is making assumptions about networking that simply do not hold in virtualized environments. For instance, expecting networking to be handled via DHCP could trigger a problem like this.
I think I got a similar issue in https://openqa.qubes-os.org/tests/86485 (after suspend, if that matters). I don't see any errors in the Tor log, but for vanguards I see:
```
Nov 27 12:30:47 host vanguards[3853]: NOTICE[Mon Nov 27 12:30:47 2023]: Vanguards 0.3.1 connected to Tor 0.4.8.9 using stem 1.8.1
Nov 27 12:30:47 host vanguards[3853]: NOTICE[Mon Nov 27 12:30:47 2023]: Tor needs descriptors: Cannot read /var/lib/tor/cached-microdesc-consensus: [Errno 2] No such file or directory: '/var/lib/tor/cached-microdesc-consensus'. Trying again...
Nov 27 12:30:47 host vanguards[3853]: WARNING[Mon Nov 27 12:30:47 2023]: Tor daemon connection failed: Cannot read /var/lib/tor/cached-microdesc-consensus: [Errno 2] No such file or directory: '/var/lib/tor/cached-microdesc-consensus'. Trying again...
Nov 27 12:30:48 host vanguards[3853]: NOTICE[Mon Nov 27 12:30:48 2023]: Vanguards 0.3.1 connected to Tor 0.4.8.9 using stem 1.8.1
Nov 27 12:30:48 host vanguards[3853]: NOTICE[Mon Nov 27 12:30:48 2023]: Tor needs descriptors: Cannot read /var/lib/tor/cached-microdesc-consensus: [Errno 2] No such file or directory: '/var/lib/tor/cached-microdesc-consensus'. Trying again...
Nov 27 12:30:49 host vanguards[3853]: NOTICE[Mon Nov 27 12:30:49 2023]: Vanguards 0.3.1 connected to Tor 0.4.8.9 using stem 1.8.1
Nov 27 12:30:49 host vanguards[3853]: NOTICE[Mon Nov 27 12:30:49 2023]: Tor needs descriptors: Cannot read /var/lib/tor/cached-microdesc-consensus: [Errno 2] No such file or directory: '/var/lib/tor/cached-microdesc-consensus'. Trying again...
Nov 27 12:30:50 host vanguards[3853]: NOTICE[Mon Nov 27 12:30:50 2023]: Vanguards 0.3.1 connected to Tor 0.4.8.9 using stem 1.8.1
```
Then, after I restarted just `tor@default.service`, I got this:
```
Nov 27 12:45:57 host vanguards[3853]: WARNING[Mon Nov 27 12:45:57 2023]: Tor daemon connection closed. Trying again...
Nov 27 12:45:58 host vanguards[3853]: NOTICE[Mon Nov 27 12:45:58 2023]: Vanguards 0.3.1 connected to Tor 0.4.8.9 using stem 1.8.1
```
And then `systemcheck` is happy. I did not stop nor restart `vanguards`. Maybe it's about service start order?
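If start order is the culprit, one way to test the hypothesis is a systemd drop-in that orders vanguards strictly after tor@default (a sketch; it assumes the unit is named `vanguards.service`):

```ini
# /etc/systemd/system/vanguards.service.d/10-after-tor.conf
# Hypothetical drop-in: start vanguards only after tor@default is up,
# so the control socket and consensus files exist before vanguards connects.
[Unit]
After=tor@default.service
Wants=tor@default.service
```

After creating it, run `sudo systemctl daemon-reload` and reboot (or restart both services) to see whether the reconnect loop still appears.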
Restarting the vanguards service fixes the issue temporarily, but it reappears after some time. Maybe it only works for Tor circuits that exist when vanguards starts, but not for circuits newly created some time after vanguards was started.
Update: A user in the forums reported having reproduced this on hardware (outside of any VMs) too.
Additional reports about reproducibility on hardware would be appreciated.
> Nov 27 12:30:49 host vanguards[3853]: NOTICE[Mon Nov 27 12:30:49 2023]: Tor needs descriptors: Cannot read /var/lib/tor/cached-microdesc-consensus: [Errno 2] No such file or directory: '/var/lib/tor/cached-microdesc-consensus'. Trying again...
This has always been like this.
Qubes OS release
Qubes R4.2
Summary
Downloads over Tor (with vanguards enabled) get interrupted after a few seconds.
This bug was introduced between Tor version `0.4.7.16-1` (from the Debian `bookworm` security repository) and Tor version `0.4.8.9-1~d12.bookworm+1` (from `deb.torproject.org`). I am certain that I could pinpoint it to this version change.

The issue is only reproducible if `vanguards` is installed. The older Tor version `0.4.7.16-1` from the Debian `bookworm` security repository does not have this issue.

Steps to reproduce:

1. Use a Debian `bookworm` Template.
2. Add the `deb.torproject.org` repository.
3. `sudo apt update`
4. `sudo apt install --no-install-recommends vanguards tor`
5. Open `/etc/tor/vanguards.conf` and change `control_socket =` to `control_socket = /run/tor/control` (related ticket).
6. `sudo systemctl enable vanguards` (potential Debian bug: not being enabled by default).
7. `sudo systemctl restart tor@default`
8. `sudo systemctl restart vanguards`
9. `torsocks curl --fail --output /tmp/test.tar.xz https://dist.torproject.org/torbrowser/13.0.5/tor-browser-linux-x86_64-13.0.5.tar.xz`
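For reference, after editing `/etc/tor/vanguards.conf` the relevant excerpt should look like this (the `[Global]` section name follows the shipped default config; treat it as an assumption if your copy differs):

```ini
# /etc/tor/vanguards.conf (excerpt)
[Global]
control_socket = /run/tor/control
```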
What is the current bug behavior?
The connection drops after a few seconds of continued file download.
What is the expected behavior?
No connection drops.
Environment
- Tor `0.4.8.9-1~d12.bookworm+1` from the `deb.torproject.org` `bookworm` repository
- vanguards `0.3.1-2.3` from `packages.debian.org`

Also reproducible in Qubes-Whonix and Non-Qubes-Whonix (Whonix for VirtualBox). I wasn't able to reproduce this on a real (non-Qubes) Debian `bookworm` yet.

Additional information
`sudo systemctl stop vanguards && sudo systemctl restart tor@default` fixes this issue. This shows that the issue only happens if Tor is combined with vanguards.

Tor bug
Is this a Tor bug? Possibly, yes. Reported upstream:
Why am I reporting this against Qubes? Because in the past there was a similar Qubes bug:
Since this issue is reproducible in VMs (Qubes Debian App Qube) and in VirtualBox (Whonix) but not reproducible on real (non-Qubes) Debian, this implies that this might be a Qubes-specific issue.
For issue tracking