ValveSoftware / SteamVR-for-Linux

Issue tracker for the Linux port of SteamVR

[BUG] Half-Life: Alyx - hang on start "Timed out waiting for response from Mongoose" #332

Open pwaller opened 4 years ago

pwaller commented 4 years ago

Describe the bug

When starting Half-Life: Alyx, I see the grey loading scene indefinitely in the headset. On the desktop, I see:

Screenshot from 2020-04-10 19-45-49

Timed out waiting for response from Mongoose. SteamVR needs to be restarted.

To Reproduce

Steps to reproduce the behavior:

  1. Launch SteamVR
  2. Launch Half-Life: Alyx
  3. See error.
  4. GOTO 1. (i.e., the error is repeatable)

Expected behavior

As of a few hours ago, this was working and I was able to play Alyx happily. But now I don't seem to be able to start it.

System Information (please complete the following information):

Sgt-Schultz commented 4 years ago

I had this error and I fixed it by verifying the files for both hla and steamvr

pwaller commented 4 years ago

  1. I tried deleting the compatdata. No luck there.
  2. I tried switching to "Steam linux runtime" in "Force the use of a specific compatibility tool". No luck there.
  3. After switching in step 2, I tried switching back to "Proton 5.0-5" in "Force the use of a specific compatibility tool". Now the game starts. I don't know if this is what fixed it, but it seems possible.

pwaller commented 4 years ago

Quitting Alyx, doing room recalibration, and then re-entering led to exactly the same issue as before. A reboot later, I was able to start Alyx. So it seems to be non-deterministic now. It was repeatable at least several times before, but now it seems I can start it every other time (I have only tried a handful of times so far, will update).

cirk2 commented 4 years ago

Can you check whether your vrwebhelper is crashing in the background? (I can monitor that pretty well with journalctl -f.) That was what caused the Mongoose error for me; see #278 for the Mesa/AMD-specific issue and workaround.

pwaller commented 4 years ago

Error still happening. No apparent crashes in dmesg or journalctl.

It seems that if I just try starting the game repeatedly, it eventually succeeds?

pwaller commented 4 years ago

Hmm. Seems I spoke too soon. It gave me a "Press trigger to start" (which usually it doesn't, because it has crashed before that), but subsequently didn't load in reasonable time, just leaving me in the grey initial area (before the logo).

lorendias commented 4 years ago

journalctl -f yields

      Users in groups 'adm', 'systemd-journal', 'wheel' can see all messages.
      Pass -q to turn off this notice.
-- Logs begin at Fri 2020-04-17 23:27:43 PDT. --
May 01 19:24:38 SteamBox /usr/lib/gdm-x-session[713]: (II) AMDGPU(0): Modeline "1440x900"x0.0   88.75  1440 1488 1520 1600  900 903 909 926 +hsync -vsync (55.5 kHz e)
May 01 19:24:38 SteamBox /usr/lib/gdm-x-session[713]: (II) AMDGPU(0): Modeline "1600x900"x60.0  119.00  1600 1696 1864 2128  900 901 904 932 -hsync +vsync (55.9 kHz e)
May 01 19:24:38 SteamBox /usr/lib/gdm-x-session[713]: (II) AMDGPU(0): Modeline "1680x1050"x0.0  119.00  1680 1728 1760 1840  1050 1053 1059 1080 +hsync -vsync (64.7 kHz e)
May 01 19:24:38 SteamBox /usr/lib/gdm-x-session[713]: (--) AMDGPU(0): HDMI max TMDS frequency 300000KHz
May 01 19:24:39 SteamBox crash_20200501192439_1.dmp[2552]: Uploading dump (out-of-process)
                                                           /tmp/dumps/crash_20200501192439_1.dmp
May 01 19:24:39 SteamBox crash_20200501192439_1.dmp[2552]: Finished uploading minidump (out-of-process): success = yes
May 01 19:24:39 SteamBox crash_20200501192439_1.dmp[2552]: response: CrashID=bp-a4cc7d16-4551-4a91-a127-40bf82200501
May 01 19:24:39 SteamBox crash_20200501192439_1.dmp[2552]: file ''/tmp/dumps/crash_20200501192439_1.dmp'', upload yes: ''CrashID=bp-a4cc7d16-4551-4a91-a127-40bf82200501''
May 01 19:24:39 SteamBox systemd-coredump[2556]: Process 2525 (vrmonitor) of user 1000 dumped core.

                                                 Stack trace of thread 2525:
                                                 #0  0x00007f647534fce5 raise (libc.so.6 + 0x3bce5)
                                                 #1  0x00007f6475339857 abort (libc.so.6 + 0x25857)
                                                 #2  0x00007f64769be938 _ZNK14QMessageLogger5fatalEPKcz (libQt5Core.so.5 + 0x8e938)
                                                 #3  0x00007f64763ad845 _ZN22QGuiApplicationPrivate25createPlatformIntegrationEv (libQt5Gui.so.5 + 0x12b845)
                                                 #4  0x00007f64763adce1 _ZN22QGuiApplicationPrivate21createEventDispatcherEv (libQt5Gui.so.5 + 0x12bce1)
                                                 #5  0x00007f6476bdbea5 _ZN23QCoreApplicationPrivate4initEv (libQt5Core.so.5 + 0x2abea5)
                                                 #6  0x00007f64763b0e80 _ZN22QGuiApplicationPrivate4initEv (libQt5Gui.so.5 + 0x12ee80)
                                                 #7  0x00007f6475d43d2f _ZN19QApplicationPrivate4initEv (libQt5Widgets.so.5 + 0x161d2f)
                                                 #8  0x000000000062bf1c _Z8RealMainiPPc (vrmonitor + 0x22bf1c)
                                                 #9  0x000000000043672c main (vrmonitor + 0x3672c)
                                                 #10 0x00007f647533b023 __libc_start_main (libc.so.6 + 0x27023)
                                                 #11 0x0000000000436d3d _start (vrmonitor + 0x36d3d)
May 01 19:25:30 SteamBox systemd[680]: Started VTE child process 2697 launched by gnome-terminal-server process 915.

crash_20200501192439_1.dmp.txt

Arch Linux, steam-native, AMD Vega 64

Edit: After switching from $steam-native to $steam, validating SteamVR and Half-Life: Alyx, and making sure it was using Proton 5.xx instead of 4.xx, it worked, with only occasional other issues.

Zamundaaa commented 4 years ago

Tried the native version just now, same error right on the first start. Edit: I had mesa_glthread set, removing that made it work.

Termuellinator commented 4 years ago

I got the issue with the native version too, even with mesa_glthread set to false. Modifying SteamVR/bin/vrwebhelper/linux64/vrwebhelper.sh to include "export VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/amd_icd64.json" somewhat worked for me. At first I tried to run Alyx with amdvlk as well: no error, but the game did not finish loading beyond the main menu. Running vrwebhelper with amdvlk and Alyx with radv seems to work (but only on the second try). Edit: it only worked one time, now I'm back to fiddling around... :(

arcriley commented 4 years ago

Screenshot from 2020-07-17 20-34-17

Three months after this bug was posted it remains unassigned and unaddressed.

jalabb commented 4 years ago

I have had this issue as well for a few days. The game ran fine before; I have about 4 hours on it.

    • Validating didn't work
    • Reinstalling SteamVR and Alyx didn't work
    • Switching to Proton didn't work at all, the game crashed immediately (gist)

EDIT: When I try to run Alyx directly without launching SteamVR first, I get a different error message: image

jalabb commented 4 years ago

After the latest SteamVR beta update (1.15.2), Alyx now launches normally again.

IAmV0id commented 4 years ago

I've also had this happen whenever my game crashes while loading a new area @jalabb, although I can still get past it after a few attempts at loading in.

sirhandel commented 4 years ago

I had the same problem: after 18 hours of play it was hanging on the rubble start scene and showing the Mongoose message. I reinstalled Alyx and uninstalled antivirus and VPN, all to no avail, so I contacted Steam support. They requested a system report (open SteamVR, create system report and save) and then came back with the attached suggestions. Verifying integrity of tool files identified 3 unverifiable files which were automatically reacquired and ALYX NOW WORKS.

steamvr

Hope it works for you.

LubosD commented 3 years ago

I too was plagued by this, so I started analyzing what's happening in there. In doing so, I noticed some rejected TCP connections to the "VR whatever backend" going to [::1]:27062, with other successful connections going over IPv4 to 127.0.0.1:27062.

So I had an idea that this was a consequence of first binding to IPv4 and then failing to bind IPv6 because the port is already taken, so I did:

sudo sysctl net.ipv6.bindv6only=1

restarted SteamVR and tada: Half Life started up.

As this bug is rather non-deterministic, I'd be happy if someone else could confirm whether this helps or I just got lucky.

If it really helps, Valve should fix how they do binding. If they rely on this behavior, they should do something like

setsockopt(sock, IPPROTO_IPV6, IPV6_V6ONLY, ....);

to enable it on per-socket level or even better: always try binding on IPv6 first and only then fail over to IPv4.
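
To help confirm or rule out the bind-order theory above, one could check which address families actually have a listener on the port while SteamVR is running. This is only a minimal sketch: it assumes `ss` from iproute2 is installed and that 27062 is also the port in use on your system, and `classify_listeners` is a hypothetical helper name, not anything shipped with SteamVR.

```shell
#!/bin/sh
# Hypothetical helper: reads `ss -ltn` output on stdin and reports which
# address families have a listening socket on the given TCP port.
classify_listeners() {
    awk -v port="$1" '
        # Column 4 of `ss -ltn` is "Local Address:Port".
        $1 == "LISTEN" && $4 ~ (":" port "$") {
            if ($4 ~ /^\[/)      v6 = 1     # e.g. [::1]:27062 or [::]:27062
            else if ($4 ~ /^\*/) any = 1    # wildcard (dual-stack) listener
            else                 v4 = 1     # e.g. 127.0.0.1:27062
        }
        END {
            if (any)            print "wildcard"
            else if (v4 && v6)  print "ipv4+ipv6"
            else if (v4)        print "ipv4-only"
            else if (v6)        print "ipv6-only"
            else                print "none"
        }'
}

# Usage (while SteamVR is running):
#   ss -ltn | classify_listeners 27062
```

An "ipv4-only" result would be consistent with the refused connections to [::1]:27062 described above, but it does not by itself prove that this is the cause of the Mongoose timeout.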

EliteTK commented 3 years ago

I am also having this issue now. The first time I started steamvr it didn't occur but the occurrence rate is high and very rarely can I get Half-Life Alyx to start.

I'm running SteamVR on voidlinux and the last SteamVR version I tried was beta 1.19.2 but this bug has occurred on every version I have tried. I have a Valve Index and a RX 5700XT. I'm using mesa 21.1.5 and only radv is installed.

I have tried the following with no luck:

libclient.so seems to be written in C++, so the result in Ghidra was quite messy, but there was some pattern of calling one function with some kind of message type and then calling another function. The function which posts the error message was doing it twice. The first time it sent a message "show_message" to hlvr/interstitials. Then it sends something else (one of the parameters is "text", but the message itself gets vsnprintf-ed so it's difficult to figure out what is being sent).

I noticed a pattern in vrwebhelper_main logs in .steam/steam/logs. I think the instances where alyx was able to start don't end up with the error: "Unhandled message of type show_message was sent to hlvr/interstitials, but there was nothing to start"
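
For anyone who wants to compare their own logs against this pattern, counting occurrences of that message after a failed and after a successful start is one way to do it. A minimal sketch follows; `count_unhandled_show_message` is a hypothetical helper, and the log file name in the usage line is a guess, since only the "vrwebhelper_main" prefix and the logs directory are known from the above.

```shell
#!/bin/sh
# Hypothetical helper: count how often the "nothing to start" error quoted
# above appears in a given log file. The path is an argument because the
# exact file name under ~/.steam/steam/logs is an assumption.
count_unhandled_show_message() {
    # -F: match a fixed string, -c: print the number of matching lines.
    # grep exits non-zero when the count is 0, so its status is ignored.
    grep -c -F 'Unhandled message of type show_message was sent to hlvr/interstitials' "$1" || true
}

# Usage (file name is a guess; adjust to what your logs directory contains):
#   count_unhandled_show_message ~/.steam/steam/logs/vrwebhelper_main.txt
```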

It would be nice to get some help debugging this. Really it would be great to get more logs of what's happening.

I have a strace of a successful Half-Life Alyx run and a strace of an unsuccessful one. I tried comparing the straces but it's a lot of tedious work.

I'm considering trying on a different distro next. Is Half-Life Alyx relying on something systemd specific by any chance? I know that vrstartup.sh relies on pkexec, but I've patched that out locally on my system by implementing pkexec with doas and configuring doas to allow the exact command without a password. (and it works fine)

Edit: I forgot to mention, this didn't appear to happen via proton when I tested but maybe I just got lucky. That being said, proton performance at the settings that work for linux was just not acceptable. Specifically there was some weird desync between the eyes which just made me see double.

EliteTK commented 3 years ago

I should also note: the game seems to start correctly more often if I let SteamVR run for 15 minutes before I start Half-Life Alyx.

It would be nice to have documentation of how this whole thing goes together. Especially a diagram of what connections are made to what. So I can debug the right things and ensure the system supports the configuration.

Edit: Github wouldn't let me attach the "System Report" so I've uploaded it here: https://the-tk.com/shit/steamvr-system-report

I've also forgotten to mention the CPU I am using in case it may cause problems (what with the high core count and all): Ryzen 9 5950X

MattKercher commented 3 years ago

Valve is such a champion of Linux that they forgot to make their games work on the platform.

sankasan commented 3 years ago

@kisak-valve, is there anything you can do to get some developer support based on all the information gathered in this/these threads?

Wandang commented 3 years ago

I got this answer on reddit concerning this bug:

My friend encountered the same issue on his system yesterday, and found out that downgrading his (Arch) system back to August 20th solved the problem.

Upon further investigation, the culprits turned out to be freetype2 and lib32-freetype2 - downgrading those two to version 2.10.4-1 seems to have fixed the issue for him.

Let me know if this helps!

Manjaro does not recommend downgrading with an ALA (Arch Linux Archive) version, so I am not able to test this. Maybe someone else can give feedback on whether this worked for them.

frostworx commented 3 years ago

I currently don't have the Linux version of Half-Life Alyx installed to test, but maybe the game also has a start script where you can expand LD_LIBRARY_PATH, similar to here (another VR issue, where using an older freetype-2.10.4 is a valid workaround).

Wandang commented 3 years ago

@frostworx Where would I get the required .so files? I downloaded freetype-2.10.4.tar.gz from the original freetype project and ran make inside of it. I got several files under build/unix, but no .so files (last time I wrote makefiles and so forth is >10 years ago)

Edit: Found them lying in the objs/.libs/ folder

Unfortunately this didn't work:

Symlinks

change in vrwebhelper.sh:

export LD_LIBRARY_PATH="${DIR}:${STEAM_RUNTIME_HEAVY}${LD_LIBRARY_PATH+:$LD_LIBRARY_PATH}"

This results in the same outcome (Mongoose error), and the settings menu from the vrcontrols (burger menu) still doesn't work. Maybe I did something wrong?

What did change, though, is that I got a Steam overlay informing me about my controllers.

frostworx commented 3 years ago

@Wandang, I simply used the package from /var/cache/pacman/pkg/freetype2-2.10.4-1-x86_64.pkg.tar.zst; maybe you have the file as well when using Manjaro. The two symlinks need to be valid and point to the shared object, of course. Also, I meant that you might have to edit a possibly existing Half-Life Alyx start script, not vrwebhelper.sh, to fix Half-Life, but I can't tell if the game even has or uses a start script, as I currently have the Windows version installed. Good luck!

Wandang commented 3 years ago

I found the hlvr.sh script which seems to be the correct startup script (path is steamapps/common/Half-Life Alyx/game/):

GAMEROOT=$(cd "${0%/*}" && echo $PWD)
...
elif [ "$UNAME" == "Linux" ]; then
   # prepend our lib path to LD_LIBRARY_PATH
   export LD_LIBRARY_PATH="${GAMEROOT}"/bin/linuxsteamrt64:$LD_LIBRARY_PATH
   USE_STEAM_RUNTIME=1
fi

linuxsteamrt64 can be found in game/bin/

The important part is shown. I created symlinks and copied the so.6.17.4 file into linuxsteamrt64, like we did with the vrwebhelper folder. I am still trying to figure out the syntax to change to the correct path.

Thanks for helping so far frostworx!

sankasan commented 3 years ago

Thanks for sharing this @Wandang and @frostworx!

After copying the older freetype2 version (including symlinks) and modifying the script as described in the post frostworx shared, as well as adding the same freetype2 version (including symlinks) to ~/.local/share/Steam/steamapps/common/Half-Life Alyx/game/bin/linuxsteamrt64/, I've been able to launch the game successfully for the first time in a long while.

I'm not sure if both are needed but it works now so I'm not touching this again ;)

You can find the older package of freetype2 for arch here: https://archive.archlinux.org/packages/path/freetype2-2.10.4-1-x86_64.pkg.tar.zst

sankasan commented 3 years ago

Although I was able to launch the game a couple of times, the fix is clearly not 100% reliable. The first few times it worked; now 3 startups in a row have failed.

EliteTK commented 3 years ago

This doesn’t surprise me. I have this issue intermittently on voidlinux and it was definitely when voidlinux was still packaging the “working” freetype version.

Please try my solution of leaving steamvr running for 15 minutes before trying to launch half life alyx.

Whatever this issue is, the environment is probably affecting it but these supposed fixes likely just change the frequency of occurrence. If I had to put money on it it would be some kind of race.

When I spoke with kisak he mentioned that he saw the issue only once and that people at Valve were never able to recreate it.

Wandang commented 3 years ago

I can confirm that using the archive version that @sankasan linked the game works again! I just ended a 2 hour session without problems. I will report back if the issue reappears over the coming days

Edit: So far I had 2 more sessions without this issue. Hopefully this can be fixed with the newest font as well

podiki commented 3 years ago

Thanks for the workaround all! I can confirm it is the freetype version that is causing the problem. It also seems a bit finicky, as I thought it wasn't working, but doing it both for vrwebhelper and then adding the library to Alyx's folder noted above (without changing the start script) did work. But not the first time; not sure if there is some other element, as others have said (like letting SteamVR run a bit). I'll see if I can find out more, but really glad to be able to play again.

(This is on Flatpak, but I followed what others did. Presumably I could downgrade the freedesktop runtime or overwrite freetype in there, but I haven't looked into that much yet.)

Beanslinger2 commented 2 years ago

I don't understand how to do any of this but I'm still experiencing the error. How do I actually do the workaround?

Wandang commented 2 years ago

@Beanslinger2

  1. DL the old working version of freetype here: https://archive.archlinux.org/packages/path/freetype2-2.10.4-1-x86_64.pkg.tar.zst
  2. Extract the package and find the libfreetype.so inside the objs folder
  3. Copy libfreetype.so.6.17.4 into the steam folder at ~/.steam/steam/steamapps/common/Half-Life Alyx/game/bin/linuxsteamrt64
  4. Create symlinks for the other 2 libfreetype files with these commands (you need to replace /path/to/original with your path to the objs folder from step 2; the quotes are needed because the folder name contains a space):
    • ln -s /path/to/original/libfreetype.so ~/.steam/steam/steamapps/common/"Half-Life Alyx"/game/bin/linuxsteamrt64/libfreetype.so
    • ln -s /path/to/original/libfreetype.so.6 ~/.steam/steam/steamapps/common/"Half-Life Alyx"/game/bin/linuxsteamrt64/libfreetype.so.6

After this restart steam and try to launch HL as usual.

This should be enough for HL:Alyx to work. If you want your SteamVR settings to work again we need to do step 3 and 4 again for the vrwebhelper folder:

  1. Copy libfreetype.so.6.17.4 into the steam folder at ~/.steam/steam/steamapps/common/SteamVR/bin/vrwebhelper/linux64
  2. Create 2 symlinks:
    • ln -s /path/to/original/libfreetype.so ~/.steam/steam/steamapps/common/SteamVR/bin/vrwebhelper/linux64/libfreetype.so
    • ln -s /path/to/original/libfreetype.so.6 ~/.steam/steam/steamapps/common/SteamVR/bin/vrwebhelper/linux64/libfreetype.so.6
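
The copy-and-symlink steps above could also be scripted so that the same procedure can be repeated for both folders. This is a rough sketch, not a tested fix: `install_freetype_override` is a hypothetical helper name, and unlike the steps above it points the symlinks at the copied file inside the target folder rather than at the extracted package.

```shell
#!/bin/sh
# Hypothetical helper: copy the old libfreetype into a target directory and
# create the two symlinks next to it. All paths are quoted so that folders
# containing spaces ("Half-Life Alyx") work.
#   $1 = directory containing libfreetype.so.6.17.4 (from the extracted package)
#   $2 = target directory (linuxsteamrt64 or vrwebhelper/linux64)
# The body runs in a subshell, so variables and `exit` stay local to the call.
install_freetype_override() (
    src_dir=$1
    dest_dir=$2
    lib=libfreetype.so.6.17.4

    [ -f "$src_dir/$lib" ] || { echo "missing $src_dir/$lib" >&2; exit 1; }
    [ -d "$dest_dir" ]     || { echo "missing $dest_dir" >&2; exit 1; }

    cp -- "$src_dir/$lib" "$dest_dir/$lib" || exit 1
    # Relative links point at the copy in the same directory, so they stay
    # valid even if the extracted package is deleted later.
    ln -sf -- "$lib" "$dest_dir/libfreetype.so.6" || exit 1
    ln -sf -- "$lib" "$dest_dir/libfreetype.so"
)

# Usage:
#   install_freetype_override /path/to/original \
#       "$HOME/.steam/steam/steamapps/common/Half-Life Alyx/game/bin/linuxsteamrt64"
#   install_freetype_override /path/to/original \
#       "$HOME/.steam/steam/steamapps/common/SteamVR/bin/vrwebhelper/linux64"
```

To undo it, delete the three libfreetype files from the target folder again.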

Beanslinger2 commented 2 years ago

@Wandang Thank you so much!

mitaka8 commented 2 years ago

For me, replacing freetype.so with v2.10 didn't help with HL:A. But I forced HL:A to launch using proton and now the game is running fine for me. image

Downgrading freetype did fix the dashboard not showing up.

pwaller commented 2 years ago

Friendly 2-year ping for Valve (@kisak-valve) - I'm still encountering this, now on Ubuntu 20.04 with a Valve Index. I've put a fair amount of time into trying to get this to work over the course of years with little success. It once was working but no more. Would love to finish the game some day.

podiki commented 2 years ago

Forcing the use of the Linux runtime compatibility tool works for me.

pwaller commented 2 years ago

Sadly not for me. I don't get the mongoose error, but it does become unresponsive and the shell offers to close the window.

podiki commented 2 years ago

If you'd like to look at a lot of output messages, you could try running Steam with CAPSULE_DEBUG=tool,search LIBGL_DEBUG=verbose G_MESSAGES_DEBUG=1 PRESSURE_VESSEL_VERBOSE=1 STEAM_LINUX_RUNTIME_VERBOSE=1 steam (or variations of those parameters) to get a ton of output of what is happening. Probably a lot, or maybe all, of it is not helpful, but maybe you'll see a useful message.
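
Since that produces a lot of output, it may be easier to set the variables only for the one command and send everything to a file. A small sketch under that assumption; `run_with_debug_env` is a hypothetical wrapper name and the log file name is arbitrary.

```shell
#!/bin/sh
# Hypothetical wrapper: run any command with the debug variables listed above
# set only for that command, so the rest of the shell session is unaffected.
run_with_debug_env() {
    env CAPSULE_DEBUG=tool,search \
        LIBGL_DEBUG=verbose \
        G_MESSAGES_DEBUG=1 \
        PRESSURE_VESSEL_VERBOSE=1 \
        STEAM_LINUX_RUNTIME_VERBOSE=1 \
        "$@"
}

# Usage: capture stdout and stderr into a file for later searching.
#   run_with_debug_env steam > steam-debug.log 2>&1
```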

kisak-valve commented 2 years ago

If Steam / SteamVR is in the process of loading up the left/right hand indicator and battery level to be rendered on the Vive wand models, and Half-Life: Alyx is started, then something seems to go wrong and the battery indicators do not render. This might be connected to Half-Life: Alyx throwing the mongoose error message. This is persistent until SteamVR and the Steam client are restarted.

DomiStyle commented 2 years ago

@kisak-valve Is this related to the icons not loading in the SteamVR menu on some startups? I never checked the battery indicators but the issue with HL always happens when the icons in my overlay are not loaded correctly.

Desktop view will also not work in this state.

Usually a SteamVR and Steam restart solves it as you mentioned, sometimes it takes a few tries.

I have had this issue on Ubuntu 20.04, Ubuntu 22.04, Fedora 35 and Fedora 36 now. I also had the Mongoose message on Windows a few times but probably due to some other issue.

Wandang commented 2 years ago

I have been unable to play for more than 6 months now and tried several times. Today I had an idea.

Whenever SteamVR crashes (and it does quite a lot on Linux) all addons get disabled. I normally dismissed this. But today I realized that the mongoose error comes in conjunction with a gamepad symbol. So my guess is that the gamepad configuration data cannot be loaded.

So I restarted SteamVR, this time re-enabled the addon for gamepad support (the default one) and restarted SteamVR again. After that I was able to start HL:Alyx immediately and it works (I am in the main menu, which is normally all that is needed).

So guys, please try to reenable gamepad support, restart steamvr and then start HL

TwoD commented 2 years ago

I just started playing yesterday and immediately got this "Mongoose timeout" error. First I just launched with everything at the default state (Steam Beta channel, no forced compat mode, no switches). It started, but I was stuck in the grey void rubble with no "Loading..." text. Added -novid -safe_mode start options, which got it to the main in-game menu at the Citadel base. Starting a new game always gave the Mongoose timeout.

Verified both HL:A and Steam VR multiple times, no problems detected.

Tried swapping to Freetype2 2.10 but didn't notice any difference.

Forced HL:A to use Proton Experimental. Can no longer launch it directly from Steam but it works from within Steam VR, and the Mongoose timeout doesn't happen every time anymore. If I don't get the "Loading..." text immediately I have to check the desktop for the error. It hangs while playing every now and then but seems otherwise stable. Level transitions are risky and tend to hang but thankfully it autosaves first. Sometimes it blanks out the HMD completely while the game is still running and I can still see it moving on the desktop as I move the "dead" HMD, but not otherwise interact with it. Sometimes it's just the Mongoose error happening between level transitions.

I guess they use MongoDB for the save files?

Didn't do the gamepad support Wandang mentioned. Don't use a gamepad and I never enabled any addons.

Btw, don't force Steam VR to use Proton Experimental; it can no longer launch. I guess the native version is required there since it asks for root again when reverting back.

Update 2022-09-14: Had another go at the game today. Unable to load my save due to the Mongoose timeout. Restarted several times, same problem each time. Killing vrwebhelper only stopped the VR overlay menu from working so I couldn't exit the game from the HMD anymore. Stopped and started SteamVR, then it worked until a freeze at a level transition. Was able to restart and play until a few transitions later when it locked up the entire machine. Rebooted, able to get back in and play past a few transitions until one more turned the HMD completely black and reverted audio back to the main speakers. Again the game was still running and I could move the HMD and see my frozen hands and toggle the desktop UI on/off with Esc. SteamVR said it had encountered an error and needed to restart, so I called it a night.

EliteTK commented 2 years ago

@TwoD mongoose is an embedded HTTP server and IIRC I found good evidence to support that this was what the error was referring to. Have you tried my solution of just leaving steamvr running for a few minutes before starting hlvr?

elwerene commented 2 years ago

@LubosD It helps to disable ipv6 on my side:


sysctl -w net.ipv6.conf.all.disable_ipv6=1
sysctl -w net.ipv6.conf.default.disable_ipv6=1

LubosD commented 2 years ago

@LubosD It helps to disable ipv6 on my side:

sysctl -w net.ipv6.conf.all.disable_ipv6=1
sysctl -w net.ipv6.conf.default.disable_ipv6=1

I use IPv6 very actively, so this is a no go for me.

elwerene commented 2 years ago

@LubosD Same for me, but I don't need ipv6 while playing Alyx, so it's a workaround which helps playing the game. You can reactivate afterwards with the reverse commands:

sysctl -w net.ipv6.conf.all.disable_ipv6=0
sysctl -w net.ipv6.conf.default.disable_ipv6=0

It would be better for valve to just fix the bug :)
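
The disable and re-enable commands above could be wrapped so that IPv6 is switched back on automatically once the wrapped command exits. This is only a sketch: `with_ipv6_disabled` is a hypothetical name, it assumes `sudo` is available for the sysctl writes, and it always re-enables IPv6 at the end, so it assumes IPv6 was enabled to begin with.

```shell
#!/bin/sh
# Hypothetical wrapper: disable IPv6, run the given command as the current
# user, then re-enable IPv6 whether or not the command succeeded.
with_ipv6_disabled() {
    sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1 &&
    sudo sysctl -w net.ipv6.conf.default.disable_ipv6=1 &&
    "$@"
    ipv6_wrapper_status=$?
    # Always restore, even if one of the steps above failed.
    sudo sysctl -w net.ipv6.conf.all.disable_ipv6=0
    sudo sysctl -w net.ipv6.conf.default.disable_ipv6=0
    return "$ipv6_wrapper_status"
}

# Usage: IPv6 stays off until Steam exits.
#   with_ipv6_disabled steam
```

The restore step does not run if the shell itself is killed, in which case the two re-enable commands above still need to be run by hand.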

marcinsdance commented 1 year ago

I just encountered the same issue. So sad :(

LiamDawe commented 1 year ago

Just confirming disabling ipv6 also fixes it for me. Very annoying.

dsalt commented 1 year ago

Disabling IPv6 is not an option here as I usually have active IPv6 connections.

arcriley commented 1 year ago

C'mon Valve, it's almost 2023. IPv6 should be your internal default; this should have been discovered internally and fixed long before release, not discovered by the public and showing up first as a GitHub ticket.

Please get your act together.

EliteTK commented 1 year ago

While I agree that this is incredibly frustrating, it is extremely clear at this point that this issue is not solely related to IPv6 being enabled; that is only one of a number of potential causes, which seem to vary from setup to setup. I think the bigger issue here is the lack of proper error reporting which could be used to further diagnose the issue.

-- Tomasz Kramkowski