Yellow-Dog-Man / Resonite-Issues

Issue repository for Resonite.
https://resonite.com

Resonite closes after varying timeframe in VR or Desktop Mode with VR connected #3064

Open CraftingCreator opened 1 month ago

CraftingCreator commented 1 month ago

Describe the bug?

Resonite closes unexpectedly after a variable time frame if Monado 0.1.0-56d500a (installed via Envision 0.1.0-56d500a) is running and connected to Resonite. This has happened between 1 and 15 minutes after the world has seemingly fully loaded. The logs usually cut off mid-word.

To Reproduce

1. Install Resonite via Steam
2. Turn on Steam Compatibility Mode (Proton-GE, Experimental and 9.X acted similarly)
3. Start Envision and Monado
4. Start Resonite with the first Selection
5. Let Resonite load
6. Wait/Press F8 and Wait

Expected behavior

Use Resonite with Monado without crashes or unexpected behavior caused by the choice of hardware and software in use.

Screenshots

No response

Resonite Version Number

Beta 2024.10.8.1349

What Platforms does this occur on?

Linux

What headset if any do you use?

Valve Index, Monado+Envisionsteam

Log Files

CRAFTING-MS7C56 - 2024.10.8.1349 - 2024-10-09 04_48_33.log

Additional Context

No response

Reporters

Stimmchen, Crafting

Frooxius commented 1 month ago

What leads you to believe this is a Resonite bug and not something with Monado/Proton/Envisionsteam?

GrayBoltWolf commented 1 month ago

> What leads you to believe this is a Resonite bug and not something with Monado/Proton/Envisionsteam?

Because there are other issues reporting the same thing for a user on SteamVR: #2838. I can sit in VRChat for 6 hours - no issue. Resonite? I'll get 9 crashes in 2 hours. I use Monado here as well, same issue as OP. Monado is still running, no issues at all with it; all I have to do is click "Play" on Steam to start Resonite back up again.


Frooxius commented 1 month ago

If this is the same issue, we should probably merge them.

However, that alone isn't enough to indicate that this can be an issue on Resonite's side - we do not have any specific integration or interactions with these runtimes in particular. Resonite could just be triggering a bug in Monado/Proton/Envisionsteam that VRChat doesn't - but that still doesn't mean that the bug is on our side.

E.g. consider Proton itself - there are a number of games that crash with it, because it doesn't support them yet. That doesn't mean the issue is on the side of the game, just because other games work flawlessly.

I honestly don't even know where to start investigating this, the logs just end abruptly. Are there any Player.log files? Any information about how exactly this crashes? Anything from Monado/Proton/Envisionsteam?

Right now I have no leads, so any help here would be appreciated.

GrayBoltWolf commented 1 month ago

I have attached my logs in #2838, which includes player.log https://github.com/Yellow-Dog-Man/Resonite-Issues/issues/2838#issuecomment-2339497048

Unfortunately Resonite crashes so hard that all logs terminate with no useful information - not even a crash log. Monado logs simply show the client disconnected from the OpenXR runtime. The comparison to VRChat seems somewhat appropriate as it is also a Unity renderer, and moreso on top of that even has anticheat which works flawlessly. I know it's an unflattering comparison but it's the best I've got for what I would say is the most similar title.

The Linux build of Resonite was cancelled, and users indicated to run it through Proton and report bugs, which is where we are now.

OrionMoonclaw commented 1 month ago

I'll add that from the side of opencomposite, it looks like Resonite just quits, sadly it is indeed the case that both Player.log and the regular log end abruptly, there are also no coredumps collected by systemd

I've been trying to collect some more information, including from proton, but sadly the game decided to be annoying and not crash for days, despite spending hours in public sessions

Frooxius commented 1 month ago

I checked the logs, but I don't see anything obvious in those either. They do just end abruptly, which makes this difficult to diagnose.

I don't think it's an appropriate comparison - the Unity versions differ and the code and libraries we use differ vastly too - there's a lot of stuff that's drastically different between the two and any of it can potentially be causing an incompatibility/issue in the compatibility layers and triggering the crash.

In order to solve this, we need to isolate where the issue is occurring and right now it could be in any of these components.

Given that you're running the Windows version, which doesn't experience the same crashes, I do not think that it's something on Resonite's side - otherwise we'd be experiencing the same crashes on Windows also.

Likely it's some interaction of some component between Resonite and one of these systems - but I have no idea which.

We'd need more information and see if there's any logs/captures that can be gotten from the crashes or if there's some way to narrow down the issue.

GrayBoltWolf commented 1 month ago

Well, there are a lot of other crash threads on Windows as well #2890 #2845

Is there any sort of additional debugging that can be turned on in Resonite? Like a launch command or a beta build with more verbose logging enabled? I'm happy to run whatever to get you more info.

Frooxius commented 1 month ago

@GrayBoltWolf What makes you think those are the same crashes with same underlying issues?

They don't seem to be following the same kinds of patterns as these. If there's common roots that can help, but just because there are other crashes doesn't mean they're related.

GrayBoltWolf commented 1 month ago

> @GrayBoltWolf What makes you think those are the same crashes with same underlying issues?

None of them are resolved so how would we even know if they're not related? Obviously speculation on my part but I think the statement "Given that you're running the Windows version, which doesn't experience the same crashes" is somewhat untrue, the Windows build running natively absolutely does crash sometimes, and there are reports stating so.

Frooxius commented 1 month ago

@GrayBoltWolf Unless we have indication that they are related, we have no reason to assume they are related - we shouldn't be jumping to that conclusion.

If you have information on that - that can help, because we can isolate information and narrow down the problem, but speculation won't help there.

I am not saying that crashes on Windows do not exist at all, just that I haven't seen ones that would follow similar pattern to these.

GrayBoltWolf commented 1 month ago

How can we get you more data @Frooxius ? Is there any more verbose logging that can be enabled on Resonite?

Frooxius commented 1 month ago

Also, Resonite itself has a verbose startup logging command, but that wouldn't help here. I do not think there's anything we could add to help with more verbose logging - the logs already end abruptly, so the crash logging code likely wouldn't run at all. Typically the logging from Resonite only works during "clean" (managed) crashes - the hard ones are typically handled by Unity's crash handler, but even that doesn't seem to be invoked here.

I'm not super familiar with Monado, Envision and Proton and their internals. If they have logging & diagnostic tools, those could help.

rayojarr commented 1 month ago

I have also been trying the "Lognt" mod and so far I haven't crashed in 2hrs. So there "could" be something regarding logs causing crashing.

OrionMoonclaw commented 1 month ago

> I have also been trying the "Lognt" mod and so far I haven't crashed in 2hrs. So there "could" be something regarding logs causing crashing.

There might be something to it, I've been running with LogAutoFlush while trying to unsuccessfully reproduce this, I think if I can't get it to crash by the end of the week I'll remove the mod and if that makes it crash it's probably that

Possible something is influenced by timing here
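As general background on why a log can simply stop when a process dies hard, here is a minimal Python sketch. It is only an illustration of buffered file writes, not a description of how Resonite, Unity or the LogAutoFlush mod actually write their logs; the child script, file name and line format are made up for the demo. A child process writes 1000 lines and then exits without any cleanup, once with a flush after every line and once without.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical child program: writes 1000 log lines, then dies without cleanup,
# roughly like a process that is terminated from outside.
CHILD = r"""
import os, sys
path, flush_each_line = sys.argv[1], sys.argv[2] == "1"
log = open(path, "w", buffering=8192)  # buffered file object
for i in range(1000):
    log.write(f"line {i:04d}: some log message\n")
    if flush_each_line:
        log.flush()  # hand the data to the OS right away
os._exit(1)  # hard exit: data still sitting in the buffer is never written
"""


def run_child(flush_each_line: bool) -> str:
    """Run the child once and return whatever ended up in its log file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.log")
        flag = "1" if flush_each_line else "0"
        subprocess.run([sys.executable, "-c", CHILD, path, flag])
        with open(path) as f:
            return f.read()
```

Comparing `len(run_child(False))` with `len(run_child(True))` shows that the unflushed run loses the tail of the log. If something similar happens here, an auto-flush mod would change what survives in the log, which is a separate question from whether it changes the crash itself.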

Frooxius commented 1 month ago

It's possible, but also based on the comment by @OrionMoonclaw above, it's not crashing for days. Do they also use the same mod?

If they have the logs on, then it's unlikely that they are the cause of the crash.

Any patterns or relations could help isolate things and find areas to focus on, but we also need to be careful and make sure they are actually real patterns and not just random.

rayojarr commented 1 month ago

The problem is it's not easy to reproduce it, since it closes at random. And @OrionMoonclaw is using LogAutoFlush while I'm using lognt

Frooxius commented 1 month ago

Yeah, that makes these problems very difficult to diagnose unfortunately. We need some kind of clue or lead.

rayojarr commented 1 month ago

I just checked, the mod's default config allows logs to be written, so the mod doesn't do anything by default at least.

rayojarr commented 1 month ago

Just crashed so the mod didn't seem to help. Same as before, logs just get interrupted.

CraftingCreator commented 1 month ago

I don't think it's a Monado or Envision issue, as they themselves do not crash, and Resonite just stops working. I am making an assumption here that Resonite doesn't handle something given by Envision/Monado/SteamVR well. As we were directed to use the Proton build and report issues we find, I believe that this is at least something to keep an eye on for information with other crashes. While it is difficult to diagnose, the fact that Resonite itself just stops as if Task Manager killed it seems to indicate that something is inherently unstable/incompatible with the current Linux/Proton setup. And since we were directed to use that setup, and this seems to happen more with Resonite than with other applications, I believe it is worth considering that there is an incompatibility to keep a watch for. I have actually seen one other issue like it recently: Minecraft with the Vivecraft mod would just stop when trying to use OpenVR (the logs stop too, like here), but would work when set up to use OpenXR (a friend modified it to do so). Might that be a possible starting point?

I'm no coder or professional tester, just a Linux user who is desperate to play the only game that does what Resonite does.

Frooxius commented 1 month ago

I don't think that necessarily means it cannot be a thing with them. It could similarly be that Envision/Monado trip up something wrong that ends up killing the process. We don't really know at this point and I don't want to eliminate any possibilities.

The main thing here is that we don't really have any integration with Envision/Monado specifically, so I don't even know where I'd begin to investigate issue on our end - we don't have any code that's interacting with these components specifically.

Typically Resonite would run a crash handler if it crashes on its end, but when it just ends?

I'm not saying that there aren't things we could do to fix this - either by changing some API/pattern we're using on our end to not trip an issue or by reporting a bug to the corresponding project (Proton/Envision/Monado...).

But first, we have to narrow down what it actually is somehow.

My main thinking is that Resonite itself is mostly C# code, which tends to not hard crash by itself, since it's a managed environment. Typically the hard crashes tend to be in the native code that can end up corrupting memory or running invalid instructions - some of that is the SteamVR SDK, some of that is native libraries we use (e.g. FreeImage, Opus, msdfgen...), some of that is Unity's native code.

Since this seems to only happen when VR is active/connected (at least from my current understanding of this issue), my suspicion is that it's something with SteamVR SDK and/or the other side of that (whatever is responsible for running VR) - since that's the component that is not active while using just the desktop.

I did update the SteamVR SDK on latest prerelease, so maybe that will help?

coolymike commented 1 month ago

Quick summary of programs:


When using an OpenVR application under this stack, it only interacts with Proton, Monado/WiVRN, and OpenComposite. For OpenXR, it's the same but without OpenComposite. Resonite (being legacy Unity) uses OpenVR, and interacts with both OpenComposite and through that, the OpenXR runtime (Monado/WiVRN).

> We don't really know at this point and I don't want to eliminate any possibilities.

At this point we don't know for sure, as soon as I have time I will look into trying to reproduce this on my setup. However, in Linux/FOSS VR communities, Resonite is a notorious application that does things slightly differently than everything else, relying on a ton of quirks from SteamVR to function properly. For example, hands are still not completely accurate in Resonite under Monado/OpenComposite, despite working fine in many other games.

> The main thing here is that we don't really have any integration with Envision/Monado specifically

The OpenVR component of Resonite ("SteamVR SDK") is responsible for communicating with the OpenVR runtime. Although Valve decided to call it the "SteamVR SDK", the OpenVR component is not enforcing SteamVR as a runtime, and can be used under OpenComposite+Monado. This works successfully under many other games.

> I did update the SteamVR SDK on latest prerelease, so maybe that will help?

As soon as I have time, I will try to check what impact that update had for Monado/OpenComposite.


Due to the lack of logging capability from Resonite itself with any impactful crash (like these), an external debugger is often required. I personally use the Linux native build, which on some builds runs into a Mono GC crash very frequently (which behaves extremely similarly to this issue). It seems to be a "will this build crash" roulette, and works just fine on other updates with seemingly no changes.

This is also where debug symbols (iirc, .pdb files on Windows) come in really useful. Instead of just seeing a memory address and assembly instructions, debug symbols allow for troubleshooting using function names.

OrionMoonclaw commented 1 month ago

> However, in Linux/FOSS VR communities, Resonite is a notorious application that does things slightly differently than everything else, relying on a ton of quirks from SteamVR to function properly. For example, hands are still not completely accurate in Resonite under Monado/OpenComposite, despite working fine in many other games.

This isn't really true, a lot of games depend on the hand skeleton, including VRC, HL:A, and The Lab. The only thing there that makes Resonite special is the use of model space (only other game I'm aware of is HL:A), but we have all the parts in OC working correctly now

GrayBoltWolf commented 1 month ago

I'm going to run proton debugging tonight/tomorrow and see if there's anything useful in the logs.

But yeah just to be clear, I'm also not using SteamVR either - as far as I know it's not involved with this OpenXR runtime at all other than the lighthouse tracking database. If Resonite expects SteamVR to be present, could be an issue.

Frooxius commented 1 month ago

@coolymike Thank you for the detailed breakdown, this helps quite a bit!

> Resonite is a notorious application that does things slightly differently than everything else, relying on a ton of quirks from SteamVR to function properly. For example, hands are still not completely accurate in Resonite under Monado/OpenComposite, despite working fine in many other games.

This part is really odd to me. Like for the SteamVR integration itself, we are just using the standard SDK as it comes from Valve. I don't know what would be different on that?

Like I can't think of any quirks that we'd be relying on. Can you be more specific on those?

For the hands - I don't think that has anything to do with SteamVR - that's just likely how they're setup and translated to an avatar. I'd need details here, I don't know what it does on Linux, but we are just copying and translating the skeletal data. I don't think there's anything particularly unusual about that. In either case, I don't see this part of the code causing any crashes, it's just a bunch of math, but it doesn't do anything different with the native API itself.

> But yeah just to be clear, I'm also not using SteamVR either - as far as I know it's not involved with this OpenXR runtime at all other than the lighthouse tracking database. If Resonite expects SteamVR to be present, could be an issue.

Are you running it in desktop or VR?

My understanding was that you use Monado in place of SteamVR. Resonite itself uses the SteamVR SDK (pretty much vanilla), so I'd expect that SDK to be interacting with Monado. That could be a potential source of the crashes too due to some incompatibility/issue between the two.

We don't expect SteamVR explicitly itself - that part is handled by Unity & SteamVR SDK, so it's however these components interact with Monado.

GrandtheUK commented 1 month ago

> This part is really odd to me. Like for the SteamVR integration itself, we are just using the standard SDK as it comes from Valve. I don't know what would be different on that?

@Frooxius Orion corrected/added more detail on this in their previous comment

Frooxius commented 1 month ago

@GrandtheUK @OrionMoonclaw I saw that, but I wasn't sure if that's what @coolymike meant.

I'm not fully sure what they mean by "model space" though in this context?

coolymike commented 1 month ago

> Like I can't think of any quirks that we'd be relying on. Can you be more specific on those?

Unrelated to this issue, but a notable one I have seen is mirrors (or any reflection/refraction) on the Valve Index under Monado. It appears to me the textures are swapped between the eyes (As in, the image on the mirror that should be displaying in the left eye is displaying in the right eye). I don't recall the exact technical details on this issue.

> but we are just copying and translating the skeletal data. I don't think there's anything particularly unusual about that.

I am not a developer for any of the projects (OC or Monado), but specifically Resonite's implementation of copying skeletal data has caused some issues for those projects. From what I can tell (loosely following LVRA chat threads) it is close to being solved in those projects. Since I'm not deep into those projects, I don't know the exact cause of the issue, but the result has been that hands are offset completely wrong.

> I saw that, but I wasn't sure if that's what @coolymike meant.

Yes, those are the issues I'm referring to. I have not updated OpenComposite in a bit though, so I haven't tested if it was already fixed.


I made that comment to indicate that Resonite sometimes doesn't behave like the standard for other VR games. And that a runtime other than SteamVR, although it theoretically should work using the OpenVR protocol, could result in issues not present in other games, including crashes.

Frooxius commented 1 month ago

> Unrelated to this issue, but a notable one I have seen is mirrors (or any reflection/refraction) on the Valve Index under Monado. It appears to me the textures are swapped between the eyes (As in, the image on the mirror that should be displaying in the left eye is displaying in the right eye). I don't recall the exact technical details on this issue.

That sounds like it might be feeding it some wrong/different matrices or other parameters than SteamVR, which is messing things down the line. Though I don't think that's really any quirk of SteamVR - the mirrors don't really have much to do with SteamVR itself. It might be worth a separate report if there's not one, so we can look on what is happening.

> I am not a developer for any of the projects (OC or Monado), but specifically Resonite's implementation of copying skeletal data has caused some issues for those projects. From what I can tell (loosely following LVRA chat threads) it is close to being solved in those projects. Since I'm not deep into those projects, I don't know the exact cause of the issue, but the result has been that hands are offset completely wrong.

I think the main issue is that most games opted for the simpler hand tracking parameters for their systems, while we use the full skeletal model to get all of the data. For a good while, we were one of the few who would actually do that, so I think most systems didn't really implement that full skeletal model properly.

We're not really doing anything super special here though - we are just copying the data we get from the SteamVR itself as we get it.

In any case I don't think this would be related to these crashes.


My guess is it would have something to do with the more "native" bits, like stuff related to core VR rendering itself and such. We need a proper lead to figure that out though.

OrionMoonclaw commented 1 month ago

> I am not a developer for any of the projects (OC or Monado), but specifically Resonite's implementation of copying skeletal data has caused some issues for those projects. From what I can tell (loosely following LVRA chat threads) it is close to being solved in those projects. Since I'm not deep into those projects, I don't know the exact cause of the issue, but the result has been that hands are offset completely wrong.

We just had to properly implement model space skeletal data, initially it was just predefined like parent space data, but this caused some math issues on our side, since model space can't be interpolated as easily, so we just switched to doing everything in parent space and converting when the game requests model space. This actually caused some issues in HL:A because the math was wrong, but Resonite itself wasn't affected due to not relying on bone positions, it only needs orientations (besides the wrist)

I know the knuckles controllers have some issues with positioning, this is likely somewhat on Monado's side, we're gonna have to overhaul how the data is generated there, since that's what in the end gets translated in OpenComposite, and that logic appears to work very well for actual hand tracking on the Quest. Either way if this were to cause a crash it would be obvious, I've seen math-related world crashes while working on estimated skeletal data and they show up in the log file as you would expect.

The other graphical issue could be related to the native build or your system, it's probably not relevant here unless it contributes to the crashes somehow.

OrionMoonclaw commented 1 month ago

I took a look at this merge request that was closed on OC https://gitlab.com/znixian/OpenOVR/-/merge_requests/125 and it looks like it could do something.

I rebased it and added some more locking to be sure, if anyone experiencing these crashes frequently wants to test and report back just set your opencomposite repo in envision to https://gitlab.com/OrionMoonclaw/OpenOVR.git and branch to cached-views-mutex

Remember to set it back at some point though

Frooxius commented 1 month ago

@OrionMoonclaw Oh that's excellent! Thank you so much for looking into this. I had no idea they already had reports on their end to this.

I'm curious if this will help with the crashes.

rayojarr commented 1 month ago

> I took a look at this merge request that was closed on OC https://gitlab.com/znixian/OpenOVR/-/merge_requests/125 and it looks like it could do something.

> I rebased it and added some more locking to be sure, if anyone experiencing these crashes frequently wants to test and report back just set your opencomposite repo in envision to https://gitlab.com/OrionMoonclaw/OpenOVR.git and branch to cached-views-mutex

> Remember to set it back at some point though

I am adding it rn to give it a go. Will let you all know the results. :P

rayojarr commented 1 month ago

So far I've run it with Resonite for a little more than an hour. No crashes yet but will do more testing tomorrow.

OrionMoonclaw commented 1 month ago

Went for about 3 hours today with no issues, though that's hardly unusual for me, we definitely need a good sample size over a week or so

rayojarr commented 1 month ago

Just now crashed with the new opencomposite fork. (Game closes)

opencomposite.log

WOWFPC - 2024.10.8.1349 - 2024-10-12 05_30_11.log

Player.log

OrionMoonclaw commented 1 month ago

Was a shot in the dark, but at least we know it's not that

Try running with PROTON_LOG=1 %command% next time, maybe there will be something interesting in there

rayojarr commented 1 month ago

> Was a shot in the dark, but at least we know it's not that

> Try running with PROTON_LOG=1 %command% next time, maybe there will be something interesting in there

Running it rn, it's already more than 3GiB

rayojarr commented 1 month ago

And now resonite froze..

rayojarr commented 1 month ago

Had to force close it, logs are here (though the proton log file is 5GiB so I can't upload it, and zipping it takes forever):

WOWFPC - 2024.10.8.1349 - 2024-10-12 12_35_21.log

Player.log

shiftyscales commented 1 month ago

Is there anything that seems meaningful in the proton log, @rayojarr? Text compresses very efficiently if you were to zip the log file and upload that here.
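For a multi-gigabyte text log, compression can be done as a stream so the file never has to fit in memory. The following is a minimal sketch using Python's standard `gzip` module; the function name `compress_log` and the chunk size are illustrative choices, and whether the result ends up small enough to attach depends on the log itself.

```python
import gzip
import shutil
from pathlib import Path


def compress_log(src: Path, dst: Path, chunk_size: int = 1024 * 1024) -> int:
    """Stream src into a gzip file at dst and return the compressed size in bytes."""
    # Copy in fixed-size chunks so memory use stays flat even for a 5 GiB log.
    with open(src, "rb") as fin, gzip.open(dst, "wb", compresslevel=9) as fout:
        shutil.copyfileobj(fin, fout, chunk_size)
    return dst.stat().st_size
```

Called as `compress_log(Path("proton.log"), Path("proton.log.gz"))` (hypothetical file names), it returns the compressed size so it is easy to see whether the archive is worth uploading or whether the log needs trimming first.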

rayojarr commented 1 month ago

> Is there anything that seems meaningful in the proton log, @rayojarr? Text compresses very efficiently if you were to zip the log file and upload that here.

Sorry for late reply, am now compressing it and will see if I can upload it when it's done.

rayojarr commented 1 month ago

It failed to upload, the zip may be too large, even though I did max compression. I do however have a Google Drive I can upload it to.

rayojarr commented 1 month ago

Or if you don't want to deal with Google, we can go through a P2P connection through Blaze: https://blaze.vercel.app/app/t/proton-logs

rayojarr commented 1 month ago

But for google drive here's the link in advance since it's getting late: https://drive.google.com/file/d/1MTwFA_JBX13ljuR8oB09SNwylfSqQrzt/view?usp=sharing

GrandtheUK commented 1 month ago

> But for google drive here's the link in advance since it's getting late: https://drive.google.com/file/d/1MTwFA_JBX13ljuR8oB09SNwylfSqQrzt/view?usp=sharing

@rayojarr I just looked over it myself, and there's a lot in the Proton log that probably isn't relevant to Resonite debugging. If it's getting to 5GB of logs then we need to turn some bits of logging off and turn some additional bits on. I would need to find exactly what needs turning on though; it would likely be VR-related logging.
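Until the logging channels are narrowed down, an oversized log that already exists can be reduced after the fact. The sketch below streams a log and keeps only warning and error lines, plus the last few lines before the cut-off. It assumes that Wine/Proton debug lines carry their class as `:warn:` or `:err:` somewhere in the line; that pattern and the helper names are assumptions for illustration, so adjust the regex to whatever the real log looks like.

```python
import re
from collections import deque
from typing import Iterable, Iterator, List

# Assumption: the debug class appears as ":warn:" or ":err:" in each line.
LEVEL_RE = re.compile(r":(warn|err):")


def important_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the lines that look like warnings or errors."""
    for line in lines:
        if LEVEL_RE.search(line):
            yield line


def last_lines(lines: Iterable[str], count: int = 200) -> List[str]:
    """Return the final `count` lines, i.e. what was logged just before the end."""
    return list(deque(lines, maxlen=count))
```

Both helpers accept an open file object, for example `open(path, errors="replace")`, so the log is read line by line rather than loaded whole. The tail is kept separately because whatever was logged right before the process disappeared is likely more interesting than the sheer volume before it.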


Also, I just had a crash of my own. The log file doesn't seem to be interrupted this time, but I am running a few mods (including the log autoflush mod), so I don't know how useful the logs will be. I was using and testing the OpenComposite branch that @OrionMoonclaw previously mentioned here. I had some instability in the middle of my play session quite frequently, though I'm uncertain if that is a Monado+OC issue or a Resonite issue; I will likely figure out at a later date whether that needs to be its own issue here. My logs are below if they need to be looked at.

LINUX-DESKTOP - 2024.10.8.1349 - 2024-10-13 01_10_32.log

Player.log

OrionMoonclaw commented 1 month ago

This should declutter the logs a lot (and also grab DXVK logs), let's see if it shows anything

PROTON_CRASH_REPORT_DIR=~/logs/ PROTON_LOG_DIR=~/logs/ PROTON_LOG=-seh,+warn,+err DXVK_HUD=full DXVK_LOG_PATH=~/logs/ DXVK_LOG_LEVEL=debug %command%

if not, adding +vrclient would show what OpenVR calls are being made at the time of the crash, but that makes the log really big, so last resort

OrionMoonclaw commented 1 month ago

I finally crashed today, and I was able to get a minidump file out of Steam, and it looks like it might be completely unrelated to VR after all, unless it's yet another issue

If anyone else gets a crash definitely try grabbing that dump from /tmp/dumps/ (I think that should be the default)

crash.zip
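To make grabbing the dump less error-prone right after a crash, a small helper can pick the most recently modified file from the dump directory and copy it somewhere safe. This is a sketch only: `/tmp/dumps/` is the location mentioned above and may differ per system, and the function names and the glob pattern are hypothetical.

```python
import shutil
from pathlib import Path
from typing import Optional


def newest_file(directory: Path, pattern: str = "*") -> Optional[Path]:
    """Return the most recently modified regular file matching pattern, or None."""
    files = [p for p in directory.glob(pattern) if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


def collect_newest_dump(dumps_dir: Path, dest_dir: Path) -> Optional[Path]:
    """Copy the newest file from dumps_dir into dest_dir and return the copy's path."""
    newest = newest_file(dumps_dir)
    if newest is None:
        return None  # nothing to collect
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(newest, dest_dir / newest.name))
```

For example, `collect_newest_dump(Path("/tmp/dumps"), Path.home() / "logs")` would copy the latest dump next to the Proton logs, assuming both paths match the local setup.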

OrionMoonclaw commented 1 month ago

It looks like we can also rip out steam_api64.dll without too many issues, so that might be useful for testing

rayojarr commented 1 month ago

Just crashed as well. Here are the logs:

crash.zip