Closed ikanimew closed 1 month ago
The original issue edited out the following information that is rather important:
Fatal error. System.AccessViolationException: Attempted to read or write protected memory.
This is often an indication that other memory is corrupt.
Source: https://discord.com/channels/1040316820650991766/1154514015721099294/1262617355905794100
I cannot find it in the logs since it is a fatal error, so I will include it in this comment.
This looks like an issue with Convex Hull, which is related to #1908 (not a duplicate) and a regression of #1198.
I also believe this is missing the prerelease label, since this is the .NET 8 headless.
1) Does this happen with a particular world, or can you replicate it on a gridspace too? 2) If so, which world does this happen with?
What is odd here is that you state that this happens with a collider event, but the log doesn't show that - instead it seems to happen when a convex hull is being computed (which is not a collider event). So this might be specific to a particular collider, or to a collider being modified.
Okay, so I did a bunch of testing and was able to narrow it down to a cone collider on my avatar that would cause the crash. I could pretty reliably get the headless to start timing out, then crash if I spawned that avatar out. I then modified the avatar to remove that collider from it, and the crashes stopped. This was confirmed in both my regular world and a grid world, as well as with a fresh install of the headless (moving the install directory out of the way and re-downloading, so everything but the config file was fresh).
Here's the collider in question.
If desired, I can share a copy of that avatar with staff to examine. I believe this is the cause, though.
@ikanimew I see that the ConeCollider Radius is being driven. Do you know what range of values it goes through?
Doing a quick check, it looks like the driver is a ValueGradientDriver that goes from 0.04 to 0.015 and is linked to the state of a blendshape. The position and rotation drives are similar.
Hmm... I checked the code and it seems they are clamped. I couldn't replicate the issue even with nonsensical values (like NaN, Infinity, negative and so on).
Are you able to make a replication item that's able to reproduce this issue?
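For readers following along, "clamped" here means that out-of-range inputs are forced into a valid range before they are used. A minimal Python sketch of that idea follows. The function name `sanitize_radius`, the bounds, and the fallback are hypothetical and are not taken from the engine's code, which this thread does not show.

```python
import math


def sanitize_radius(value, lower=0.0, upper=1000.0, fallback=0.0):
    """Force a radius into [lower, upper]; bounds and fallback are hypothetical."""
    if math.isnan(value):
        # NaN compares false against everything, so handle it explicitly.
        return fallback
    # +Infinity collapses to upper; -Infinity and negatives collapse to lower.
    return max(lower, min(upper, value))
```

With a guard of this shape, NaN, Infinity, and negative inputs all end up as ordinary finite values, which would be consistent with nonsensical values not reproducing the crash.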
This issue only occurs on Linux headless with .NET 8. I have tested with @ikanimew, and no issues were observed on the Windows build and release. The crash happens with a specific cone collider on a stinger, which is part of an avatar's tail.
On the Linux headless server, as soon as a user joins the session, the server begins giving "Engine unresponsive" messages, followed by the fatal error: System.AccessViolationException: Attempted to read or write protected memory.
@ikanimew placed the stinger object on a cube, and spawning it in a Linux headless session consistently caused the crash. The cone collider's radius is driven by a ValueGradientDriver ranging from 0.04 to 0.015.
This issue is isolated to the Linux build of the .NET 8 headless. Sometimes the issue can take up to a minute to occur. I am assuming it is related to the convex hull calculation.
Is this issue exclusive to running the Linux headless client then, @ikanimew @Xlinka? Can you confirm that it does not replicate when running the headless under Windows?
If so, would you be able to cooperate with other users to test the Linux headless client on other systems and see if it still occurs there too?
Would you also mind attaching the replication object to this issue for ease of access in additional tests, as Frooxius requested above?
I tested loading Ikani on two of my headless servers running Windows 11 (10.0.22621.3880) earlier today, and both loaded without issues and did not crash. I have asked @ikanimew for the replication object for additional tests and am awaiting a response.
The Linux headless that crashed belongs to @ikanimew. It crashed in both their Hive world and a normal gridspace, which rules out an item in the world as the cause.
Ikani notes that "So I was able to reproduce by just making an empty slot, adding a cone collider, setting the height to 0.05 and driving the radius with a value gradient driver set to 0.04 for 0 and 0.015 for 1, and the progress at 0."
I shall carry out a test for this on my own Linux headless.
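The setup Ikani describes can be sketched numerically as follows. This is only an illustration of the values involved: it assumes the ValueGradientDriver interpolates linearly between its two points and keeps progress within 0 to 1, which the thread does not confirm, and `driven_radius` is a hypothetical helper name.

```python
def driven_radius(progress, start=0.04, end=0.015):
    """Interpolate the cone radius between the two gradient points."""
    # Assumption: progress is kept within [0, 1] and interpolation is linear.
    t = max(0.0, min(1.0, progress))
    return start + (end - start) * t


CONE_HEIGHT = 0.05  # height from the reproduction steps

for p in (0.0, 0.5, 1.0):
    print(f"progress={p:.1f} radius={driven_radius(p):.4f} height={CONE_HEIGHT}")
```

With the progress left at 0, as in the reproduction, the radius sits at the first gradient point, 0.04.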
Interesting. Does it require those specific values in the ValueGradientDriver, or does the issue occur with any arbitrary values so long as the radius is being driven?
Xlinka attempted the test on one of TheRoxDen Debian 12 servers running Beta 2024.7.17.1173.
Could not reproduce. Log: TRD-42400 - 2024.7.17.1173 - 2024-07-19 01_42_46.log
console.log
Alright, so to rule out my existing Debian 12 host, I spun up a new Debian 12 VM, ran through a default OS install (no GUI, standard packages, SSH server), then installed dotnet8 and steamcmd and launched a grid world. I was able to join the grid and saw no stability issues. I then created an Empty Slot and added a cone collider to the slot. As soon as I did this, the console (line 936) started reporting "Engine unresponsive" for just over a minute. At the end of that (line 1002) the collider component appeared in the inspector. I gave it a moment, then set the collider to Trigger, and the Unresponsive messages continued until the crash message listed.
This log is the full console log, to show the install process as well. I loosely based my install steps on the Dockerfiles in https://github.com/voxelbonecloud/debian-dotnet and https://github.com/voxelbonecloud/headless-docker though this is NOT a Docker-based deployment.
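For anyone comparing their own console output against this one, a small Python sketch like the following can locate the relevant lines. It only searches for the two strings quoted in this thread ("Engine unresponsive" and "System.AccessViolationException"); the exact format of the surrounding log lines is not assumed, and the function names are hypothetical.

```python
def scan_console_log(lines):
    """Return (line numbers containing 'Engine unresponsive', first fatal line or None)."""
    unresponsive = []
    fatal_line = None
    for number, line in enumerate(lines, start=1):
        if "Engine unresponsive" in line:
            unresponsive.append(number)
        # Record only the first occurrence of the fatal error.
        if fatal_line is None and "System.AccessViolationException" in line:
            fatal_line = number
    return unresponsive, fatal_line


def scan_console_file(path):
    """Run the same scan over a log file on disk."""
    # errors="replace" keeps the scan going if the log contains stray bytes.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return scan_console_log(handle)
```

Calling `scan_console_file("console.log")` on a saved console log returns the 1-based line numbers, which makes it easier to see how long the unresponsive period lasted before the crash.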
And for clarification, I do not have the original collider on my avatar any longer, so I believe that's not what's triggering the hangs and crashes.
To add to this, I spun up a new VM in DigitalOcean and followed the same steps as above, and the issue does NOT happen there. I'm going to do a bit more testing here, but this just keeps getting weirder.
Thank you @ikanimew. Hopefully you are able to better isolate the source of the issue.
So, doing more tests, this is definitely limited to the specific hardware of my datacenter server. I've been unable to reproduce the error anywhere else, including on an identical server at home. Going to close this for now and look into hardware replacement.
Describe the bug?
I am running into an issue where my headless works fine under mono, but under dotnet8, as soon as a user joins the session and then interacts with a collider, the headless crashes with variations on the following error:
To Reproduce
1) Headless install on a Debian 12 system, running from an existing mono install.
2) Using the same config file, launch from dotnet8.
3) Join the hosted session.
4) Wait without moving around for the world to finish loading.
5) Interact with a collider.
6) The headless will begin giving "Engine unresponsive" messages, then the fatal error.
Expected behavior
The session should remain responsive during collider events.
Screenshots
No response
Resonite Version Number
2024.7.15.1359
What Platforms does this occur on?
Linux
What headset if any do you use?
Vive Pro Eye
Log Files
DISPLACER - 2024.7.15.1359 - 2024-07-15 20_08_12.log (client log)
resonite - 2024.7.15.1359 - 2024-07-15 20_25_26.log (headless log)
stdout.log (headless stdout log)
Additional Context
I'm unsure if the issue is code, something with my world, or something with the install. I'm happy to swap those around to test that further but I'd like to see if the current logs can provide some insight first.
Reporters
Ikani