Ylianst / MeshCentral

A complete web-based remote monitoring and management web site. Once set up, you can install agents and perform remote desktop sessions to devices on the local network or over the Internet.
https://meshcentral.com
Apache License 2.0
Mesh causing BSOD on server #1597

Open CCWTech opened 4 years ago

CCWTech commented 4 years ago

We are seeing a strong correlation between connecting with Mesh and one of our servers crashing. It doesn't happen every time, but it happens often enough when connecting to the server using Mesh. What logs can I provide to troubleshoot this?

We use teamviewer or screenconnect and don't have the same issue. Running Server 2016 Standard.

[Attached screenshot: 2020-07-10_12-34]

Ylianst commented 4 years ago

Are you saying the MeshCentral server is crashing? Or the agent on a remote machine is causing the remote machines to crash? If it's the server, what version of NodeJS are you using?

CCWTech commented 4 years ago

Mesh Central server is fine, it's a remote machine that is crashing. Sorry, I should have been more clear.

Ylianst commented 4 years ago

Ok, so "happen consistently when connecting to the server using Mesh". Do you mean that it happens when you connect to Remote Desktop? Or when you do something else?

CCWTech commented 4 years ago

Yes, when on Desktop and you click "Connect"

Ylianst commented 4 years ago

Thanks. Bryan will have to give that a try on a Microsoft Server 2016 Standard.

CCWTech commented 4 years ago

We have a number of Server 2016 Std. installs. It seems to be just this machine for some reason.

PathfinderNetworks commented 4 years ago

I've had that happen randomly with Server 2016 boxes as well. Seems to be less than it used to be. Now it's rare (thankfully!!) that I see it.

CCWTech commented 4 years ago

Thanks for the information, that's very helpful!

PathfinderNetworks commented 4 years ago

Sadly, I haven't been able to track down why it happens though. It doesn't seem to happen with Server 2019 from what I can tell, and not on all 2016 boxes. If anything it seems to happen more with Lenovo ThinkServers running Server 2016, but that might just be anecdotal because I have more of those out there than anything else.

CCWTech commented 4 years ago

Mine is also a Lenovo ThinkServer.

Ylianst commented 4 years ago

In general, it should not be possible to crash any operating system with an application from user space (Ring 3). Generally, this happens because of bad drivers, bad hardware, or an OS kernel bug. Bryan tests the MeshAgent in virtual machines for many different operating systems, but it's unlikely he can replicate this since he will not be running the same drivers/hardware. I will look up the stop code and see what I can find.

ninanoe commented 4 years ago

I have the same issue, MeshCentral Agent crashing with BSOD win32kbase.sys.

This happens on Server 2016 (latest patches etc.) and also on Server 2019 (latest patches).

And I'm running MeshCentral 0.5.85.

CCWTech commented 4 years ago

Is that on Lenovo Thinkserver by any chance?

ninanoe commented 4 years ago

No, I'm running an HP DL380 G7 (2016), and the 2019 server is on a Gigabyte motherboard with an i5.

CCWTech commented 4 years ago

Hi, is there any progress on this?

krayon007 commented 4 years ago

I haven't been able to reproduce it yet... I'll look through the laptops I have in the lab and grab a few to test, to see if I can get it to BSOD.

CCWTech commented 4 years ago

Thanks.

b8two commented 3 years ago

Hi All,

We have been observing this with Server 2016 also.

When a machine is connected to with Mesh after first boot, using Ctrl+Alt+Del to log in, it works normally.
If the machine is accessed via Remote Desktop, it works normally.
If the machine is then accessed via Mesh, the authentication screen is shown, but once authenticated, BSOD.
This can also occur when accessing via Mesh Router & RDP.

Hence this appears to be related to the User Switching Service.

This occurs in Azure also, hence it is not a hardware related issue.

If it helps, issue was seen on version 0.5.0-s, upgraded to version 0.6.62 and still occurs.

CCWTech commented 3 years ago

Updates on this? Mesh is still crashing servers.

Server 2016 Today.

bcallar commented 2 years ago

Same issue today on virtual and physical servers, both running Windows 2016 Std. Any news about this issue?

krayon007 commented 2 years ago

Are your physical machines Lenovo?

bcallar commented 2 years ago

Hi,

Yes it is, a Lenovo ThinkSystem ST550 (Model N° CTO1WW).

Hope it can help you. We switched to another remote control solution for the moment, but Mesh is a very good solution that we would like to keep.

Regards,
Ben

michaelsage commented 1 year ago

I'm having this issue with a server 2016 VM (running on ESXi). It only happens the first time I connect, when I reconnect after the reboot it appears to work. Did anyone ever manage to find a solution?

myde2001 commented 1 year ago

This still occurs. I'm having this issue in a Server 2016 VM (running on ESXi).

si458 commented 1 year ago

@myde2001 check #4759. It appears to be crashing randomly when you log out from the server. If you use RDP there are no issues.

myde2001 commented 1 year ago

@si458 can confirm, it happened when I logged out

dinger1986 commented 1 year ago

Try and remove VMware tools and see if it still does it

si458 commented 8 months ago

Keeping this open. It seems that on Server 2016 and Server 2019, when you use the normal 'Connect' button, then log in, do some work, then 'Logout'/'Sign Out', the server crashes, HARD.

The workaround for the moment is to NOT 'Sign Out' when you are finished, but simply use the 'Disconnect' button.

Needs investigating.

silversword411 commented 8 months ago

Server 2016 has lots of weird quirks...

joeldeteves commented 3 months ago

This issue appears to be happening for us in one environment as well (Also VMWare). Patching VMWare Tools did not fix it.

Thank you,

dinger1986 commented 3 months ago

Interestingly enough happened to me during the week, need to look at it again. Don't even know if VMware tools is installed on that server

si458 commented 2 months ago

What OS are people using to cause this crash? I was gunna try to have a look at it, but I can't seem to replicate this bug anymore with Server 2019 fully patched up to today.

joeldeteves commented 2 months ago

It consistently happens with Server 2016 running on VMWare ESXI.

Other versions eg 2019 seem fine from my testing.

si458 commented 2 months ago

thanks @joeldeteves will create a 2016 vm now and test (i use proxmox so not sure if it makes a difference or not)

myde2001 commented 2 months ago

I think this issue has something to do with VMware drivers (VMware Tools)

si458 commented 2 months ago

@myde2001 no, that's incorrect, because I had a PHYSICAL Server 2016 machine and it crashes when I do the connect, login, use, then sign out. But it's a production machine so I can't use it again for testing, so I'm just gunna create a VM for testing in the meantime.

myde2001 commented 2 months ago

Okay, great, let's hope we can find the root cause then

si458 commented 2 months ago

I'm guessing it's something to do with the way we look for logged-in users. When you sign out, maybe the lookup of users is broken on Server 2016, so it just crashes? But it's gunna be hard to debug with Visual Studio when you need to log out at the same time to cause the crash haha. I guess I'll print a lot of messages to text files to find what line it's crashing before/after haha.
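For anyone wanting to try the "print messages to text files" approach, here is a minimal sketch of the idea in Python. It is not MeshAgent code (the agent is not written in Python), and the names `breadcrumb`, `last_breadcrumb` and the log path are made up for illustration. The one detail that matters for a hard crash is forcing each line to disk before continuing, otherwise the last messages can be lost in the OS write cache when the machine goes down.

```python
import os
import time
from typing import Optional


def breadcrumb(path: str, message: str) -> None:
    """Append one timestamped line and force it to disk before returning."""
    line = f"{time.strftime('%Y-%m-%d %H:%M:%S')} {message}\n"
    with open(path, "a", encoding="utf-8") as f:
        f.write(line)
        f.flush()             # push Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to write the file to disk


def last_breadcrumb(path: str) -> Optional[str]:
    """Return the last line written, or None if nothing was logged yet."""
    if not os.path.exists(path):
        return None
    with open(path, "r", encoding="utf-8") as f:
        lines = f.read().splitlines()
    return lines[-1] if lines else None
```

Usage would be a `breadcrumb(log_path, "before user lookup")` / `breadcrumb(log_path, "after user lookup")` pair around each suspect call. After the reboot, `last_breadcrumb(log_path)` shows the last step that completed before the crash. Syncing on every line is slow, so this is only meant for a debug build.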

CCWTech commented 2 months ago

> I think this issue has something to do with VMware drivers (VMware Tools)

Probably not. We don't use VMWare at all.

aelfwine88 commented 1 month ago

Happens here as well with multiple Win2016Std (physical machines) and with MeshCentral v1.1.27 behind a reverse proxy.

b8two commented 3 weeks ago

This bug occurs with Win2016Std (physical machines) and Hyper-V VMs.

It often occurs when you switch between Mesh Desktop access and direct RDP sessions (i.e. if you boot and never use an RDP session, it is okay; if you use RDP and afterwards use Mesh Desktop access, I have often observed a BSOD).

But since Server 2016 will be EOL in a year and a bit, this might not be an important bug to squash.