Ylianst / MeshCentral

A complete web-based remote monitoring and management web site. Once set up, you can install agents and perform remote desktop sessions to devices on the local network or over the Internet.
https://meshcentral.com
Apache License 2.0
4.15k stars 557 forks

Machines appearing offline on meshcentral console when machine is on #556

Open geanferrani123 opened 5 years ago

geanferrani123 commented 5 years ago

Guys, good morning!

I am noticing that in some cases machines with an agent installed go offline in MeshCentral. To resolve this I have to reinstall the agent. Is there some action I can take so that this does not occur? Does the agent always come online as soon as the machine is turned on?

wvt-adm commented 5 years ago

I have this problem too... good to know I'm not alone ;-)

Ylianst commented 5 years ago

I will look into this. I made changes in that area yesterday, should be easy to fix.

Ylianst commented 5 years ago

Trying to track this problem. To confirm, the agent is online but shows as offline in the picture below? I can't make the problem happen now.

x

geanferrani123 commented 5 years ago

Ylianst, for machines that are properly connected but appear offline in the MeshCentral console, it looks like this:

image

All machines appearing online in the meshcentral console are correctly displaying online agent status, see:

image

wvt-adm commented 5 years ago

Yes, that's right...

MailYouLater commented 5 years ago

While these machines are saying they're offline, are you able to use any remote management features? or are they completely offline, even though the machine is on? Also, is this happening at random, or is it happening after some specific event? (e.g. are they disconnecting when the machine goes to sleep? or are they not reconnecting after the computer has been restarted?)

geanferrani123 commented 5 years ago

While these machines are saying they're offline, are you able to use any remote management features? Or are they completely offline, even though the machine is on? Also, is this happening at random, or is it happening after some specific event? (e.g. are they disconnecting when the machine goes to sleep, or are they not reconnecting after the computer has been restarted?)

they are completely offline

MailYouLater commented 5 years ago

And the other part of the question? (i.e. Is this happening when a computer goes to sleep, or after a computer has been restarted?)

Ylianst commented 5 years ago

Ok, I did not understand. The title says "offline on meshcentral console", so I was under the impression it was the "console" tab. This is not the case, it's just offline in MeshCentral.

Yes, the agent should come back online each time the computer is started if the agent is installed. You will have to indicate what OS and version you are using, and what agent type (32 or 64 bit).

geanferrani123 commented 5 years ago

Ylianst, good morning!

Yes, that's what I said earlier. The whole problem is that the machine is not listed as online even though it was turned on normally. When checking the agent status, it appears as offline in the console.

geanferrani123 commented 5 years ago

The machines are Windows 10 with the 64-bit agent.

MailYouLater commented 5 years ago

This may be related to, or a duplicate of #514.

If this is a dupe of #514, then setting the "Mesh Agent" service to "Automatic (Delayed Start)" mode, instead of "Automatic", should help.
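For anyone who prefers to script that change rather than use the Services panel, here is a minimal Python sketch. It relies on the standard Windows `sc config ... start= delayed-auto` command. The assumption that the service's key name is literally "Mesh Agent" comes only from the name used in this thread, and the helper names `delayed_start_command` and `set_delayed_start` are made up for this example.

```python
import subprocess

# Assumption: the service key name matches the "Mesh Agent" name used in this thread.
SERVICE_NAME = "Mesh Agent"


def delayed_start_command(service_name=SERVICE_NAME):
    """Build the sc.exe argument list that sets a service to delayed automatic start."""
    # sc.exe expects the option name "start=" and its value as separate tokens.
    return ["sc", "config", service_name, "start=", "delayed-auto"]


def set_delayed_start(service_name=SERVICE_NAME):
    """Run the command; Windows only, needs an elevated (administrator) prompt."""
    return subprocess.run(delayed_start_command(service_name), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so nothing changes by accident.
    print(" ".join(delayed_start_command()))
```

Calling `set_delayed_start()` only changes the startup type. It does not start or restart the service.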

geanferrani123 commented 5 years ago

This may be related to, or a duplicate of #514.

If this is a dupe of #514, then setting the "Mesh Agent" service to "Automatic (Delayed Start)" mode, instead of "Automatic", should help.

Hello friends, how are you?

I will test this suggestion, MailYouLater. I believe it will work: on one of the machines that shows this problem, I saw that the service had not even started. So in these cases the service is either being stopped or is not starting.

I will test this and let you know.

geanferrani123 commented 5 years ago

Ok, I did not understand. The title says "offline on meshcentral console", so I was under the impression it was the "console" tab. This is not the case, it's just offline in MeshCentral.

Yes, the agent should come back online each time the computer is started if the agent is installed. You will have to indicate what OS and version you are using, and what agent type (32 or 64 bit).

Hello Ylianst, all right?

Friend, I'm sorry, as I don't speak English very well, I'm having to use google translator to communicate with you, so maybe that's why you didn't understand me.

geanferrani123 commented 5 years ago

This may be related to, or a duplicate of #514.

If this is a dupe of #514, then setting the "Mesh Agent" service to "Automatic (Delayed Start)" mode, instead of "Automatic", should help.

Friend, that did not solve it. On reflection it does not make much sense either, because the agent has no dependencies that would call for a delayed start.

Ylianst, do you have any suggestions on what I can do? does upgrading to * -e solve this problem as well?

geanferrani123 commented 5 years ago

Guys, can you help me? I saw that when this happens, it is because the service actually stops.

MailYouLater commented 5 years ago

Have you checked the Event Viewer to see if it logged some useful information for diagnosing the issue?

geanferrani123 commented 5 years ago

Yes, here they are, in case they help:

image

image

image

image

image

image

image

geanferrani123 commented 5 years ago

These screenshots are from my machine. Since I reinstalled the agent, I have not had this problem so far.

geanferrani123 commented 5 years ago

If you need to use Google Translate, the language is Brazilian Portuguese.

MailYouLater commented 5 years ago

You said the server was working, but some machines weren't appearing even though they're on. Those screenshots are relating to the MeshCentral server, and have nothing to do with the agent. Perhaps it's better if you check a computer that's not connecting to the server for a .log file next to the meshagent executable (after you've seen the problem happen again).

geanferrani123 commented 5 years ago

Friend, these captures were taken on my workstation, I'm not running the server on it.

Where can I go to get the agent log?

MailYouLater commented 5 years ago

The agent installs to C:\Program Files\Mesh Agent, the server installs (via the installer) to C:\Program Files\Open Source\MeshCentral. Those screenshots are of event logs for the server, not the agent.
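To illustrate where to look, here is a small Python sketch that lists any .log files sitting in the agent folder. The default path is the one given above. The function name `find_agent_logs` is made up for this example, and the exact log file name may vary between agent versions.

```python
from pathlib import Path

# Default agent install folder as mentioned in this thread (Windows).
DEFAULT_AGENT_DIR = Path(r"C:\Program Files\Mesh Agent")


def find_agent_logs(agent_dir=DEFAULT_AGENT_DIR):
    """Return the .log files found directly inside agent_dir, newest first."""
    agent_dir = Path(agent_dir)
    if not agent_dir.is_dir():
        return []  # agent not installed here, or a different install path
    logs = [p for p in agent_dir.iterdir()
            if p.is_file() and p.suffix.lower() == ".log"]
    return sorted(logs, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for log in find_agent_logs():
        print(log, log.stat().st_size, "bytes")
```

An empty result just means no .log file was found in that folder. It does not prove the agent never crashed.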

geanferrani123 commented 4 years ago

Friends, see if this is correct:

image

This screenshot is from my workstation. During the day I will see if I can get one from another station for more information.

MailYouLater commented 4 years ago

That's the .log file I mentioned. It looks like the meshagent is crashing. @krayon007 will likely want to see a core dump. Can you add a line that says coreDumpEnabled=1 to the .msh file in the same folder, then start the meshagent, and next time it crashes send the .dmp file it creates to @krayon007?

geanferrani123 commented 4 years ago

Sure, but I think on my machine it will be hard to happen again because when there was the problem I reinstalled the agent and it didn't happen again. I need to find another machine to better simulate the problem and collect the information you mentioned.

geanferrani123 commented 4 years ago

Hello Guys, How are you?

See one more error log below:

image

This machine went offline out of nowhere in MeshCentral and then came back to normal.

Ylianst commented 4 years ago

This is an agent problem for @krayon007. For reference this error with Delta 346466 points to this line number:

[BaseAddr: 0x00007ff6fc7c96dc, duk__mark_heaphdr => c:\tmp\meshagent\meshservice\duk_heap_markandsweep.c:188]

It looks like there is an error in the garbage collection. Also note that you can add "coreDumpEnabled=1" to the meshagent.msh file like this (added as the last line):

MeshName=Lab-Computers
MeshType=2
MeshID=0xEDBE1B...
ServerID=D99362...
MeshServer=wss://devbox.mesh.meshcentral.com:443/agent.ashx
ignoreProxyFile=1
webSocketMaskOverride=1
coreDumpEnabled=1

Then, restart the agent and on the next crash the agent will generate a large dump file that you can send to @krayon007 for review. Because the dump file could contain private information, it's best to password protect it / send it directly.
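If the same edit has to be made on several machines, it could be scripted. The following is only a sketch: the helper name `enable_core_dump` is made up for this example, and it assumes the .msh file is plain text that can be read as UTF-8. It appends the line only when no coreDumpEnabled entry is present, and it does not restart the agent.

```python
from pathlib import Path


def enable_core_dump(msh_path):
    """Append coreDumpEnabled=1 to the .msh file unless a coreDumpEnabled line exists.

    Returns True if the file was changed, False otherwise.
    """
    path = Path(msh_path)
    text = path.read_text(encoding="utf-8")  # assumption: UTF-8 plain text
    for line in text.splitlines():
        if line.strip().lower().startswith("coredumpenabled="):
            return False  # already configured; leave the file untouched
    if text and not text.endswith("\n"):
        text += "\n"  # make sure the new setting starts on its own line
    path.write_text(text + "coreDumpEnabled=1\n", encoding="utf-8")
    return True
```

Writing into the agent folder normally needs administrator rights, and the agent still has to be restarted afterwards for the setting to take effect.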

One more thing, you can add the following line to the domain setting of MeshCentral to have all agents installed with core dumping enabled.

"agentConfig": [ "coreDumpEnabled=1" ]

When installing the agent, the msh file will have this extra line added by the server.
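As a rough sketch of what that change looks like on a parsed configuration, the Python below adds the entry to one domain's agentConfig list. It assumes the server configuration is JSON with a top-level "domains" object keyed by domain name, with "" as the default domain. The helper name `add_agent_config` is made up for this example.

```python
import json


def add_agent_config(config, domain="", entry="coreDumpEnabled=1"):
    """Add entry to the agentConfig list of one domain in a parsed config dict."""
    # Assumption: domains live under a top-level "domains" key, "" being the default.
    domain_settings = config.setdefault("domains", {}).setdefault(domain, {})
    agent_config = domain_settings.setdefault("agentConfig", [])
    if entry not in agent_config:
        agent_config.append(entry)  # avoid adding the same line twice
    return config


if __name__ == "__main__":
    sample = {"settings": {}, "domains": {"": {"title": "Example"}}}
    print(json.dumps(add_agent_config(sample), indent=2))
```

The sketch only edits a dictionary in memory. Loading and saving the real configuration file, and restarting the server, are left out on purpose.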

geanferrani123 commented 4 years ago

Excellent, as this is a test computer, it will be possible to apply this definition more easily, I will try to simulate the problem and post the result.

Ylianst commented 4 years ago

Quick update: @krayon007 is able to make a similar error happen on the agent and is looking into it. That said, I only have one more work day, tomorrow, and then I will be traveling a lot for 3 weeks. I don't want to risk releasing a new MeshAgent unless it's going to get a lot of testing. So, I will probably wait until I get back to publish a new agent.

Also, this is a duplicate of an issue reported on MeshAgent #21. But let's keep both open for now, I don't go in the MeshAgent GitHub much.

Ylianst commented 4 years ago

Just to make it clear, this line:

"agentConfig": [ "coreDumpEnabled=1" ]

Will add this line to the .msh when you download an agent from the server. Existing installations will not be affected.

geanferrani123 commented 4 years ago

Guys, here is another log from another station.

Untitled

Ylianst, I will try to enable the function you mentioned above today to simulate the problem.

geanferrani123 commented 4 years ago

Friends, I have not posted the result of the core dump because I could not reproduce the problem. Yesterday something odd happened: a machine went offline, I restarted the service and it did not come back. I had to reinstall the agent on the station in question and restart the machine for it to come back online.

ckibodeaux commented 4 years ago

Sorry to resurrect an old thread, but seeing as it's still an open issue I thought it best to post here as I'm having the same problem. Random machines (so far has happened to 4 of them) show the agent as offline. When looking at the Windows service, it says the agent is running. Restarting the service brings the agent back online. Nothing in the log file. Running latest version 0.5.46.

agent-offline service-running MeshAgent.log

krayon007 commented 4 years ago

I'm adding some code to the Agent, so you'll be able to run another instance of the agent from the command line, to connect to the service instance, to do some diagnostic/testing, to see what the agent is doing...

ckibodeaux commented 4 years ago

I haven't seen it happen again since I've posted the comment. Perhaps one of the updates resolved whatever was causing the original issue.

krayon007 commented 4 years ago

The next agent update contains some additional command line parameters that will be able to aid in troubleshooting, should you have this issue again. I added a -state parameter, so you can connect to the service instance, and the service instance will dump its current state, so you'll be able to see if the service is still responsive, what descriptors are open, and its connection state with the server.

ckibodeaux commented 4 years ago

This happened again. Services says it's running but when I try MeshAgent.exe -state I get Unable to contact Mesh Agent... I'm on version 0.5.55. There is nothing in the MeshAgent.log file. I copied the MeshAgent.msh and MeshAgent.db file in case it can help. After restarting the service, everything appears fine.
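For anyone wanting to automate this check, a minimal Python sketch follows. It runs the agent with the -state parameter described above and looks for the "Unable to contact Mesh Agent" message. The helper names are made up for this example, and whether the message goes to stdout or stderr is an assumption, so both are captured.

```python
import subprocess

# Message reported in this thread when the service instance does not answer.
UNREACHABLE_MARKER = "Unable to contact Mesh Agent"


def service_instance_reachable(state_output):
    """Return False if the -state output contains the 'unable to contact' message."""
    return UNREACHABLE_MARKER.lower() not in state_output.lower()


def query_agent_state(exe_path, timeout=30):
    """Run '<exe_path> -state' and return its combined stdout and stderr text."""
    result = subprocess.run(
        [exe_path, "-state"], capture_output=True, text=True, timeout=timeout
    )
    # Assumption: the message may land on either stream, so return both.
    return result.stdout + result.stderr
```

A scheduled task could call `query_agent_state` and restart the service when `service_instance_reachable` returns False. Note that the check only detects that one message, so empty output still counts as reachable.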

joeydc commented 3 years ago

This is a wonderful project but I've been having this issue for so long now. It's kind of unreliable since you don't know when the agent will go offline. I have 2 MeshCentral instances with this random issue: one for personal use (Arch Linux VM behind an NGINX reverse proxy) and one for work (Debian 10 hosted on a VPS). It happens with agents on Windows 7, Windows 10 and Windows Server, but I have never seen it happen on Linux.

Does anyone know if there's any progress on fixing this?

baishi commented 3 years ago

I managed to achieve 35 minutes idling uptime for now after fiddling with nginx a little.

Namely, adding agentPong: 30 to the server config and setting ssl_buffer_size 8; on nginx. Note that proxy_buffering off does not help at all in my situation.

Refer to #2831 for details.

dinger1986 commented 11 months ago

maybe docs for this but working @si458