Open b8two opened 3 years ago
Hi. In general, CIRA is difficult to debug because Intel AMT gives you no information about why it can't connect. First, note that the CIRA connection is initiated by Intel AMT below the operating system, and Intel AMT can only use its managed network interfaces to make the connection. So the device needs to be connected over Ethernet or Wi-Fi using the network interface that is attached to Intel AMT. In addition, most Intel AMT versions do not support HTTP proxies, so if the device is behind one, it's not going to work. Starting with AMT 12, I think, there is HTTP proxy support, but MeshCentral would have to configure it, which it does not do right now.
The Intel AMT port for CIRA on MeshCentral is port 4433 by default. This port is not always accessible from other networks due to firewalls. To check this, try to connect to your server from the Intel AMT machine using a browser on port 4433 with HTTPS... something like https://myserver:4433. CIRA is a binary protocol, but if MeshCentral detects the request looks like an HTTP request, it will respond with HTTP, great for testing. This is always the first test I personally do.
Port 4433 is set up with a special TLS certificate that Intel AMT will trust. You can't change this certificate, and since the protocol is binary, you can't do anything with a reverse proxy except forward your external port 4433 directly to MeshCentral.
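The browser test described above can also be scripted. The following is a minimal Python sketch, not part of MeshCentral: `probe_mps` and `looks_like_mps_page` are hypothetical helper names, and the banner text is the one quoted later in this thread. Certificate verification is switched off on the assumption that the machine running the script may not trust the MPS certificate.

```python
import socket
import ssl

# Banner text quoted later in this thread as the "correct output" of the test page.
MPS_BANNER = "MeshCentral MPS server"


def looks_like_mps_page(http_response: str) -> bool:
    """Return True if the HTTP response text contains the MPS test banner."""
    return MPS_BANNER.lower() in http_response.lower()


def probe_mps(host: str, port: int = 4433, timeout: float = 5.0) -> bool:
    """Open a TLS connection, send a plain HTTP GET, and look for the banner."""
    context = ssl.create_default_context()
    # Assumption: the MPS certificate is meant for Intel AMT and may not be
    # trusted by this machine, so verification is skipped for this
    # reachability test only.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    request = f"GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(request.encode("ascii"))
            chunks = []
            while True:
                data = tls.recv(4096)
                if not data:
                    break
                chunks.append(data)
    body = b"".join(chunks).decode("utf-8", errors="replace")
    return looks_like_mps_page(body)
```

Calling `probe_mps("myserver")` from the same network as the Intel AMT machine should raise a connection error or timeout if a firewall blocks the port, and return `True` if the MPS port answers with the test page.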
For the unknown node, I get that too sometimes. In general, the MeshAgent looks at the Intel AMT UUID and sends it to the server. When an incoming CIRA connection is established, the UUID is used to match the device. If no device has the same UUID, the server can't find that device.
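To make the matching step concrete, here is a small Python sketch of a lookup keyed by UUID. It is only an illustration of the idea, not MeshCentral's code; `build_uuid_index`, `match_cira_connection`, and the node IDs are hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple


def normalize_uuid(uuid: str) -> str:
    """Compare UUIDs case-insensitively and ignore surrounding whitespace."""
    return uuid.strip().lower()


def build_uuid_index(agents: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each UUID reported by an agent to its node ID (hypothetical data)."""
    return {normalize_uuid(uuid): node_id for node_id, uuid in agents}


def match_cira_connection(amt_uuid: str, index: Dict[str, str]) -> Optional[str]:
    """Return the node ID for an incoming CIRA UUID, or None for an unknown node."""
    return index.get(normalize_uuid(amt_uuid))
```

A `None` result corresponds to the "unknown node" case: the CIRA connection arrived with a UUID that no agent has reported.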
For tracing, you can always turn on the CIRA server tracing in the "My Server / Tracing" tab.
I hope that helps for a start, indeed, not easy to debug.
Hi Ylianst,
Thanks for your reply.
I've enabled the tracing, but it isn't very helpful. The log has no additional information, such as an IP address, to tell whether the new / closed connections come from one device or many. The couple of closed connections with mesh and node hashes aren't descriptive either; all I can tell is that two known devices from the same mesh have closed the connection. If the connection had dropped instead, would the log message be different?
12:02:07 PM - MPS: CIRA connection closed
12:02:07 PM - MPS: New CIRA connection
12:02:07 PM - MPS: CIRA connection closed
12:01:59 PM - MPS: New CIRA connection
12:01:58 PM - MPS: CIRA connection closed
12:01:56 PM - MPS: CIRA connection closed
12:01:56 PM - MPS: New CIRA connection
12:01:51 PM - MPS: New CIRA connection
12:01:30 PM - MPS: CIRA connection closed
12:01:23 PM - MPS: New CIRA connection
12:01:10 PM - MPS: CIRA websocket closed, mesh//2RKbl1TeIIeRPR5wnS6EuX26CaQkyxVGjNE38Ls3@X0WY0BFhQciXI6evH1EuFRb, node//rXthfJmfQ0Tul6beSioacwgkj7RMxZHo0hlX5W1EUn2TG@mAMdsnfl@INeB5XhRs
12:00:59 PM - MPS: CIRA connection closed
12:00:54 PM - MPS: New CIRA websocket connection
12:00:52 PM - MPS: New CIRA connection
12:00:52 PM - MPS: CIRA websocket closed, mesh//2RKbl1TeIIeRPR5wnS6EuX26CaQkyxVGjNE38Ls3@X0WY0BFhQciXI6evH1EuFRb, node//v3IlDXrjROWF1lCIEEb9v7Z4p$ZhUdwKQxnUXipEc4xCj2iCajGunNeevunIbRSO
12:00:50 PM - MPS: CIRA connection closed
12:00:47 PM - MPS: CIRA connection closed
12:00:43 PM - MPS: New CIRA connection
12:00:40 PM - MPS: New CIRA connection
12:00:31 PM - MPS: CIRA connection closed
12:00:29 PM - MPS: New CIRA websocket connection
12:00:24 PM - MPS: New CIRA connection
12:00:15 PM - MPS: CIRA connection closed
12:00:15 PM - MPS: New CIRA connection
11:59:59 AM - MPS: CIRA connection closed
11:59:52 AM - MPS: New CIRA connection
11:59:44 AM - MPS: CIRA connection closed
11:59:37 AM - MPS: New CIRA connection
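Since the trace lines carry no per-device detail, about all that can be done with them is counting. A rough Python sketch that tallies events by message type, assuming the "time - MPS: message" layout shown above (`tally_mps_events` is a hypothetical helper, not a MeshCentral feature):

```python
import re
from collections import Counter

# Matches lines such as "12:02:07 PM - MPS: CIRA connection closed".
TRACE_LINE = re.compile(r"^(\d{1,2}:\d{2}:\d{2} [AP]M) - MPS: (.+)$")


def tally_mps_events(lines):
    """Count MPS trace events by message type, ignoring lines that do not match."""
    counts = Counter()
    for line in lines:
        match = TRACE_LINE.match(line.strip())
        if not match:
            continue
        message = match.group(2)
        # Drop per-device detail after the first comma so that
        # "CIRA websocket closed, mesh//..., node//..." groups under one key.
        counts[message.split(",", 1)[0]] += 1
    return counts
```

A large and roughly equal number of "New CIRA connection" and "CIRA connection closed" events in a short window suggests connections that open and drop almost immediately rather than staying up.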
I have no idea what the error was in this situation, but I do know that this "error log event" is not helpful for any further troubleshooting.
12:13:40 PM - MPS: New CIRA connection
12:13:24 PM - MPS: CIRA connection closed
12:13:23 PM - MPS: CIRA connection error, [object Object]
12:13:17 PM - MPS: New CIRA connection
I now realise there is another complexity that I left out: I'm using port 50000 for MPS connections, and this port is not part of a proxy but just forwarded straight through, whereas HTTP goes through a reverse proxy with TLS offload enabled.
I have already tested that this works from local and remote locations with the correct output.
MeshCentral MPS server.
Intel® AMT computers should connect here.
I have two stable MPS-connected devices. One is in the same network as the mesh server but still uses the reverse proxy. The other is my laptop, which has wired Ethernet connected and is working slowly over my home connection.
Do I need to increase any timeout options?
@si458 can close
I think I'm on the way to figuring this out.
I have since migrated the self-hosted instance to a cloud provider, which gives me flexibility with ports and improved connection speed and uptime.
I have discovered that when I restart the mesh server there are about 93 iAMT connections that eventually settle down to around 12. I have now discovered that the Intel AMT UUID on 4 models of computers from a couple of manufacturers is the same and not unique: "03000200-0400-0500-0006-000700080009".
I have also found an AMI BIOS tool with a command to make it unique: AMIDEWINx64 /SU Auto
@b8two well, that's an interesting find! I will have to verify this with @Ylianst, because my understanding was that every motherboard should have a unique UUID as its identifier! So the fact that certain motherboards share the same UUID is very worrying!
@si458 I agree it is an issue.
I've just updated the UUID on all the devices (using group upload and run); however, I'm now stuck because the UUID "remembered" by iAMT is still the old static one.
I needed to restart the agent for it to re-read the SMBIOS information after the update. Is there a way (besides powering off and on) for iAMT to do the same?
> smbios
{
  systemInfo: {
    uuid: "71630800-64d2-11ef-9205-942f8b5de38e"
    wakeReason: "Power Switch"
  }
  systemSlots: {
    uuid: "71630800-64d2-11ef-9205-942f8b5de38e"
    wakeReason: "Power Switch"
  }
  amtInfo: {
    AMT: true
    enabled: true
    storageRedirection: true
    serialOverLan: true
    kvm: true
    TXT: false
    VMX: true
    MEBX: "9.0.0.28"
    ManagementEngine: "9.1.41.3024"
  }
}
> amt
{
  core-ver: 1
  Flags: 2
  MeiVersion: "11.7.0.5380"
  Versions: {
    Flash: "9.1.41"
    Netstack: "9.1.41"
    AMTApps: "9.1.41"
    AMT: "9.1.41"
    Sku: "8200"
    VendorID: "8086"
    Build Number: "3024"
    Recovery Version: "9.1.41"
    Recovery Build Num: "3024"
    Legacy Mode: "False"
  }
  UUID: "03000200-0400-0500-0006-000700080009"
  ProvisioningMode: 1
  ProvisioningState: 2
}
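The mismatch in the output above (smbios reports the new UUID while amt still reports the old one) is easy to check mechanically once both values have been collected. A minimal Python sketch, assuming the two strings have been copied out of the agent console by hand; `check_uuids` and its result codes are hypothetical names:

```python
from typing import List

# The shared, non-unique UUID reported in this thread.
SHARED_UUID = "03000200-0400-0500-0006-000700080009"


def check_uuids(smbios_uuid: str, amt_uuid: str) -> List[str]:
    """Return a list of findings for one device; an empty list means no issue found."""
    smbios = smbios_uuid.strip().lower()
    amt = amt_uuid.strip().lower()
    findings = []
    if smbios == SHARED_UUID:
        findings.append("smbios-shared-uuid")
    if amt == SHARED_UUID:
        findings.append("amt-shared-uuid")
    if smbios != amt:
        findings.append("mismatch")
    return findings
```

For the device shown above, this reports both that AMT still holds the shared UUID and that the two values disagree, which is the state described in this comment.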
Also, if you search online for "UUID 03000200-0400-0500-0006-000700080009", you will find that this is a fairly common problem.
Since this is an issue, could MeshCentral provide some way to list the devices whose UUIDs are duplicates?
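Until something like that exists in the product, a duplicate list can be produced outside MeshCentral from any export of device names and UUIDs. The following Python sketch assumes such a list of (name, UUID) pairs has been gathered by other means; `find_duplicate_uuids` and the device names are hypothetical.

```python
from collections import defaultdict


def find_duplicate_uuids(devices):
    """Group (device_name, uuid) pairs by UUID and keep UUIDs shared by 2+ devices."""
    by_uuid = defaultdict(list)
    for name, uuid in devices:
        by_uuid[uuid.strip().lower()].append(name)
    return {uuid: names for uuid, names in by_uuid.items() if len(names) > 1}
```

Each key in the result is a UUID that more than one device reports, and the value lists the affected devices.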
@b8two from my knowledge and learning:
You will need to physically unplug the machines and plug them back in to kick AMT into rebooting.
You could try unprovisioning and reprovisioning?
server console: agentissues has returned about 77 "duplicateAgent", I'm assuming this is tied to the UUID?
server console: agentstats duplicateAgentCount: 77
server console: dupagents There are a few with a "count":1. Is that a single duplicate, and hence at least two devices?
I have also been able to remotely power cycle a couple of iAMT machines after updating the UUID, but this has resulted in fewer iAMT connections overall. In one case, after power cycling the machine, iAMT no longer responds. I'm not sure if the random UUID has now killed iAMT. I retried with a 30-second power relay cut after shutdown, but it did not resolve the issue.
FYI:
Server Console> mps
node//LLSSQFNOQJbIh@JJE9j6c49eZX4UCD1OlssBftG@Eilo8uAyRI4fSo1AreT, CIRA, CIRA, CIRA, CIRA, CIRA, CIRA
node//$WXNaHHgzhIzgbLRg05ed1j7CYarOWmSRk9tON2eLGNQNQnwe$Dcli8JGE, CIRA
^ I cut 5 random characters from each node ID. The device with the 7 CIRA connections appears to randomly select the actual device behind the connection. Could you also use the MAC address of the network card to make the connections unique?
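To spot nodes like this across a long listing, the mps console output can be pasted into a small script. This Python sketch assumes one node per line in the form "node//id, CIRA, CIRA, ...", as in the output above; the function names are hypothetical.

```python
def count_cira_per_node(lines):
    """Return {node_id: number of CIRA entries} for lines like 'node//x, CIRA, CIRA'."""
    counts = {}
    for line in lines:
        parts = [part.strip() for part in line.split(",")]
        if not parts[0].startswith("node//"):
            continue  # skip prompts and anything that is not a node line
        counts[parts[0]] = sum(1 for part in parts[1:] if part == "CIRA")
    return counts


def nodes_with_multiple_cira(lines):
    """Keep only the nodes that hold more than one CIRA connection."""
    return {node: n for node, n in count_cira_per_node(lines).items() if n > 1}
```

Any node that shows up in `nodes_with_multiple_cira` is a candidate for several physical machines sharing one identity.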
What I can see more of in mps Tracing is ECONNRESET
Is there a way to obtain more details in the tracing?
server console: agentissues has returned about 77 "duplicateAgent", I'm assuming this is tied to the UUID?
@b8two
duplicateAgent means that a device was already connected but, for some mad reason, decided to open a whole new connection (maybe the internet dropped out, mad latency, a lost connection, who knows?). The previous connection to that device is then closed, and we just let you know by recording it as a duplicateAgent.
duplicateAgentCount goes up by 1 each time this happens, so it will keep increasing.
dupagents is just a list of duplicate agents over time and their counts.
None of this has anything to do with UUIDs at all.
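The behaviour described above can be modelled in a few lines. This Python sketch is an illustration only and is not MeshCentral's actual implementation; `AgentRegistry` and `FakeConnection` are hypothetical names.

```python
class FakeConnection:
    """Stand-in for an agent connection; only tracks whether it was closed."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class AgentRegistry:
    """Keeps one live connection per node ID and counts replaced connections."""

    def __init__(self):
        self.connections = {}           # node ID -> current connection
        self.duplicate_agent_count = 0  # total replacements, only ever increases
        self.dup_agents = {}            # node ID -> replacements for that node

    def connect(self, node_id, connection):
        previous = self.connections.get(node_id)
        if previous is not None:
            # Same node connected again: close the old connection and record it.
            previous.close()
            self.duplicate_agent_count += 1
            self.dup_agents[node_id] = self.dup_agents.get(node_id, 0) + 1
        self.connections[node_id] = connection
```

Under this model, a "count":1 entry would mean that one node reconnected once while its earlier connection was still registered, not that two distinct devices exist.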
node//LLSSQFNOQJbIh@JJE9j6c49eZX4UCD1OlssBftG@Eilo8uAyRI4fSo1AreT, CIRA, CIRA, CIRA, CIRA, CIRA, CIRA
@b8two this doesn't seem right to me. As you explained, it seems to think that a single device has 6 CIRA connections, when in reality you should only have 1 (maybe 2 if the internet drops for whatever reason).
The way to fix that is, as you explained before, to make the motherboard UUIDs unique. Then I think you are going to have to unprovision AMT and reprovision it, which sadly means a site visit if you want to use ACM, unless you can get meshcmd to activate with ACM.
I have mesh 0.7.40 configured behind a reverse proxy (Apache), and Agent connections are working well. I tried using Apache for MPS connections as well, but I found port forwarding to be the simplest solution.
I have about 77 devices with iAMT mostly in remote locations, and they are in multiple states of configuration / iAMT versions. i.e.
However, I've only seen about 6 nodes with Agent + CIRA connections. 2 of these are machines I use, and the other 4 do not always show "+ CIRA" but are online in the same network. It would be awesome to know what I need to do to resolve the configuration issues with the different nodes.
There are a couple of nodes indicated as "Invalid Credentials". I happened to be on one with this issue and was able to use the known password to configure iAMT items from the BIOS level (including changing the password), yet mesh is still unable to authenticate correctly. An identical machine at the same location works as expected but shows no "+ CIRA", which I assume is due to the common 4G modem connectivity at this location.
I have a console prompt of "CIRA connection for unknown node". It occurs frequently and appears to be only a single node (groupid & uuid always the same). How do I add this unknown node in Mesh, and why is there no tracing option for notifications? (i.e. I can't see it in the GUI, only in the console.)
I'm unsure of the cause but I have multiple entries in mesh for the same machine. It would be nice if this didn't occur, however an alternative would be to have a group action to Merge nodes. Basically I have Device Name & notes I'd like to keep and it is extra work to maintain this each time a node makes a new entry.