Ylianst / MeshCentral

A complete web-based remote monitoring and management web site. Once set up, you can install agents and perform remote desktop sessions to devices on the local network or over the Internet.
https://meshcentral.com
Apache License 2.0

ACM Activation Issues #2948

Open bakerbuds opened 3 years ago

bakerbuds commented 3 years ago

I have a decent number of machines that have trouble activating to ACM. Some have no problem, and some do.

I'm running the latest Server version 0.8.85 on a Windows Server (fresh install after having the same troubles with a server installed on Ubuntu). I have the vPro cert from DigiCert, DHCP option 15, and so on. DNS suffix stuff all seems to pull correctly, etc.

I've been following issue 2805 closely as some of what aztlanla was experiencing seemed similar.

The common symptom seems to be that they get stuck on "Attempting TLS Activation" when looking at amtevents. This is when the group is set to either Fully Automatic or ACM w/out CCM.

If I attempt to run meshcmd and add the device to an AMT only device group with the debug flag, I get the below. I've omitted the first lines that contain all the certificate stuff etc. as I wasn't sure if there was anything there that I shouldn't be posting online. I also replaced the hostname and cert hash below. None of this is Internet facing, so if you say it's safe and want me to include it all, just let me know.

APF: Send protocol version 1 0 26bdff58-bfb2-11eb-99f4-7ad604483000
APF: Send service request auth@amt.intel.com
APF: Service request to auth@amt.intel.com accepted.
APF: Send username password authentication to MPS
APF: User Authentication successful
APF: Send service request pfwd@amt.intel.com
APF: Service request to pfwd@amt.intel.com accepted.
APF: Send tcpip-forward _hostname_:16992
APF: Request to port forward 16992 successful.
APF: Send tcpip-forward _hostname_:16993
APF: Request to port forward 16993 successful.
APF: Start keep alive for every 60000 ms.
APF: JSON_CONTROL
Checking Intel AMT state...
APF: JSON_CONTROL
Performing TLS ACM activation...
APF: JSON_CONTROL
APF: Send JSON control: {"action":"startTlsHostConfig","value":{"status":0,"hash":"CertHash"}}
APF: JSON_CONTROL
Attempting TLS connection...
APF: CHANNEL_OPEN request: {"cmd":90,"chan_type":"forwarded-tcpip","sender_chan":4,"window_size":32768,"target_address":"","target_port":16993,"origin_address":"1.2.3.4","origin_port":1024,"len":55}
APF: Send ChannelOpenConfirmation
APF: Send ChannelData: 15030000020250
Socket ends.
APF: Send ChannelClose
APF: Send keepalive request
APF: Send keepalive request
APF: Send keepalive request
APF: Send keepalive request
APF: Send keepalive request

It just goes on forever like that. MeshCmd AmtInfo gives me this (actual suffix replaced):

PS C:\Program Files\Mesh Agent> .\meshcmd.exe AmtInfo
Intel AMT v14.1.53, in-provisioning state.
Wired Enabled, DHCP, D8:BB:C1:2A:29:38, 10.112.2.66
DNS suffix: correctDNSsuffix.com
Connection Status: Direct, CIRA: Disconnected.

Ylianst commented 3 years ago

'Performing TLS ACM activation...' is a new activation technique, available starting with Intel AMT v14, that is more secure than the old technique. That said, I only have one AMT v14 machine to test with, so maybe this technique does not work well. One thing I can add is a switch to force the server to activate to ACM using the older technique.

Otherwise, everything looks good for you. I don't see any issues with the DNS suffix matching.

bakerbuds commented 3 years ago

Thanks, Ylianst.

Would that be a server side setting, or a switch needed to run with MeshCMD during activation? I only ask because the only time I have been manually running the MeshCMD activations against the AMT Only group is for seeing the debug info when the machines are failing activation against a group using the agent and an AMT Policy.

Ylianst commented 3 years ago

This would be a server side switch. Let me look at this now.

bakerbuds commented 3 years ago

Awesome. Thanks, Ylianst.

Ylianst commented 3 years ago

Publishing v0.8.87 now, should be online in a few minutes. You can add this new option:

{
  "domains": {
    "": {
      "AmtManager": {
        "TlsAcmActivation": false
      }
    }
  }
}

Let me know if that works.

bakerbuds commented 3 years ago

Perhaps I'm doing it wrong? I'm still seeing "Performing TLS ACM activation" when running the AMT Only MeshCMD. I updated the server, updated config.json, and rebooted the server. From my config.json...

  "domains": {
    "": {
      "AmtManager": {
        "TlsAcmActivation": false
      },
      "_siteStyle": 2,
      "title": "MyCompany Meshcentral",
      "title2": "MyServerName",

That's obviously just the top of the "domains" section. I saw that (because I'm essentially using a full config.json and just removing the _ on the parts I'm using) there is already an _amtManager section in there. I have not enabled it (it still has the _). Instead of using that, I added only the above piece, as I didn't need or want to use the other pieces within it.

      "_amtManager": {
        "adminAccounts": [{ "user": "admin", "pass": "MyP@ssw0rd" }],
        "environmentDetection": [ "domain1.com", "domain2.com", "domain3.com", "domain4.com" ],
        "wifiProfiles": [
          {
            "name": "Profile1",
            "ssid": "MyStation1",
            "authentication": "wpa2-psk",
            "encryption": "ccmp-aes",
            "password": "MyP@ssw0rd"
          }
        ]
      },

Is that ok? Here is the --debug when running manual MeshCMD against AMT Only group...

APF: Send protocol version 1 0 26bdff58-bfb2-11eb-99f4-7ad604483000
APF: Send service request auth@amt.intel.com
APF: Service request to auth@amt.intel.com accepted.
APF: Send username password authentication to MPS
APF: User Authentication successful
APF: Send service request pfwd@amt.intel.com
APF: Service request to pfwd@amt.intel.com accepted.
APF: Send tcpip-forward hostname:16992
APF: Request to port forward 16992 successful.
APF: Send tcpip-forward hostname:16993
APF: Request to port forward 16993 successful.
APF: Start keep alive for every 60000 ms.
APF: JSON_CONTROL
Checking Intel AMT state...
APF: JSON_CONTROL
Performing TLS ACM activation...
APF: JSON_CONTROL
APF: Send JSON control: {"action":"startTlsHostConfig","value":{"status":0,"hash":"CertHash"}}
APF: JSON_CONTROL
Attempting TLS connection...
APF: CHANNEL_OPEN request: {"cmd":90,"chan_type":"forwarded-tcpip","sender_chan":4,"window_size":32768,"target_address":"","target_port":16993,"origin_address":"1.2.3.4","origin_port":1024,"len":55}
APF: Send ChannelOpenConfirmation
APF: Send ChannelData: 15030000020250
Socket ends.
APF: Send ChannelClose
APF: Send keepalive request
APF: Send keepalive request
APF: Send keepalive request

When that didn't work, I installed the agent again and ran the below commands...

> amtevents
16:51:00, LMS tunnel start.
16:51:00, Checking Intel AMT state...
16:51:02, Failed to get Intel AMT state.
16:51:02, LMS tunnel closed.
> agentupdate
Downloading update from: https://myserver:443/meshagents?id=4
Download complete. HASH verified.
Updating and restarting agent...
> amtevents
> amtevents
No events.
> amtevents
No events.
> amtevents
No events.
> amtconfig
Started Intel AMT configuration
> amtevents
16:51:40, User LMS tunnel start.
16:51:40, Device group not found (2)
16:51:40, User LMS tunnel closed.

That's the update, I guess. Let me know if perhaps I've done something wrong in my config.json. Thanks, Ylianst.

bakerbuds commented 3 years ago

Ylianst,

I must apologize. On a whim, I went to check and see if there was another server version update this morning, and somehow my version was still on v0.8.85. I know I went in yesterday and updated and even cross-checked the version number I was about to update to with your post, etc. etc. You know, one of those things.

After a good face-palm, I updated, crossed my fingers, and voila,

08:52:27, LMS tunnel start.
08:52:27, Checking Intel AMT state...
08:52:27, Getting ready for ACM activation...
08:52:28, Performing ACM activation...
08:52:30, Succesfully activated in ACM mode, holding 10 seconds...
08:52:53, Intel AMT connected.
08:52:53, Performing clock sync.
08:52:55, Performing Commit()...
08:52:55, Enabled TLS, holding 10 seconds...
08:53:07, Intel AMT connected with TLS.
08:53:08, Added server root certificate.
08:53:09, Created new MPS server.
08:53:09, Created new MPS policy.
08:53:09, Environment detection set.
08:53:10, Cleared user consent requirements.
08:53:10, Changed device name: hostname.CorrectDomain.com
08:53:11, Enabled redirection features.
08:53:11, Enabled KVM.
08:53:11, Done.
08:53:11, LMS tunnel closed.

The only thing that doesn't seem to be working is the H/W Connect. Seems to be greyed out despite the machine being activated in ACM. I'm wondering if I don't fully understand when I should be able to connect via H/W. I thought it was an ACM thing.

Either way, Thank you VERY much for taking a look and implementing that switch. I'm at least over a large hurdle now of even getting the machines activated. I really appreciate it.

bakerbuds commented 3 years ago

Also, I have a couple of machines that seem to only activate to CCM. When I attempt to force ACM, they give this...

09:12:29, LMS tunnel start.
09:12:29, Checking Intel AMT state...
09:12:29, No opportunity for ACM activation, trusted FQDN: CorrectDomain.com
09:12:29, LMS tunnel closed.

If any of this is unrelated and you'd prefer I submit a new issue, let me know. Thanks.

Ylianst commented 3 years ago

It looks like there are 3 issues...

  1. There is certainly a problem with TLS ACM activation on AMT 14+ that I need to investigate.

  2. Looks like the certificate FQDN is not matching in some cases:

09:12:29, No opportunity for ACM activation, trusted FQDN: CorrectDomain.com

In this case, does the CorrectDomain.com match exactly the certificate you have? Capital letters and all? I will put in a change to print out the "CorrectDomain.com" in hex to make sure there are no hidden characters, like spaces at the start or end. Capital letters should not matter, but it's odd that it's not matching your certificate common name.

  3. There is the group not found issue...

16:51:40, Device group not found (2)

I was thinking I may have fixed this, but I am going to put in a change to print out the device group so we can check what device group identifier it's looking for.
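
The hex-printing idea mentioned above for the trusted FQDN can be illustrated with a small Python sketch. This only illustrates the technique and is not the change that went into MeshCentral; the function names `hex_dump` and `describe_hidden_characters` are hypothetical.

```python
def hex_dump(value: str) -> str:
    """Return the UTF-8 bytes of `value` as space-separated hex pairs."""
    return " ".join(f"{byte:02x}" for byte in value.encode("utf-8"))


def describe_hidden_characters(value: str) -> list[str]:
    """List characters that are easy to miss when the string is printed."""
    findings = []
    for index, char in enumerate(value):
        # Anything outside printable ASCII (0x21-0x7e) is suspicious in a DNS name.
        if not 0x21 <= ord(char) <= 0x7E:
            findings.append(f"position {index}: U+{ord(char):04X}")
    return findings


if __name__ == "__main__":
    fqdn = "CorrectDomain.com "  # trailing space, invisible in a normal log line
    print(hex_dump(fqdn))
    print(describe_hidden_characters(fqdn))
```

A trailing space shows up as a final `20` byte in the dump, which a plain log line would hide.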

bakerbuds commented 3 years ago

Thanks, Ylianst.

  1. Thanks, and I'll wait to hear more on this. Could this have to do with the H/W Connect option being greyed out?

  2. Yes. It matches exactly. And it's the same cert that is successfully activating other machines into ACM.

At one point, initially, my vPro cert was issued to mc.mydomain.com, but when I was having these issues, I re-issued the cert from DigiCert to just mydomain.com since that is what my DHCP option 15 hands out throughout the network. Some digging in some of the other GitHub issues led me to believe it should be the way it is now. Let me know if you believe otherwise. Again, though, this is the same server, cert, setup, etc. that is activating other machines successfully into ACM.

  3. Thanks. I will wait to hear on this as well. The device definitely shows up within the correct group on MC and it's using the agent / agent installer from that group. An oddity is that sometimes when they give this error, I can run the agentupdate cmd or switch the group's AMT Profile back and forth from ACM to Fully Automatic and it will stop throwing this error, but then perhaps throw the no opportunity error, or even perhaps just activate into CCM. Little odd.

Thanks again for all the help. I'm always impressed when reading through the issues or checking out the subreddit with how responsive and helpful you are. It's very much appreciated.

Ylianst commented 3 years ago

On the topic of why the connect button is grayed out, I noticed that you're configuring your devices for Client Initiated Remote Access (CIRA):

08:53:09, Created new MPS server.
08:53:09, Created new MPS policy.
08:53:09, Environment detection set.

As a result, Intel AMT will attempt to contact your server on the MPS port which is generally port 4433. Can you test https://mc.mydomain.com:4433 using a browser? It will show an invalid certificate on that port, but that is expected. If you accept the certificate, you will get a message in the browser that this port is for incoming Intel AMT connections.

Make sure this port is not blocked and that Intel AMT hardware network interfaces are connected to the network and can reach that port.

On the topic of certificates, as long as the option 15 matches or is a sub-domain of your AMT activation certificate, that should work. So if option 15 is aa.bb.com and your cert is for aa.bb.com or bb.com, that should be OK. The DNS name of your MeshCentral server does not matter at all; you can name your server abc.com and it would not matter. Also, it's possible for you to configure many Intel AMT activation certificates with different names (aa.com, bb.com...) at the same time. MeshCentral should match and use the correct one each time.
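
The matching rule described above (option 15 equals the certificate name or is a sub-domain of it) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not MeshCentral's actual code: the function names are hypothetical, and the case-insensitive comparison is based only on the remark that capital letters should not matter.

```python
from typing import Iterable, Optional


def suffix_matches_cert(dns_suffix: str, cert_common_name: str) -> bool:
    """True if the DHCP option 15 suffix equals the cert name or is a sub-domain of it."""
    suffix = dns_suffix.lower()
    name = cert_common_name.lower()
    # The leading dot prevents "xbb.com" from matching a cert issued to "bb.com".
    return suffix == name or suffix.endswith("." + name)


def pick_activation_cert(dns_suffix: str, cert_common_names: Iterable[str]) -> Optional[str]:
    """Return the first configured cert name that covers the suffix, or None."""
    for name in cert_common_names:
        if suffix_matches_cert(dns_suffix, name):
            return name
    return None


if __name__ == "__main__":
    print(suffix_matches_cert("aa.bb.com", "bb.com"))          # sub-domain of the cert name
    print(pick_activation_cert("aa.bb.com", ["aa.com", "bb.com"]))
```

Note that the match is one-directional: a cert issued to aa.bb.com does not cover a device whose suffix is just bb.com.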

Let me make a few changes and release a new version shortly. Hopefully that will help.

Ylianst commented 3 years ago

I just published MeshCentral v0.8.88.

You may know this already, but use the amtacm command in the My Server / Console tab to see what ACM certificates are configured and what the common names are for them:

image

If the new output contains sensitive data, you're free to use my contact info (https://www.meshcommander.com/contact-information) if you like.

bakerbuds commented 3 years ago

Thanks, Ylianst.

I'll check this all out as soon as I get back to my desk.

bakerbuds commented 3 years ago

Thanks, Ylianst.

Sent you an email.

Ylianst commented 3 years ago

For anyone looking into this thread: MeshCentral Intel AMT port 4433 uses a very specific certificate to allow Intel AMT to connect on this port. This TLS certificate can't be replaced with a different one. The MeshCentral MPS TLS certificate must be used for incoming connections, and the Intel AMT protocol is not HTTP/HTTPS; it's binary. So, it's not recommended to put a reverse proxy in front of this port. Instead, just TCP route the external port 4433 right to MeshCentral port 4433.
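
A quick way to confirm that port 4433 is reachable as a plain TCP route is a connection test like the Python sketch below. It is only a rough check run from an ordinary machine on the same network, not from the Intel AMT firmware itself, and it does not validate the MPS certificate; the function name `can_reach_tcp` is hypothetical.

```python
import socket


def can_reach_tcp(host: str, port: int = 4433, timeout: float = 5.0) -> bool:
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name through DNS and opens the socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example (host name taken from the thread, replace with your own server):
# print(can_reach_tcp("mc.mydomain.com"))
```

A `False` result here points at routing, firewall, or DNS problems before any certificate question comes into play.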

Ylianst commented 3 years ago

One thing I am looking at now is certificate hash matching...

meshcmd amthashes should list a hash that exactly matches the output of the amtacm command on the server. Either the SHA1 or the SHA256 should match.
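
The comparison can also be done offline with a short Python sketch. It assumes the hash in question is the usual certificate fingerprint, that is, a digest over the DER-encoded root certificate; that assumption, the helper names, and the separator handling are illustrative and not taken from MeshCentral or meshcmd.

```python
import hashlib


def normalize_hash(text: str) -> str:
    """Lowercase a hex hash and drop common separators (':', '-', spaces)."""
    return text.replace(":", "").replace("-", "").replace(" ", "").lower()


def cert_fingerprints(der_bytes: bytes) -> dict[str, str]:
    """Compute SHA1 and SHA256 digests over DER-encoded certificate bytes."""
    return {
        "sha1": hashlib.sha1(der_bytes).hexdigest(),
        "sha256": hashlib.sha256(der_bytes).hexdigest(),
    }


def device_trusts(server_hash: str, device_hashes: list[str]) -> bool:
    """True if the server-side hash appears among the hashes listed on the device."""
    wanted = normalize_hash(server_hash)
    return any(normalize_hash(candidate) == wanted for candidate in device_hashes)
```

Pass only the hex digits (with or without separators) copied from the two outputs; any label text around them should be removed first.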

Ylianst commented 3 years ago

FYI. In the next version of MeshCentral, when you type "amtconfig" in the agent console, it will enable the new Intel AMT configuration live view mode and show you what is going on in real time. No need to type "amtevents" anymore. I've wanted to add that for a while now; it makes things so much easier to debug.

Ylianst commented 3 years ago

Just checked and ACM activation works on my AMT v14.0.45 machine. So, SHA1 and SHA256 cert matching seem to both work.

bakerbuds commented 3 years ago

Ylianst,

  1. Your post about matching the amthashes with the amtacm output made me curious. On one of the machines that will only activate to CCM and continues to give me the "No opportunity for ACM activation" with your new, more detailed "No matching activation certificate for 'mycertdomain.com'" error, I ran the amthashes command and it turns out that the machine is missing the DigiCert Global Roots. On a machine that has activated to ACM successfully, I have:

    DigiCert Global Root CA, (Default, Active)
    DigiCert Global Root G2, (Default, Active)
    DigiCert Global Root G3, (Default, Active)
    DigiCert Trusted Root G4, (Default, Active)

    All with their hashes listed, but the machine that only activates to CCM is missing these. It seems to have all the other authorities' roots, but not DigiCert's. Seen that before?

  2. amtconfig with live updates will be great! Good thinking.

  3. My AMT v14 machines have been activating since you disabled TLS for them. I think I let you know that already via email, but wanted to make sure you were aware.

  4. Most of my machines, even when activated to ACM, still have a greyed out H/W Connect button. Only machines it seems to be working on are the few I have that established CIRA. I'm still trying to find some consistent commonality about these to be able to report to you.

Thanks again for all the work!

Ylianst commented 3 years ago

  1. Ha!!! This is great!!! Well, bad news for activation, but good news that this is solved. I have not been keeping track of what machines have what root certs... I could try to ask the firmware team if they know anything. If that root CA was added recently, it's not going to work for older devices. You could create a USB key and manually add your MeshCentral cert that way if there are not many of these devices. Otherwise, may need to get a cert from a different CA.
  2. Yes, it's nice. Will release that later today.
  3. At some point I will take a look at this again. It's disabled by default now and I will re-enable it when I find the issue.
  4. This is the main problem now. If you go on the Intel AMT machine and use MeshCommander to connect to "127.0.0.1" and have LMS installed (or run "meshcmd microlms") you can go to the "Internet Settings" tab and should see the CIRA configuration. Should look something like this.

image

Try the platform on wired Ethernet connected to the Intel AMT managed port and make sure you can hit your MeshCentral server port 4433 from that device using a browser. Also, your server must have a proper DNS name, not a WINS or NetBIOS name. Intel AMT will make a request to your DNS server to resolve your server's name.

CIRA connectivity problems are difficult to debug because Intel AMT gives you no feedback as to what is going wrong or even when it will try to connect again.

bakerbuds commented 3 years ago

  1. Right!?! It's very strange. It's v12.0.6, so it should have them. I have older machines, even one I checked on v11.8.86, that have them. And according to this https://software.intel.com/sites/manageability/AMT_Implementation_and_Reference_Guide/default.htm?turl=WordDocuments%2Frootcertificatehashes.htm they should be there, unless Lenovo decided to take them away on just this one. VERY strange. I will see about manually configuring or getting a new cert from an authority whose root is present on those devices. For now, I think it is just two machines.

  4. On a machine that is activated in ACM, I connect via MeshCommander (either remotely or locally) and myserver.mydomain.com:4433 is correctly listed. I am also able to browse to that in a browser. Something to note that feels odd is that unless I use InPrivate mode, I'm not able to continue past the certificate warning in the browser; there is no option to. This is using Chromium-based Edge. Remember, I had previously been trying to use a 3rd party cert on the MPS until you informed me not to. At that point, I removed the MPS cert and key, rebooted, and MC recreated the built-in ones. Just wanted to mention it in case you thought I may have screwed something up with all that.

bakerbuds commented 3 years ago

Quick addition to point 4 above.

Ylianst commented 3 years ago

FYI. The next version of MeshCentral will have an improved error message for the case where the DNS suffix matches a certificate, but the root hash does not. I disabled the GoDaddy cert hash in MEBx and was able to test this.

Ylianst commented 3 years ago

??? Not sure about this. Port 4433 uses a certificate that should be untrusted by all browsers. It's a special certificate that is signed by your MeshCentral server's private root cert, and that root cert will be loaded into Intel AMT. So, Intel AMT will connect correctly, but ALL browsers should show a warning. Any situation where connecting to 4433 does not show the right private certificate will cause the CIRA connection to fail. Please make sure all browsers get a warning and see the same privately signed cert on port 4433. Obviously, port 4433 is not intended for browser connections, but I did put in an HTTP response for debugging.

??? If CIRA is configured correctly, the remote Intel AMT ports should be closed and you should not be able to connect MeshCommander locally thru LMS. If you can still connect MeshCommander remotely (from a different machine) then there is a problem. You could be using an Intel AMT device that does not support CIRA... or you have environment detection setup incorrectly.

When you click "Environment Detection", do you see 1 value that looks random? If so, that is normal. If not, you may want to post details in private for debugging.

Ylianst commented 3 years ago

The way Intel AMT environment detection works is that you get to specify 1 to 4 DNS suffixes. When Intel AMT is in a network that matches one of the DNS suffixes, it will consider itself "home" and will only open its remote ports (16992 to 16995) and not perform CIRA. When it's in a network that does NOT match one of the environment detection DNS suffixes, it will close its remote ports and attempt to connect to MeshCentral using CIRA.

So, if you want CIRA, you don't want the DNS suffix in environment detection to match your network. A good way to do this is to configure a random value so that it will never match and CIRA will always be used.

It looks like you may have put your DNS suffix in the "EnvironmentDetection" section of the config.json like this:

      "AmtManager": {
        "EnvironmentDetection": [ "mydomain.com" ]
      }

In that case, CIRA will not trigger. Change "EnvironmentDetection" to "_EnvironmentDetection" to eliminate this section. That will cause MeshCentral to put a random value instead and that should solve the CIRA problem.
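
The home/away decision described above can be modelled with a small Python sketch. It is a simplified model, not Intel AMT firmware logic: the exact, case-insensitive suffix comparison is an assumption, and the function names and the format of the random value are hypothetical.

```python
import secrets


def is_home_network(current_suffix: str, detection_suffixes: list[str]) -> bool:
    """True if the network's DNS suffix matches one of the 1 to 4 configured suffixes."""
    current = current_suffix.lower()
    return any(current == suffix.lower() for suffix in detection_suffixes)


def uses_cira(current_suffix: str, detection_suffixes: list[str]) -> bool:
    """At home: remote ports 16992-16995 stay open and CIRA is not used.
    Away: remote ports are closed and CIRA connects out to the server."""
    return not is_home_network(current_suffix, detection_suffixes)


def random_detection_suffix() -> str:
    """Hypothetical never-matching value, so the device always considers itself away."""
    return secrets.token_hex(16) + ".com"


if __name__ == "__main__":
    print(uses_cira("mydomain.com", ["mydomain.com"]))             # suffix matches: no CIRA
    print(uses_cira("mydomain.com", [random_detection_suffix()]))  # never matches: CIRA
```

With the real network suffix in the list, the device always thinks it is home, which is why CIRA never triggers in that configuration.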

Hope that helps, Ylian

bakerbuds commented 3 years ago

Ylianst,

Thanks for all the help with all of this. For the most part, I think most everything is resolved and I'm there. I will do some thorough digging in another week or so as I will be completely disconnected this upcoming week. A couple of quick notes below, and then when I'm back in a week or so I will try to give a more detailed summary of the issues I was seeing along with the resolutions and/or fixes/findings.

That's it for now, but I will definitely come back and give some more summary about all the rest when I am back in a week or so.

Thank you again, Ylianst, for all your hard work and helping to provide this awesome piece of software to the community. I'm always impressed by your responsiveness and ability to troubleshoot this stuff with the limited information you're given.

noagenda33 commented 2 years ago

Where can I find the step by step instructions for importing a vPro cert? I'm totally lost. Thank you.

noagenda33 commented 2 years ago

I just tried activating an AMT machine without any certs and it still works. Do you even need a cert?