Closed lukeheath closed 1 year ago
@lukeheath I think this one might require extra thoughts/specs, looking at the docs:
An HTTP 400 Bad Request error indicates one of the following:
- Unsupported OAuth parameters
- Unsupported signature method
- Missing required authorization parameter
- Duplicated OAuth protocol parameter
An HTTP 401 Unauthorized error indicates one of the following:
- Invalid consumer key
- Invalid or expired token
- Invalid signature
- Invalid or already-used nonce
An HTTP 403 Forbidden error indicates one of the following:
- The MDM server, or the MDM server's consumer key/token does not have access to perform the specific request. In this case, the request body contains ACCESS_DENIED.
- The organization has not accepted the latest terms and conditions of the program. In this case, the request body contains T_C_NOT_SIGNED.
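As a rough illustration of the categories quoted above, here is a minimal Go sketch that maps a status code and response body from Apple's API to one of those groups. The type `depErrorKind`, its constants, and `classifyDEPError` are hypothetical names made up for this example; they are not part of Fleet or of Apple's API.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// depErrorKind is a hypothetical label for this sketch; it is not a type
// from Fleet or from Apple's API.
type depErrorKind string

const (
	kindBadRequest     depErrorKind = "malformed OAuth request"
	kindUnauthorized   depErrorKind = "invalid key, token, signature, or nonce"
	kindAccessDenied   depErrorKind = "access denied"
	kindTermsNotSigned depErrorKind = "terms and conditions not signed"
	kindOther          depErrorKind = "other"
)

// classifyDEPError maps a status code and response body to one of the
// categories listed in the quoted docs.
func classifyDEPError(status int, body string) depErrorKind {
	switch status {
	case http.StatusBadRequest:
		return kindBadRequest
	case http.StatusUnauthorized:
		return kindUnauthorized
	case http.StatusForbidden:
		// Per the docs, the body distinguishes the two 403 cases.
		switch {
		case strings.Contains(body, "T_C_NOT_SIGNED"):
			return kindTermsNotSigned
		case strings.Contains(body, "ACCESS_DENIED"):
			return kindAccessDenied
		}
	}
	// Anything not covered by the quoted docs.
	return kindOther
}

func main() {
	fmt.Println(classifyDEPError(http.StatusForbidden, "T_C_NOT_SIGNED"))
	fmt.Println(classifyDEPError(http.StatusUnauthorized, ""))
}
```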
In light of that:
- 403s are at least partially handled because of https://github.com/fleetdm/fleet/issues/8537
- 400 might happen because the Fleet server is actually doing something wrong
- For the cases where it is the user's fault, should we display a banner like we do for 403s? Otherwise, without an alarm being raised (a 500 error) and a user message, the issue will fly under the radar.
@roperzh Thanks for the info! Yes, we'll need to determine how to surface this to the user since this API interaction is driven by an environment variable.
@noahtalerman would you please take a look and let us know how we should notify the user in this case? Thanks!
Hey @lukeheath I think I would expect the Fleet server to fail to start and show an error if the ABM key/cert is invalid.
cc @mna
Good idea! @mna Let's go with validating the key/certs on start and failing to start the server if they are invalid.
@noahtalerman @lukeheath We had discussed doing it at startup when we first implemented it, but there were some concerns (in particular, extra time required at startup before the Fleet instances report as "healthy", and the possibility of the Apple API being down, preventing startup). The discussion at that time: https://github.com/fleetdm/fleet/issues/8725#issuecomment-1332294940
A few options I can think of:
fleet serve initialization logic, though.
Even if we do run the validation at startup, there's still the risk that Apple's API could be down at that moment and that could prevent starting the instance (and there's the off-chance that Apple invalidates a cert at any point for some reason). Of course this is not something that should be super common, but even at 5 nines availability (which is unlikely to be that high), that's about a minute every 2 months. We could differentiate between an "invalid cert" API response and an unreachable/Apple's server error, and keep going if Apple is unreachable (instead of preventing startup), but we could then find out later on that the cert is actually invalid.
Being able to handle that dynamically (i.e. at runtime, not at startup) is more robust and flexible, would handle all curveballs that a third party API could throw at us, but is probably a bit more involved (though not that much, as mentioned we have similar logic for the expiration). A step further would be to design a "notification center" for all such "background failures" that should bubble up to the user, but that's of course a whole new feature and quite a bit more work.
That being said, for the api/latest/fleet/mdm/apple_bm endpoint itself, we can definitely handle an error from the call to Apple's API and return a 4xx instead of 500 when we can detect that it's due to the cert configuration.
@roperzh @lukeheath
For the cases where it is the user's fault, should we display a banner like we do for 403s? Otherwise, without an alarm being raised (a 500 error) and a user message, the issue will fly under the radar.
Wouldn't the frontend display the usual "request failed" error with the details from the response's payload if the server endpoint returned a 400? If not would it be a big change to make it do that? My understanding is that at the moment, the error is not displayed to the user (and only visible via the browser's developer tools) due to being a 500.
@noahtalerman @lukeheath A recap on this ticket and the remaining decisions to take now that I created the PR to change the status code of that API endpoint to a 400 if the ABM certificate/token is invalid:
1. Something like that for the error message: "The Apple Business Manager certificate or server token is invalid. Restart Fleet with a valid certificate and token. See https://fleetdm.com/docs/using-fleet/mdm-setup#apple-business-manager-abm for help."
2. In addition to step 1, keeping in mind the concerns about delaying startup of Fleet, we could validate the cert/token with Apple's API. Note that we already validate that the token can be decrypted with the certificate, and that it is not expired, so it is unlikely that the provided cert/token are invalid! The most likely reason that it would be invalid is if the user downloaded a new ABM token but provided the old one (Apple invalidates the old one when a new one is downloaded). Another possibility is that the token expired since startup. Both of those cases can happen after startup and as such, validating this at startup is not sufficient.
3. In addition to step 1 and independently of whether we implement step 2, we could also display the banner whenever we detect that the certificate has become invalid. This adds the benefit that the user doesn't have to navigate to the "Settings->Integrations->Automatic enrollment" page to see that the cert/token is invalid.
The PR currently implements step 1 only (and still requires frontend changes and adding the user-friendly message).
@mna Thanks for the update!
@georgekarrv @noahtalerman Would y'all please sync up with Martin on this today and make sure we're aligned on how to implement a fix? Thanks!
We discussed this during standup, the decision is that:
Once https://github.com/fleetdm/fleet/pull/11899 gets merged this ticket can be closed (after QA of course).
@mna I believe I'm sending invalid files used for authentication in order to test this fix, but I'm still receiving a 200 status result.
- Set the --mdm_apple_bm_cert and --mdm_apple_bm_key values on Fleet startup to contain modified versions of the correct files. To do this, I opened the files in vim and modified some of the characters in the key and crt values.
- Called the api/latest/fleet/mdm/apple_bm endpoint with Postman.
In both cases I'm receiving a 200 response. It seems as if I am modifying the wrong file, or I need to trigger an update to Apple BM with the invalid keys?
@xpkoala this is a bit more involved because if you just provide invalid cert/key or token files, it will be validated at startup as invalid (e.g. cannot be parsed as a valid cert/key or cannot decrypt the token) and Fleet should not even start. This is what I get when I modify my certificate by changing random chars:
Failed to start: validate Apple BM token, certificate and key: Apple BM configuration: parse key pair: x509: malformed signature
and if I change the token with invalid chars:
Failed to start: validate Apple BM token, certificate and key: Apple BM configuration: decrypt token: ber2der: BER tag length too long
So this technique cannot be used to verify that the endpoint does return 400 (you shouldn't have been able to even start Fleet with those invalid files - at least I'm not).
The only way I can think of, as the files have to be otherwise valid to decrypt the token and the token must be valid JSON too, is to start Fleet with valid files, then go to Apple's Business Manager website and download a new token file:
Making a request to the apple_bm endpoint should result in this (I tested by running fleetctl get mdm-apple-bm, which calls this endpoint):
could not get Apple BM information: GET /api/latest/fleet/mdm/apple_bm received status 400 Bad request: Get "https://mdmenrollment.apple.com/account": DEP auth error: 400 Bad Request: oauth_problem_adviceBad Request
@xpkoala let me know if you need any help/clarification for that. Happy to jump on a call too.
Invalid key, error, Nature's calm teaches right path, Fleet finds solutions.
Fleet version (head to the "My account" page in the Fleet UI or run fleetctl --version): main: 4fdf640
Operating system (e.g. macOS 11.2.3): macOS
Web browser (e.g. Chrome 88.0.4324): Chrome
🧑💻 Expected behavior
I expect to receive a 400 Bad Request
💥 Actual behavior
I receive a 500 Server Error with the following payload:
👣 Reproduction steps
/fleet/mdm/apple_bm
More info