opensearch-project / anomaly-detection-dashboards-plugin

Manage your detectors and identify atypical data in OpenSearch Dashboards
https://opensearch.org/docs/latest/monitoring-plugins/ad/index/
Apache License 2.0
27 stars 55 forks source link

[BUG] Unable to edit existing Anomality Detectors after changing Authentication to SAML #438

Open lehmat opened 1 year ago

lehmat commented 1 year ago

Describe the bug Changing authentication from internal users to SAML authentication causes existing (created with internal user, that is now deleted) anomality detectors to be not editable with administrator access.

Details:

PUT /_dashboards/api/alerting/monitors/id-of-detector?ifSeqNo=1&ifPrimaryTerm=59 -> Response status code 200 -> Response content: {"ok":false,"resp":"[alerting_exception] User has no available detectors"}

To Reproduce Steps to reproduce the behavior:

  1. Change authentication method to SAML (Azure in this case), Log in with new user with pre-existing anomality detectors
  2. Click on Anomaly detection on sidebar
  3. Click on Detectors
  4. Click on one of Detectors created
  5. Anomality overview click on Edit alert settings
  6. Click on Edit
  7. Do not make any changes and click on Update at the bottom

Expected behavior Should be able to successfully edit

Plugins SAML Authentication from Opensearch AWS console enabled

Host/Environment (please complete the following information):

kavilla commented 1 year ago

Issue moved to opensearch-project/security-dashboards-plugin opensearch-project/OpenSearch-Dashboards#1377 via Zenhub

peternied commented 1 year ago

@lehmat We would need a more information to know if there is a bug. AD uses the username to populate the ownership field on the detector, if the internal name != the saml name this would be expected behavior.

If they are the same, then we would need a copy of the detector document and GET _plugins/_security/api/account result to understand the nature of the bug.

lehmat commented 1 year ago

Hi,

I can tell this without querying the API. The internal username used to be shortname (for example lehmat, which was used to create the detectors) and AD is an email address, which is used to try to edit the detector

Is there any steps I could take to manually fix them or is recreating them only option?

peternied commented 1 year ago

I'm going to transfer this issue to the repo of the plugin they might be able to give you advice on how to make the change.

amitgalitz commented 1 year ago

Hi @lehmat,

I just wanted to clarify more regarding this. I see you mentioned in the “Details” section that you get an error from alerting that the user has no available detectors. You also mention that you can’t edit your current detectors. Are these two different errors you are experiencing? I wasn’t clear if you aren’t able to see the detectors at all or you aren’t able to edit them.



Furthermore in terms of detector ownership. The way we declare which user can access what detectors is through backend role filtering. Enabling SAML could change the backend roles applied to the user in this case. Two different users that have full access to detectors with the same backend role should be able to view detectors created by the other.

lehmat commented 1 year ago

Hi @amitgalitz ,

The "Details" section gives the technical code that I get trying to edit a detector that is visible in the UI. I can see same detectors as with internal user, but all actions against them causes the "Details" sections error (I looked it up in network tab under developer tools)

About the backend roles - is there an api endpoint that I could call to see which backend role is assigned to specific detector? I can report back when I know which role to add. My gut feeling is that they do not have same roles (cant check from UI since internal user is long gone)

amitgalitz commented 1 year ago

To get all the information related to the user backend role you can use the get detector API for that specific detector (this gets all detector info including the ownership related information): GET _plugins/_anomaly_detection/detectors/<detector-id>

To do a search on all detectors you can use GET _plugins/_anomaly_detection/detectors/_search with a body including a search query like:

{
  "query": {
    "match_all": {}
  }
}
lehmat commented 1 year ago

Added the roles I could find and no change.

However, I realized this is not about detectors. I managed to edit the detector, but not the alert associated with it. (Look at steps to reproduce: Edit alert settings and "Details" API endpoint). So the topic of the issue is misleading, it is error editing alert that is created through detector.

amitgalitz commented 1 year ago

That's a good point, I missed that you went to edit alert settings and this is where you got the issue. In that case it might be a similar logic but with the alerting monitor itself. Once you created an alerting monitor the logic regarding ownership of that monitor is separate from the detector itself. The monitor simply sits on top of the detector and scans it for relative to the alerting logic configured.

I believe alerting has similar backend role logic. I will reach out for confirmation on this and steps to get monitor backend role if that is indeed issue.

eirsep commented 1 year ago

@lehmat are you able to edit the alerting settings as Admin user after converting to SAML?

lehmat commented 1 year ago

My user that I have been using is administrator and can access every other parts of the software. Or are you referring to the opensearch-master that is left as only internal user?

lezzago commented 1 year ago

@lehmat, do the user have matching backend roles as the AD detector? If not, it will fail even as admin.

@amitgalitz, there needs to be a code change here to allow admins to access all detectors. Additionally changes might also be needed here to take care of the admin case.

lehmat commented 1 year ago

@lehmat, do the user have matching backend roles as the AD detector? If not, it will fail even as admin.

@lezzago I looked at the endpoints given in this ticket and looked up the roles in the response. Added them to my user and retried the operation - no change.

Is there any PUT commands I could do to fix them pointing at my new user? The code changes to enable admin to edit them seems legit, but given AWS is lagging already multiple minor versions behind - I should not wait for the fix to be available

amitgalitz commented 1 year ago

@lehmat Just to reiterate are you able to edit the detectors on there own with no problem? (Like changing the detector settings themselves not alerting)? And also you are able to edit a monitor not related to AD from the alerting plugin?

Problem is only when you try to edit alerting from the detector details page? @lezzago From my understanding if the user backend role match for the detector and they can even access the detector details page itself then you are saying the issue is when we are making this call we aren't getting the detectors for some reason? If we are using the same backend roles and user info, I am not really understanding why its suddenly failing.

lezzago commented 1 year ago

@amitgalitz I am assuming here that AD has logic to allow admins to see all detectors regardless of their backend role. If that is the case, the user might not have the same backend roles and the pieces of code I shared is causing that problem.

amitgalitz commented 1 year ago

@lezzago we actually prioritize backend role over admin access in AD. If security is enabled and filter by backend role is enabled we filter by backend role right away. Here is where we check this and then here we add it to the search request.

amitgalitz commented 1 year ago

We might actually want to fix it on AD and alerting so we allow admin access to everything no matter the backend roles. I am still a little unsure though how we can can access the Anomaly Detector but at the same time when trying to edit the monitor we get the "no detectors found error".

amitgalitz commented 1 year ago

Hi @lehmat could you possible paste the backend roles of the monitor, the detector and the user. Can obviously be redacted but just so we have a better understanding of what the situation is to better understand if its possible to have an API workaround.

lehmat commented 1 year ago

@amitgalitz Sure, but I might need help to get what you exactly need. Here is the ones I could think of being useful API call:

GET _dashboards/api/alerting/detectors/sZofY4UB_zoY8BXQij1K

{
    "ok": true,
    "detector": {
        "name": "Cloudtrail",
        "description": "",
        "time_field": "@timestamp",
        "indices": ["log-aws-cloudtrail*"],
        "filter_query": { "match_all": { "boost": 1 } },
        "detection_interval": { "period": { "interval": 5, "unit": "Minutes" } },
        "window_delay": { "period": { "interval": 1, "unit": "Minutes" } },
        "shingle_size": 8,
        "schema_version": 0,
        "feature_attributes": [{ "feature_id": "sJofY4UB_zoY8BXQiT1K", "feature_name": "Region", "feature_enabled": true, "aggregation_query": { "region": { "value_count": { "field": "errorCode.keyword" } } } }],
        "ui_metadata": { "features": { "Region": { "aggregationBy": "value_count", "aggregationOf": "errorCode.keyword", "featureType": "simple_aggs" } }, "filters": [] },
        "last_update_time": 1673445398914,
        "user": { "name": "<previous-localuser-that-is-deleted>", "backend_roles": [], "roles": ["security_manager", "all_access"], "custom_attribute_names": [], "user_requested_tenant": "Security" },
        "detector_type": "SINGLE_ENTITY"
    },
    "version": 4,
    "seqNo": 3,
    "primaryTerm": 3
}

AND trying to edit this one:

PUT _dashboards/api/alerting/monitors/sZofY4UB_zoY8BXQij1K?ifSeqNo=1&ifPrimaryTerm=59

{"ok":false,"resp":"[alerting_exception] User has no available detectors"}

My current user info:

GET _dashboards/api/v1/auth/authinfo

{
    "user": "User [name=ad.user@company.com, backend_roles=[c31a6137-01f3-4a76-bd2b-f18a956f654d], requestedTenant=Security]",
    "user_name": "ad.user@company.com",
    "user_requested_tenant": "Security",
    "remote_address": "127.0.0.1:33582",
    "backend_roles": ["c31a6137-01f3-4a76-bd2b-f18a956f654d"],
    "custom_attribute_names": ["attr.jwt.roles", "attr.jwt.sub", "attr.jwt.saml_nif", "attr.jwt.nbf", "attr.jwt.exp", "attr.jwt.saml_si"],
    "roles": ["security_manager", "all_access"],
    "tenants": { "Sales": true, "Security": true, "global_tenant": true, "ad.user@company.com": true },
    "principal": null,
    "peer_certificates": "0",
    "sso_logout_url": "https://login.microsoftonline.com/xxxxxx/saml2?SAMLRequest=........"
}
lezzago commented 1 year ago

@lehmat, do you have this setting enabled: plugins.anomaly_detection.filter_by_backend_roles?

@amitgalitz, this code does not check if the user has the backend role setting enabled for AD. In here, Alerting should call AD if it can access that detector though this might cause more dependency.

lehmat commented 1 year ago

@lehmat, do you have this setting enabled: plugins.anomaly_detection.filter_by_backend_roles?

How do I check this? I have not installed it myself, rather using AWS managed service

lezzago commented 1 year ago

@lehmat, do you have this setting enabled: plugins.anomaly_detection.filter_by_backend_roles?

How do I check this? I have not installed it myself, rather using AWS managed service

You can run this in dev tools: GET _cluster/settings?include_defaults=true

lehmat commented 1 year ago

It is false

amitgalitz commented 1 year ago

@lehmat It seems like the issue here has two parts that we will work on fixing.

  1. In alerting code we don't check if backend role filtering is enabled for detectors.
  2. We don't prioritize admin access over non-matching backend roles.

However for an immediate fix we have 3 options:

  1. If you are a managed service user on AWS you can create a ticket there and we can help change the detector configurations to have a backend role matching the one your user has.
  2. If this is your own cluster, you should have super admin access and you can change the anomaly-detector config index to add the backend role
  3. You can delete the detectors and create new ones with the wanted user.
amitgalitz commented 1 year ago

@amitgalitz, this code does not check if the user has the backend role setting enabled for AD. In here, Alerting should call AD if it can access that detector though this might cause more dependency.

@lezzago that is a good find, I'll make sure to fix this as well as prioritizing admin access in both AD and alerting over backend role.

lehmat commented 1 year ago

I think the option 3. Is the fastest since there is only 5 detectors that are simple. It only takes 30minutes to redo them.

It is good that the underlying issue is found and a fix is being prepared. If you need any of my help be sure to ping and I'll do my best to answer it.