elastic / uptime

This project includes resources and general issue tracking for the Elastic Uptime solution

Private monitor management locations MVP #475

Closed andrewvc closed 2 years ago

andrewvc commented 2 years ago

This takes over for https://github.com/elastic/uptime/issues/441 which has grown long and unwieldy.

This issue defines ACs for an initial implementation of private locations for monitor management in the Uptime app. The locations are to be implemented atop Fleet, continuing the work in @shahzad31's POC.

ACs

Design

image

Flyout

Flyout is 540px wide.

When private locations exist image

When no locations exist image

Add location form image

Condition when no agent policies exist image

Uptime

Monitor Management

By using this feature, Customer acknowledges that it has read and agrees to Elastic's Beta Release Terms https://www.elastic.co/agreements/beta-release-terms 

There is no cost for the use of the service to execute your tests during the beta period. A fair usage policy will apply. Normal data storage costs apply for test results stored in your Elastic cluster.

This should be added to the following screen, below the existing text

mm-stage2

Fleet UI

dominiqueclarke commented 2 years ago

For users entering through the Elastic Synthetics page, should we disable the ability to add integrations directly and navigate them to MM?

Run once functionality will be disabled for monitors that only have a private location assigned. (Either hide the component, or gray out with tooltip explaining reason)

Test now functionality should also be disabled.

hbharding commented 2 years ago

Update

I added some designs for the Private location flyout to the issue description based on a session with @shahzad31 and @dominiqueclarke. I made these pretty quickly based on Shahzad's work. I'm hoping these are relatively small improvements we can get in, but I certainly don't want to block this feature from getting into the next release. Let me know if there are any areas we can simplify.

Other feedback


image

Remove the map marker icon and just say "Private locations".


image

Change text to "This policy is managed externally", or even better / if possible, "This policy can be managed in the Uptime application". Also, use the tooltip's content prop rather than the title prop.


image

I understand we can only control the bottom portion of this page, but as is, this is pretty confusing and it feels like there is a lot of unnecessary content on the page. If possible, can we use a single callout with a link to Uptime, and not show any of the form inputs since they are not editable?

dominiqueclarke commented 2 years ago

Setup

Testing this feature requires Fleet server and elastic-agent-complete running in Docker.

You can set up Fleet server and elastic-agent either manually or automatically using elastic-package. Both methods require a bit of upfront investment.

Via elastic-package

  1. Run elastic-package stack up --version 8.4.0-SNAPSHOT -v
  2. Once the stack is up, stop kibana using docker desktop management
  3. Add the following to your local kibana.dev.yml
    elasticsearch:
      hosts: ["https://localhost:9200"]
      username: "kibana_system"
      password: "changeme"
    xpack.fleet.registryUrl: https://localhost:8080
  4. Run curl -k -u elastic:changeme -X POST "https://localhost:9200/_security/user/kibana_system/_password?pretty" -H 'Content-Type: application/json' -d' { "password" : "changeme" } '
  5. Run eval "$(elastic-package stack shellinit)"
  6. Run export NODE_EXTRA_CA_CERTS=${ELASTIC_PACKAGE_CA_CERT}.
  7. Start your local Kibana. Fleet server and agent should now be configured correctly. You can confirm by visiting the fleet page

Via manual setup

  1. In Kibana, navigate to the Fleet page and follow the directions to add Fleet server
  2. Create a new agent policy from fleet
  3. Under the enrollment tab, generate a new enrollment token for the new policy you created
  4. Enroll elastic-agent in that policy. Run docker run --env FLEET_ENROLL=1 --env FLEET_INSECURE=1 --env FLEET_ENROLLMENT_TOKEN=<TOKEN> --env FLEET_URL=<URL> docker.elastic.co/beats/elastic-agent-complete:8.4.0-SNAPSHOT -d "*" -e
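
The enrollment command in step 4 can be assembled programmatically, for example when scripting the test setup. The following is a minimal sketch: `buildEnrollArgs` and `EnrollmentOptions` are hypothetical names introduced here, and only the environment variables, image name, and trailing flags are taken from the command above.

```typescript
// Hypothetical helper (not part of any Elastic tooling): builds the argument
// list for the `docker run` command shown in step 4 of the manual setup.
interface EnrollmentOptions {
  enrollmentToken: string; // token generated under the policy's enrollment tab
  fleetUrl: string; // URL of your Fleet server
  imageTag?: string; // defaults to the snapshot version used in this thread
}

function buildEnrollArgs(opts: EnrollmentOptions): string[] {
  const tag = opts.imageTag !== undefined ? opts.imageTag : "8.4.0-SNAPSHOT";
  return [
    "run",
    "--env", "FLEET_ENROLL=1",
    "--env", "FLEET_INSECURE=1",
    "--env", `FLEET_ENROLLMENT_TOKEN=${opts.enrollmentToken}`,
    "--env", `FLEET_URL=${opts.fleetUrl}`,
    `docker.elastic.co/beats/elastic-agent-complete:${tag}`,
    // Flags passed through to the agent, exactly as in the original command.
    "-d", "*", "-e",
  ];
}
```

Passing the result as an argument array to a process spawner (rather than joining it into one shell string) avoids the shell expanding the `"*"` value.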

Testing

Note: All testing should be done on the 8.4 branch of Kibana, or, where appropriate, against an 8.4.0-SNAPSHOT in cloud first test

This feature has diverging UX depending on the user's permissions. To test all code paths, you will need two users with two different roles. Create two users with the following permissions:

Fleet user
  Kibana permission: Fleet, all
  Kibana permission: Integrations, all
  Kibana permission: Uptime, all
  Cluster permission: manage_own_api_keys
  Index permission: synthetics-*, all

Uptime user
  Kibana permission: Uptime, all
  Cluster permission: manage_own_api_keys
  Index permission: synthetics-*, all
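
The two test users can be pictured as plain data, which makes the difference between them explicit. This is only an illustrative sketch: the `TestUser` shape and `canManagePrivateLocations` are hypothetical and do not mirror Kibana's role API, and the rule that Fleet, Integrations, and Uptime must all be "all" is an assumption inferred from the permission lists above.

```typescript
// Illustrative model of the two test users; not Kibana's actual role schema.
type Privilege = "all" | "read" | "none";

interface TestUser {
  name: string;
  kibana: { fleet: Privilege; integrations: Privilege; uptime: Privilege };
  cluster: string[];
  indices: Record<string, Privilege>;
}

const fleetUser: TestUser = {
  name: "Fleet user",
  kibana: { fleet: "all", integrations: "all", uptime: "all" },
  cluster: ["manage_own_api_keys"],
  indices: { "synthetics-*": "all" },
};

const uptimeUser: TestUser = {
  name: "Uptime user",
  kibana: { fleet: "none", integrations: "none", uptime: "all" },
  cluster: ["manage_own_api_keys"],
  indices: { "synthetics-*": "all" },
};

// Assumption: managing private locations needs Fleet, Integrations and Uptime "all".
function canManagePrivateLocations(user: TestUser): boolean {
  const k = user.kibana;
  return k.fleet === "all" && k.integrations === "all" && k.uptime === "all";
}
```

Under this model the Fleet user can manage private locations and the Uptime user cannot, which is the split the test scenarios below exercise.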

Fleet user - Cloud

Setup:

  1. Add the following keys to Kibana.yml. This will cause Kibana to run under a cloud context: https://p.elstc.co/paste/4evUZg9W#RQNLl3iVTsPUVpfPlhH8U4CWM007ONUcgIr+VXWm9Aa

When | I first visit Monitor Management | I see an enabled prompt with legal text implying that my use means acceptance of T&C
When | I first enable Monitor Management | I see an empty monitor list, the Add Monitor button is enabled, and the Private Locations button is visible
When | I click on the Private Locations button | A flyout appears with a prompt to add my first location
When | I click add location, if I have existing agent policies | I am prompted to add the location name and select an agent
When | I click add location, if I don't have existing agent policies | I see a prompt directing me to add an agent by redirecting to Fleet
When | I add a location | It's added to the locations list; I cannot edit it
When | I click Add monitor | I see the newly added location in the locations list; it has a Private label
When | I have a monitor configured for a private location | I cannot delete that private location until all monitors for that location are removed
When | I have no monitors for a given location | I can delete that private location
When | I save a monitor with a private location | The data should appear in the Uptime overview list, and the location for the check should be marked with the correct location name
When | I visit the associated Agent Policy page in Fleet for a given location | Any integrations made from private locations cannot be deleted
When | I visit the associated Agent Policy page in Fleet for a given location | When I click edit integration, I'm redirected to a read-only view where I can navigate to monitor management
When | I create monitors with the same name in different spaces | I do not get any errors
When | I delete an agent policy tied to a private location | My private location displays as invalid; I can edit existing monitors to remove the invalid location
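
The deletion rule in the scenarios above (a private location can be deleted only once no monitor references it) can be sketched as a small guard. This is not the Kibana implementation; `Monitor`, `monitorsUsingLocation`, and `canDeleteLocation` are hypothetical names used purely for illustration.

```typescript
// Hypothetical data shape: a monitor and the private locations assigned to it.
interface Monitor {
  id: string;
  name: string;
  privateLocationIds: string[];
}

// Returns every monitor that still references the given private location.
function monitorsUsingLocation(locationId: string, monitors: Monitor[]): Monitor[] {
  return monitors.filter((m) => m.privateLocationIds.indexOf(locationId) !== -1);
}

// A location is deletable only when no monitor references it.
function canDeleteLocation(locationId: string, monitors: Monitor[]): boolean {
  return monitorsUsingLocation(locationId, monitors).length === 0;
}
```

For the guard to hold, the check has to run against the current monitor list at the moment of deletion; evaluating it only against a stale, client-side copy leaves room for the location to be removed while monitors still point at it.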

Uptime only user - Cloud

Setup:

  1. Add the following keys to Kibana.yml. This will cause Kibana to run under a cloud context: https://p.elstc.co/paste/4evUZg9W#RQNLl3iVTsPUVpfPlhH8U4CWM007ONUcgIr+VXWm9Aa

Prerequisites

When | I click on the Private Locations button | A permissions disclaimer is displayed; I cannot add or delete private locations
When | I click Add monitor | I see public and private locations; private locations are disabled
When | I visit the monitor list | The edit and delete buttons are disabled for monitors with private locations; the edit and delete buttons are enabled for monitors with public locations
When | I click the api key button | I see a prompt notifying me that my permissions are insufficient to use private locations; I can still create an api key

Fleet user - On Prem

Setup

  1. REMOVE the following keys from Kibana.yml. This will cause Kibana to run under an on-prem context. Ensure no other xpack.uptime.service keys are defined: https://p.elstc.co/paste/4evUZg9W#RQNLl3iVTsPUVpfPlhH8U4CWM007ONUcgIr+VXWm9Aa

When | I first visit Monitor Management | I see an enabled prompt with legal text implying that my use means acceptance of T&C
When | I first enable Monitor Management | I see a prompt directing me to create my first location
When | I don't have any private locations | The add monitor button is disabled
When | I click on the Private Locations button | A flyout appears with a prompt to add my first location
When | I click add location, if I have existing agent policies | I am prompted to add the location name and select an agent
When | I click add location, if I don't have existing agent policies | I see a prompt directing me to add an agent by redirecting to Fleet
When | I add a location | It's added to the locations list; I cannot edit it
When | I click Add monitor | I see the newly added location in the locations list; it has a Private label
When | I have a monitor configured for a private location | I cannot delete that private location until all monitors for that location are removed
When | I have no monitors for a given location | I can delete that private location
When | I save a monitor with a private location | The data should appear in the Uptime overview list, and the location for the check should be marked with the correct location name
When | I visit the associated Agent Policy page in Fleet for a given location | Any integrations made from private locations cannot be deleted
When | I visit the associated Agent Policy page in Fleet for a given location | When I click edit integration, I'm redirected to a read-only view where I can navigate to monitor management
When | I create monitors with the same name in different spaces | I do not get any errors
When | I delete an agent policy tied to a private location | My private location displays as invalid; I can edit existing monitors to remove the invalid location
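
The last scenario, where deleting an agent policy leaves its private location marked as invalid, amounts to checking each location's policy reference against the policies that still exist. A rough sketch, assuming a location stores the ID of its agent policy; `PrivateLocation` and `findInvalidLocations` are hypothetical names, not Kibana code.

```typescript
// Hypothetical shape: each private location points at one Fleet agent policy.
interface PrivateLocation {
  id: string;
  label: string;
  agentPolicyId: string;
}

// A location is treated as invalid when its agent policy no longer exists.
function findInvalidLocations(
  locations: PrivateLocation[],
  existingPolicyIds: string[]
): PrivateLocation[] {
  return locations.filter((loc) => existingPolicyIds.indexOf(loc.agentPolicyId) === -1);
}
```

Monitors that still reference one of the returned locations are the ones a tester would then edit to remove the invalid location.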

Uptime only user - On Prem

Setup

  1. REMOVE the following keys from Kibana.yml. This will cause Kibana to run under an on-prem context. Ensure no other xpack.uptime.service keys are defined: https://p.elstc.co/paste/4evUZg9W#RQNLl3iVTsPUVpfPlhH8U4CWM007ONUcgIr+VXWm9Aa

Prerequisites

When | I click on the Private Locations button | A permissions disclaimer is displayed; I cannot add or delete private locations
When | I visit the monitor list | The Add Monitor button is disabled
When | I visit the monitor list | The edit and delete buttons are disabled
When | I click the api key button | I see a prompt notifying me that my permissions are insufficient to use private locations; I can still create an api key
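
Across the four user/context combinations above, the expected state of the Add Monitor button follows a simple pattern: it is enabled on cloud, and on-prem only for users who can use private locations and have at least one configured. The sketch below encodes those expectations; it is a reading of the test matrix, not the actual Kibana logic, and all names are hypothetical.

```typescript
// Hypothetical inputs describing the tester's context.
interface MonitorManagementContext {
  isCloud: boolean; // Kibana running under a cloud context
  canUsePrivateLocations: boolean; // e.g. the Fleet user, not the Uptime-only user
  privateLocationCount: number; // private locations currently configured
}

// Expected Add Monitor state, derived from the scenarios in this comment.
function isAddMonitorEnabled(ctx: MonitorManagementContext): boolean {
  if (ctx.isCloud) {
    // Public locations are available on cloud, so a monitor can always be added.
    return true;
  }
  // On-prem there are only private locations, so one must exist and be usable.
  return ctx.canUsePrivateLocations && ctx.privateLocationCount > 0;
}
```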

Fleet user - Project Monitors

Setup

  1. In Kibana, sign into the Fleet user and generate an api key from the API Keys button. Note: (In order to use private locations, you must generate the api key from a user with Fleet permissions, so ensure you create a new api key instead of using an old one)
  2. In Kibana, ensure you have private locations configured by clicking the Private Locations button and adding a location
  3. Check out the main branch of the synthetics repo.
  4. Run npm run build
  5. cd ./examples/todos.

Example command (You can assign private locations by the location's name)

node ../../dist/cli.js push  --url [YOUR_KIBANA_URL] --project test-project --auth [YOUR_API_KEY] --schedule 3 --privateLocations "[YOUR LOCATION NAME]"

Example dsl (You can assign private locations by the location's name)

import { journey, step, monitor } from '@elastic/synthetics';

journey('check if input placeholder is correct', ({ page, params }) => {
  // Per-journey monitor settings; private locations are referenced by name.
  monitor.use({
    schedule: 5,
    privateLocations: ["YOUR LOCATION NAME"]
  });
  step('launch app', async () => {
    // params.url must be provided through the project's params.
    await page.goto(params.url);
  });
});

In Kibana

When | I click on the API key button in Kibana | I can create an API key with Uptime and Fleet permissions

From the command line

When | I define a default private location via --privateLocations and do not define privateLocations via monitor.use | The monitor is created with that location
When | I define a private location via monitor.use({ privateLocations: [...] }) | The monitor is created with that location
When | I update a monitor that has an assigned private location | The monitor is updated; the associated integration policy is updated
When | I delete a location by removing it from monitor.use and re-push | The monitor is updated to remove that location; the associated integration policy is deleted
When | I delete a monitor and re-push | The monitor is deleted; the associated integration policy is deleted
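
The first two command-line scenarios describe how a monitor's private locations are chosen: the --privateLocations flag supplies a default, and monitor.use can set locations per journey. A minimal sketch of that resolution follows. The names are hypothetical, this is not the synthetics CLI's code, and the choice that monitor.use wins when both are set is an assumption, since the scenarios only cover the cases where one of the two is defined.

```typescript
// Hypothetical shapes for the two places a private location can be declared.
interface PushOptions {
  privateLocations?: string[]; // from the --privateLocations CLI flag
}

interface MonitorConfig {
  schedule?: number;
  privateLocations?: string[]; // from monitor.use({ privateLocations: [...] })
}

// Assumption: a per-monitor value takes precedence over the CLI default.
function resolvePrivateLocations(cli: PushOptions, monitorConfig: MonitorConfig): string[] {
  if (monitorConfig.privateLocations !== undefined) {
    return monitorConfig.privateLocations;
  }
  if (cli.privateLocations !== undefined) {
    return cli.privateLocations;
  }
  return [];
}
```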

Uptime user - Project Monitors

Setup

  1. In Kibana, sign into the Uptime user and generate an api key from the API Keys button. Note: (This api key will have limited permissions, causing errors)
  2. In Kibana, ensure you have private locations configured by clicking the Private Locations button and adding a location
  3. Check out the main branch of the synthetics repo.
  4. Run npm run build
  5. cd ./examples/todos.

Example command (You can assign private locations by the location's name)

node ../../dist/cli.js push  --url [YOUR_KIBANA_URL] --project test-project --auth [YOUR_API_KEY] --schedule 3 --privateLocations "[YOUR LOCATION NAME]"

Example dsl (You can assign private locations by the location's name)

import { journey, step, monitor } from '@elastic/synthetics';

journey('check if input placeholder is correct', ({ page, params }) => {
  // Per-journey monitor settings; private locations are referenced by name.
  monitor.use({
    schedule: 5,
    privateLocations: ["YOUR LOCATION NAME"]
  });
  step('launch app', async () => {
    // params.url must be provided through the project's params.
    await page.goto(params.url);
  });
});

In Kibana

When | I click on the API key button in Kibana | I see a disclaimer stating that I cannot use private locations; I can still create an api key

From the command line

When | I define a default private location via --privateLocations and do not define privateLocations via monitor.use and push | I receive an error stating that I do not have permissions
When | I define a private location via monitor.use({ privateLocations: [...] }) and push | I receive an error stating that I do not have permissions
When | I attempt to update a monitor that has an assigned private location | I receive an error stating that I do not have permissions
When | I delete a location by removing it from monitor.use and re-push | I receive an error stating that I do not have permissions
When | I delete a monitor and re-push | I receive an error stating that I do not have permissions

lucasfcosta commented 2 years ago

It took me quite a bit of time to set things up due to https://github.com/elastic/synthetics-service/issues/685. So I had to fix that locally before proceeding. Also, setting up a remote OBLT cluster was not possible because of the SNAPSHOT version there.

I ended up testing this locally against the elastic stack with the change which allows the service to disable SSL on dev mode.

Will post feedback for the other flows in a bit.

(click the links to see images)

Fleet user

  1. Created the role and user (Role, Role, User)
  2. First visit displays the correct page with links to docs and ToS (TOS)
  3. Enabling Monitor Management as an admin shows the list with the add button enabled for the Fleet user (Enabled Button and List)
  4. I can run the monitors I create and save them. (Running monitors)
  5. I can open the flyout to create a private location and buttons are enabled. (Private Locations Flyout)
  6. Once I have a policy, I cannot add the private location. When clicking save, nothing happens. There are no error logs in Kibana backend. (Flyout Overview)
  7. After some debugging I discovered it's because of the beta value for the dev manifest location. It needs to be ga or experimental instead. To solve it I used my own local manifest.

    Please notice that the same problem happens if Kibana can't fetch locations.

    (Error related to location being beta)

    After that, I was still not able to save a location and never saw any errors.

    In the Kibana console, the only error message I see is:

    [2022-07-27T15:51:30.870+01:00][ERROR][savedobjects-service.repository.point-in-time-finder] Failed to open PIT for types [tag]

Feedback

Step 5: It's not immediately obvious what the user should do because there are two primary buttons. Should they click the one on the bottom or the one on the top? This is confusing IMO.

Step 6: The save button should be capitalized IMO.

Step 6/7: The flyout is not closed if I click out of it in the stage with the form for a private location. Instead, it goes to the previous state. IMO it should close completely when clicking outside of it. See this video to understand.

dominiqueclarke commented 2 years ago

@lucasfcosta Please see https://github.com/elastic/kibana/pull/137526 for the fix to the critical issue. It isn't related to the issue with the beta status label at all, though I will ping you separately about that.

Feedback Step 5: It's not immediately obvious what the user should do because there are two primary buttons. Should they click the one on the bottom or the one on the top? This is confusing IMO.

cc: @hbharding for final decision

Step 6: The save button should be capitalized IMO.

Done

Step 6/7: The flyout is not closed if I click out of it in the stage with the form for a private location. Instead, it goes to the previous state. IMO it should close completely when clicking outside of it. See this video to understand.

I haven't addressed this as of yet. @hbharding what do you think?

emilioalvap commented 2 years ago

@dominiqueclarke I'm in the process of running on-prem checks and just wanted to make sure: the Fleet user should not have permission to enable Monitor Management, right? There's an explanatory note for the Uptime user but not for the Fleet user, so I'm not sure what the expectation is here.

(Ignore this comment, it's the same as what @lucasfcosta posted above) A minor thing: if the user selects cancel or closes the flyout when creating a private location, it always redirects to the private locations flyout: rec

emilioalvap commented 2 years ago

@dominiqueclarke qq about lightweight checks. I've noticed ICMP and TCP checks are displaying Fleet managed as locations in the result page, is this intended?

HTTP vs. ICMP

emilioalvap commented 2 years ago

About these two:

When | I have a monitor configured for a private location | I cannot delete that private location until all monitors for that location are removed
When | I have no monitors for a given location | I can delete that private location

I found a race condition: if Kibana takes a long time to respond, it's actually possible to remove a private location that still has monitors configured. Those monitors' locations appear as undefined after that.

Here's the recording

dominiqueclarke commented 2 years ago

@dominiqueclarke qq about lightweight checks. I've noticed ICMP and TCP checks are displaying Fleet managed as locations in the result page, is this intended?

HTTP vs. ICMP

@emilioalvap That is not intended. Good catch! Please pull down the fix from this PR and retest. Testing instructions included https://github.com/elastic/integrations/pull/3925

lucasfcosta commented 2 years ago

Thanks for the updates @dominiqueclarke! I've just tried adding a location and then a monitor and it's now working for me!

Fleet user on Cloud context

Images:

Questions

Uptime user on Cloud context

Images:


On prem test is in progress, just thought I'd post this first to accelerate feedback.

dominiqueclarke commented 2 years ago

Should the "run once" be disabled if I only selected the private location?

Yes, we do not currently support run once for private locations

This field is disabled when editing the policy, but it doesn't look disabled. Is this expected?

Unfortunately, this is where we are at for MVP. We don't have as much control over those two fields, as they are controlled by the Fleet UI codebase. In the future, we can make improvements, but for MVP we ran out of time to improve this through contributions on the Fleet side.

The button to edit the integration in uptime actually brings users to the list of monitors not to that particular monitor's edit page, is that expected? If possible it would be cool to point to the exact monitor edit page.

I agree. This has been discussed, and again came down to running out of time. Might be able to get it in as a bug fix, but this change wasn't prioritized.

You mentioned deleting a policy would make a location invalid, but I couldn't find any way of deleting the policy. How should I do that? I also tried with the superuser and couldn't find any way of doing so. It does seem like the docs for that are out of date: https://www.elastic.co/guide/en/fleet/master/edit-or-delete-integration-policy.html. See image.

On the agent policy page on the Settings tab, there should be a Delete button at the bottom. You may need to unenroll the agent attached to the policy first.


I was actually able to generate an API key, but I do see the warning. Can we disable the generation if the user doesn't have permissions to do what they must instead of having the warning? If not, can we highlight the warning somehow?

@lucasfcosta, we don't want to disable generation, particularly on cloud, because we don't know whether the user intends to use private locations. Highlighting the warning better is a good idea. @hbharding thoughts? Disabling it entirely when the user does not have Fleet permissions on-prem would make sense.

When the user doesn't have the proper permissions for UI monitors with private locations, they aren't able to add them at all from the UI. However, with project monitors, if the user tries to create a project monitor with a private location via push with a limited permission API key, they will receive a helpful error message reported back in the CLI.

lucasfcosta commented 2 years ago

On Prem Feedback

Fleet user

  1. Add monitor button is disabled as expected with the prompt for new location
  2. I can save a new location just fine
  3. I can create a monitor just fine
  4. I can see a monitor's private location when creating it
  5. Monitor results appear
  6. I can only delete a private location after deleting its monitor.
  7. Not being able to delete an existing policy
  8. Can't edit the policy either
  9. Deleting the policy makes the location non-deletable and shows the correct message

Uptime on Prem

  1. Disclaimer on API key appears
  2. Disclaimer and disabled actions on the private locations flyout

Questions

lucasfcosta commented 2 years ago

Project monitors

On cloud with fleet

  1. I was able to push a project monitor successfully to my private location. Here's the pushed monitor.
  2. I can see my monitor's results
  3. I can re-push a monitor using the --privateLocations flag and it works.
  4. The monitor that was not re-pushed is not there anymore (the old integration policy also disappeared)

On cloud with uptime user

  1. I get the appropriate error when using the uptime user's keys no matter what I do

dominiqueclarke commented 2 years ago

I do see the add monitor button enabled and can use it (although edit and delete are correctly disabled). I thought it should've been disabled?

@lucasfcosta I'm assuming this is for an Uptime only user. Did you remove this key in your Kibana.yml file? xpack.cloud.id: something

lucasfcosta commented 2 years ago

@dominiqueclarke that was it, thank you! Removing it did disable the button. Everything else seems fine still. Thank you very much!

lucasfcosta commented 2 years ago

Closing this one and actually moving straight to "Done Done" as recommended by @shahzad31 on Slack.