elastic / integrations

Elastic Integrations
https://www.elastic.co/integrations

[Azure] [Docs] extend the storage account requirements in the Azure Logs documentation #7586

Closed zmoog closed 11 months ago

zmoog commented 1 year ago

Some users reported they need to learn how to configure the storage account required to run the Azure module or the Azure Logs integration.

For example, users are looking for guidance on storage account settings like:

We must add this information to the current documentation.

Here's a list of the current version for Filebeat and Agent:

zmoog commented 1 year ago

The azure-eventhub input uses a storage account as a checkpoint store for the Event Processor Host (EPH). See https://learn.microsoft.com/en-us/azure/event-hubs/event-hubs-event-processor-host for more.

Recommendations:

  • Don't enable the soft delete feature on the storage account that's used as a checkpoint store.
  • Don't use hierarchical storage (Azure Data Lake Storage Gen 2) as a checkpoint store.
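
To illustrate what a checkpoint store does in general, here is a minimal in-memory sketch. It is not the EPH or azure-eventhub implementation; the class and function names are hypothetical, and a plain dict stands in for the blob container. The point it shows is that a checkpoint is a small record that gets overwritten after each processed event, so a restarted consumer can resume where it left off.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryCheckpointStore:
    """Hypothetical stand-in for the storage account used as a checkpoint store."""

    offsets: dict[str, int] = field(default_factory=dict)

    def save(self, partition_id: str, offset: int) -> None:
        # Each save overwrites the previous checkpoint for the partition.
        self.offsets[partition_id] = offset

    def load(self, partition_id: str) -> int:
        # -1 means "no checkpoint yet": start from the first event.
        return self.offsets.get(partition_id, -1)


def process(partition_id: str, events: list[str], store: InMemoryCheckpointStore) -> list[str]:
    """Handle only the events after the last checkpoint, checkpointing as we go."""
    start = store.load(partition_id) + 1
    handled = []
    for offset in range(start, len(events)):
        handled.append(events[offset])
        store.save(partition_id, offset)
    return handled
```

Calling `process` twice on the same partition handles only the events that arrived after the first call, because the second call starts from the saved offset.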

zmoog commented 1 year ago

Also, from https://learn.microsoft.com/en-us/azure/event-hubs/event-processor-balance-partition-load#checkpoint (this doc covers the newer Azure SDK, but the recommendations may also apply to the legacy checkpointing implementation):

On the Storage account page in the Azure portal, in the Blob service section, ensure that the following settings are disabled.

  • Hierarchical namespace
  • Blob soft delete
  • Versioning
zmoog commented 1 year ago

As a starting point, here are the recommended default settings:

Use the storage account as checkpoint store only.

TomonoriSoejima commented 1 year ago

In addition, please also consider adding the required ports, based on https://learn.microsoft.com/en-us/azure/event-hubs/event-hubs-faq#what-ports-do-i-need-to-open-on-the-firewall
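
For anyone who wants to check firewall rules before opening a support case, a rough sketch of a TCP reachability probe follows. The port numbers are an assumption (5671 for AMQP over TLS and 443 for HTTPS); please confirm them against the FAQ linked above. The function names are hypothetical, and the namespace host in the usage note is a placeholder.

```python
import socket

# Assumed ports; verify against the Event Hubs FAQ before relying on them.
ASSUMED_EVENT_HUBS_PORTS = {"amqp": 5671, "https": 443}


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def check_ports(host: str, ports, probe=port_is_reachable) -> dict:
    """Map each port to whether the probe could reach it."""
    return {port: probe(host, port) for port in ports}
```

Usage would look like `check_ports("<namespace>.servicebus.windows.net", ASSUMED_EVENT_HUBS_PORTS.values())`, with the placeholder replaced by the real namespace host. A successful TCP connection only shows that the port is open, not that authentication will succeed.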

alaudazzi commented 1 year ago

@zmoog and I met on Nov 22 and 23. Meeting notes:

[Image: meeting notes]

alaudazzi commented 1 year ago

@zmoog Which Azure integrations are impacted by these doc changes?

zmoog commented 1 year ago

Here is the list of integrations that the page https://docs.elastic.co/integrations/azure helps to set up:

$ tree -L 1  packages/azure/data_stream/
packages/azure/data_stream/
├── activitylogs
├── application_gateway
├── auditlogs
├── eventhub
├── firewall_logs
├── identity_protection
├── platformlogs
├── provisioning
├── signinlogs
└── springcloudlogs

10 directories, 0 files
alaudazzi commented 1 year ago

@zmoog thank you again for your great intro to Azure integrations in general, and for providing me with access to the Azure portal to test and update the doc procedure.

Here is the doc fix for the section on storage account settings. This setup procedure is documented in the main Azure Logs page, but there are 10 Azure doc pages that reference it. As soon as you give me your 👍, I'll push a doc PR.

@TomonoriSoejima let's clarify how to integrate your point about the port.

While working on this, I realized that the entire Azure Logs page might be refreshed. I'll create a follow-up issue for that.

TomonoriSoejima commented 1 year ago

@alaudazzi

While I think a reference to https://learn.microsoft.com/en-us/azure/event-hubs/event-hubs-faq#what-ports-do-i-need-to-open-on-the-firewall would be sufficient for checking ports, there is a risk of a dead link if Microsoft decides to change the page structure.

So I think it does not hurt to drop the port information. If our support team receives a case, we can easily work out what to do when an issue is caused by a closed port and so forth.

alaudazzi commented 1 year ago

Thank you @TomonoriSoejima! Would you mind updating the gdoc with the info required to check ports?

TomonoriSoejima commented 1 year ago

To clarify, the port requirement I mentioned concerns how to integrate with Event Hubs. It is not related to the storage account requirements, so it does not need to be added to this doc: https://docs.google.com/document/d/1dVftW_6UjU68m3XeeSYjj5hOXiL6t3-MEES2pL-YM9Y/edit

zmoog commented 12 months ago

@alaudazzi, I read through the doc: the flow is smooth, and the steps are crystal clear. I only left a nit comment.

Please, go ahead with the PR!

alaudazzi commented 12 months ago

Thank you for your feedback @zmoog! After discussion with @SubhrataK, it might be better to keep these instructions as generic as possible, and not too specific to the UI of a 3rd-party vendor that, as we already discussed, is out of our control and might quickly go out of sync with our docs.

alaudazzi commented 12 months ago

More generic version of this procedure:

To create the storage account:

  1. Sign in to the Azure Portal and create your storage account.

  2. While configuring your project details, make sure you select the following recommended default settings:

    • Hierarchical namespace: disabled
    • Minimum TLS version: Version 1.2
    • Access tier: Hot
    • Enable soft delete for blobs: disabled
    • Enable soft delete for containers: disabled
  3. When the new storage account is ready, you need to take note of the storage account name and the storage account access keys, as you will use them later to authenticate your Elastic application’s requests to this storage account.
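
As a rough way to double-check step 2, the sketch below compares a dictionary of storage account settings against the recommended defaults listed above. The key names are hypothetical and do not correspond to any Azure API or SDK field; you would fill the dictionary in by hand from what the portal shows.

```python
# Recommended defaults from step 2; key names are made up for this sketch.
RECOMMENDED_SETTINGS = {
    "hierarchical_namespace": False,
    "minimum_tls_version": "1.2",
    "access_tier": "Hot",
    "blob_soft_delete": False,
    "container_soft_delete": False,
}


def find_deviations(settings: dict) -> dict:
    """Return {setting: (actual, recommended)} for every value that differs.

    A setting missing from `settings` is reported with an actual value of None.
    """
    return {
        name: (settings.get(name), recommended)
        for name, recommended in RECOMMENDED_SETTINGS.items()
        if settings.get(name) != recommended
    }
```

An empty result means the account matches the recommended defaults; anything else names the settings to revisit.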

CC @SubhrataK

zmoog commented 12 months ago

Makes sense.

This version retains the core information in step 2 and still has enough guidance for the average Azure user. Kudos @alaudazzi!

alaudazzi commented 12 months ago

@TomonoriSoejima I'm not sure I understand how to handle the update you suggest. Shall we address it in the PR https://github.com/elastic/integrations/pull/8666?

TomonoriSoejima commented 12 months ago

Simply put, please disregard my initial comment. I made a misleading comment here in the first place and the improvement I hoped to create was for event hub integration and should have been added to this issue in the first place.