Closed zmoog closed 11 months ago
The azure-eventhub input uses a storage account as a checkpoint store for the Event Processor Host (EPH). See https://learn.microsoft.com/en-us/azure/event-hubs/event-hubs-event-processor-host for more.
Recommendations:
Don't enable the soft delete feature on the storage account that's used as a checkpoint store. Don't use a hierarchical storage (Azure Data Lake Storage Gen 2) as a checkpoint store.
Also, from https://learn.microsoft.com/en-us/azure/event-hubs/event-processor-balance-partition-load#checkpoint (this doc is for the newer Azure SDK, but the recommendations may also apply to the legacy checkpointing implementation):
On the Storage account page in the Azure portal, in the Blob service section, ensure that the following settings are disabled.
- Hierarchical namespace
- Blob soft delete
- Versioning
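The three settings above lend themselves to a quick automated check. What follows is only a minimal sketch: the setting names (`hierarchical_namespace`, `blob_soft_delete`, `versioning`) are hypothetical keys chosen for illustration and do not correspond to any Azure SDK or CLI field, so you would need to fill the dictionary from whatever source you use to inspect the storage account.

```python
# Hypothetical setting names, chosen for illustration only.
# They mirror the three items listed above, not real Azure API fields.
DISCOURAGED_SETTINGS = ("hierarchical_namespace", "blob_soft_delete", "versioning")


def find_enabled_discouraged_settings(settings):
    """Return the discouraged settings that are enabled in `settings`.

    `settings` maps a setting name to True (enabled) or False (disabled).
    Settings missing from the mapping are treated as disabled.
    """
    return [name for name in DISCOURAGED_SETTINGS if settings.get(name, False)]


if __name__ == "__main__":
    # Example values, made up for this sketch.
    example = {
        "hierarchical_namespace": False,
        "blob_soft_delete": True,
        "versioning": False,
    }
    problems = find_enabled_discouraged_settings(example)
    if problems:
        print("Disable before using as a checkpoint store:", ", ".join(problems))
    else:
        print("No discouraged settings are enabled.")
```

An empty result means none of the three settings is enabled in the mapping you supplied; it says nothing about settings the mapping does not cover.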
As a starting point, here are the recommended default settings:
Use the storage account as checkpoint store only.
In addition, please consider documenting the required ports, based on https://learn.microsoft.com/en-us/azure/event-hubs/event-hubs-faq#what-ports-do-i-need-to-open-on-the-firewall
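For anyone troubleshooting a suspected closed port, a plain TCP connection test is often enough as a first check. The sketch below uses only the Python standard library. The port numbers in `EVENT_HUBS_PORTS` are my reading of the linked FAQ and are an assumption to verify against that page; the namespace host name is a placeholder.

```python
import socket

# Assumption: ports as I understand the linked Event Hubs FAQ.
# Verify against the FAQ before relying on this list.
EVENT_HUBS_PORTS = {"AMQP": [5671, 5672], "HTTPS": [443], "Kafka": [9093]}


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder host: replace with your own Event Hubs namespace.
    host = "your-namespace.servicebus.windows.net"
    for protocol, ports in EVENT_HUBS_PORTS.items():
        for port in ports:
            status = "open" if can_connect(host, port) else "unreachable"
            print(f"{protocol} port {port}: {status}")
```

A successful connection only shows that the firewall lets TCP traffic through on that port; it does not validate credentials or the Event Hubs configuration.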
@zmoog and I met on Nov 22 and 23. Meeting notes:
@zmoog Which Azure integrations are impacted by these doc changes?
Here is the list of integrations that the page https://docs.elastic.co/integrations/azure helps to set up:
$ tree -L 1 packages/azure/data_stream/
packages/azure/data_stream/
├── activitylogs
├── application_gateway
├── auditlogs
├── eventhub
├── firewall_logs
├── identity_protection
├── platformlogs
├── provisioning
├── signinlogs
└── springcloudlogs
10 directories, 0 files
@zmoog thank you again for your great intro to Azure integrations in general, and for providing me with access to the Azure portal to test and update the doc procedure.
Here is the doc fix for the section on storage account settings. This setup procedure is documented in the main Azure Logs page, but there are 10 Azure doc pages that reference it. As soon as you give me your 👍, I'll push a doc PR.
@TomonoriSoejima let's clarify how to integrate your point about the port.
While working on this, I realized that the entire Azure Logs page might be refreshed. I'll create a follow-up issue for that.
@alaudazzi
While I think a reference to https://learn.microsoft.com/en-us/azure/event-hubs/event-hubs-faq#what-ports-do-i-need-to-open-on-the-firewall would be sufficient for checking ports, there is a risk of a dead link if Microsoft decides to change the page structure.
So I think it does not hurt to include the port information directly. If our support team receives a case, we can then easily work out what to do when an issue is caused by a closed port and so forth.
Thank you @TomonoriSoejima! Would you mind updating the gdoc with the info required to check ports?
To clarify, the port requirement I raised concerns how to integrate with Event Hubs. It is not related to the storage account requirements, so it does not belong in this doc: https://docs.google.com/document/d/1dVftW_6UjU68m3XeeSYjj5hOXiL6t3-MEES2pL-YM9Y/edit
@alaudazzi, I read through the doc: the flow is smooth, and the steps are crystal clear. I only left a nit comment.
Please, go ahead with the PR!
Thank you for your feedback @zmoog! After discussion with @SubhrataK, it might be better to keep these instructions as generic as possible, and not too specific to the UI of a third-party vendor that, as we already discussed, is out of our control and might quickly go out of sync with our docs.
More generic version of this procedure:
To create the storage account:
Sign in to the Azure Portal and create your storage account.
While configuring your project details, make sure you select the following recommended default settings:
When the new storage account is ready, take note of the storage account name and the storage account access keys. You will use them later to authenticate your Elastic application's requests to this storage account.
CC @SubhrataK
Makes sense.
This version retains the core information in step 2 and still has enough guidance for the average Azure user. Kudos @alaudazzi!
@TomonoriSoejima I'm not sure I understand how to handle the update you suggest. Shall we address it in the PR https://github.com/elastic/integrations/pull/8666?
Simply put, please disregard my initial comment. It was misleading: the improvement I hoped to make was for the Event Hub integration and should have been added to this issue in the first place.
Some users reported they need to learn how to configure the storage account required to run the Azure module or the Azure Logs integration.
For example, users are looking for guidance on storage account settings like:
We must add this information to the current documentation.
Here's a list of the current version for Filebeat and Agent: