elementary-data / elementary

The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
https://www.elementary-data.com/
Apache License 2.0

edr send-report to Azure results in a timeout #1728

Closed. janpetterholmberg closed this issue 4 weeks ago.

janpetterholmberg commented 4 weeks ago

Describe the bug

When I use edr send-report to upload to Azure, the upload times out. This happens both when I run it locally on my PC and when I use the Docker image ghcr.io/elementary-data/elementary:latest.

This is the command line I use:

edr send-report --select last_invocation --days-back 0 --update-bucket-website true --azure-container-name "<My container name>" --azure-connection-string "DefaultEndpointsProtocol=https;AccountName=<My account name>;AccountKey=<My secret key>;EndpointSuffix=core.windows.net"

I looked at the code added in pull request https://github.com/elementary-data/elementary/pull/1037, recreated it, and executed it in the same Python environment in which I run edr. This code works and uploads the file as expected:

from azure.storage.blob import BlobServiceClient

CONNECT_STR = "DefaultEndpointsProtocol=https;AccountName=<My account name>;AccountKey=<My secret key>;EndpointSuffix=core.windows.net"
CONTAINER_NAME = "<My container name>"
blob_name = "testfile.yml"
# Any local file works as a test payload; here it is a dbt profiles.yml.
local_html_file_path = "/home/janpette/profiles.yml"

# Same call sequence as the edr Azure client: build a blob client, then upload.
blob_service_client = BlobServiceClient.from_connection_string(CONNECT_STR)
client = blob_service_client.get_blob_client(container=CONTAINER_NAME, blob=blob_name)
with open(local_html_file_path, "rb") as data:
    # content_type is kept as text/html to mirror the report upload call.
    client.upload_blob(data, content_type="text/html", overwrite=True)

Running edr report also works as expected.

Here is the output from the edr send-report command:

    ________                          __
   / ____/ /__  ____ ___  ___  ____  / /_____ ________  __
  / __/ / / _ \/ __ `__ \/ _ \/ __ \/ __/ __ `/ ___/ / / /
 / /___/ /  __/ / / / / /  __/ / / / /_/ /_/ / /  / /_/ /
/_____/_/\___/_/ /_/ /_/\___/_/ /_/\__/\__,_/_/   \__, /
                                                 /____/

Any feedback and suggestions are welcomed! join our community here - https://bit.ly/slack-elementary

2024-10-23 14:42:21 — INFO — Running with edr=0.16.1
2024-10-23 14:42:55 — INFO — edr (0.16.1) and Elementary's dbt package (0.16.1) are compatible.
2024-10-23 14:42:55 — INFO — Elementary's database and schema: '"XXXX.XXXXX"'
2024-10-23 14:42:55 — INFO — Running dbt command --log-format json run-operation elementary.log_macro_results --args {"macro_name": "elementary_cli.get_test_results", "macro_args": {"days_back": 0, "invocations_per_test": 720, "disable_passed_test_metrics": false}} --project-dir /home/janpette/.elementary_env/lib/python3.10/site-packages/elementary/monitor/dbt_project
2024-10-23 14:43:31 — INFO — Running dbt command --log-format json run-operation elementary.log_macro_results --args {"macro_name": "elementary_cli.get_source_freshness_results", "macro_args": {"days_back": 0, "invocations_per_test": 720}} --project-dir /home/janpette/.elementary_env/lib/python3.10/site-packages/elementary/monitor/dbt_project
2024-10-23 14:44:03 — INFO — Running dbt command --log-format json run-operation elementary.log_macro_results --args {"macro_name": "elementary_cli.get_models", "macro_args": {"exclude_elementary": true}} --project-dir /home/janpette/.elementary_env/lib/python3.10/site-packages/elementary/monitor/dbt_project
2024-10-23 14:44:35 — INFO — Running dbt command --log-format json run-operation elementary.log_macro_results --args {"macro_name": "elementary_cli.get_sources", "macro_args": {}} --project-dir /home/janpette/.elementary_env/lib/python3.10/site-packages/elementary/monitor/dbt_project
2024-10-23 14:45:07 — INFO — Running dbt command --log-format json run-operation elementary.log_macro_results --args {"macro_name": "elementary_cli.get_exposures", "macro_args": {}} --project-dir /home/janpette/.elementary_env/lib/python3.10/site-packages/elementary/monitor/dbt_project
2024-10-23 14:45:38 — INFO — Running dbt command --log-format json run-operation elementary.log_macro_results --args {"macro_name": "elementary_cli.get_singular_tests", "macro_args": {}} --project-dir /home/janpette/.elementary_env/lib/python3.10/site-packages/elementary/monitor/dbt_project
2024-10-23 14:46:10 — INFO — Running dbt command --log-format json run-operation elementary.log_macro_results --args {"macro_name": "elementary_cli.get_models_runs", "macro_args": {"days_back": 0, "exclude_elementary": true}} --project-dir /home/janpette/.elementary_env/lib/python3.10/site-packages/elementary/monitor/dbt_project
2024-10-23 14:46:41 — INFO — Running dbt command --log-format json run-operation elementary.log_macro_results --args {"macro_name": "elementary_cli.get_dbt_models_test_coverage", "macro_args": {}} --project-dir /home/janpette/.elementary_env/lib/python3.10/site-packages/elementary/monitor/dbt_project
2024-10-23 14:47:12 — INFO — Running dbt command --log-format json run-operation elementary.log_macro_results --args {"macro_name": "elementary_cli.get_test_last_invocation", "macro_args": {}} --project-dir /home/janpette/.elementary_env/lib/python3.10/site-packages/elementary/monitor/dbt_project
2024-10-23 14:47:44 — INFO — Running dbt command --log-format json run-operation elementary.log_macro_results --args {"macro_name": "elementary_cli.get_nodes_depends_on_nodes", "macro_args": {"exclude_elementary": true}} --project-dir /home/janpette/.elementary_env/lib/python3.10/site-packages/elementary/monitor/dbt_project
2024-10-23 14:48:16 — INFO — Running dbt command --log-format json run-operation elementary.log_macro_results --args {"macro_name": "elementary_cli.get_models_latest_invocation", "macro_args": {}} --project-dir /home/janpette/.elementary_env/lib/python3.10/site-packages/elementary/monitor/dbt_project
2024-10-23 14:48:47 — INFO — Running dbt command --log-format json run-operation elementary.log_macro_results --args {"macro_name": "elementary_cli.get_models_latest_invocations_data", "macro_args": {}} --project-dir /home/janpette/.elementary_env/lib/python3.10/site-packages/elementary/monitor/dbt_project
2024-10-23 14:49:19 — INFO — Uploading to Azure container "dbt-elementary"
Traceback (most recent call last):
  File "/home/janpette/.elementary_env/bin/edr", line 8, in <module>
    sys.exit(cli())
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/elementary/cli/cli.py", line 67, in invoke
    return super().invoke(ctx)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/elementary/monitor/cli.py", line 713, in send_report
    sent_report_successfully = data_monitoring.send_report(
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/elementary/monitor/data_monitoring/report/data_monitoring_report.py", line 205, in send_report
    upload_succeeded, bucket_website_url = self.upload_report(
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/elementary/monitor/data_monitoring/report/data_monitoring_report.py", line 264, in upload_report
    send_succeeded, bucket_website_url = self.azure_client.send_report(
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/elementary/clients/azure/client.py", line 43, in send_report
    blob_handle.upload_blob(data, content_type="text/html", overwrite=True)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/core/tracing/decorator.py", line 94, in wrapper_use_tracer
    return func(*args, **kwargs)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/storage/blob/_blob_client.py", line 596, in upload_blob
    return upload_block_blob(**options)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/storage/blob/_upload_helpers.py", line 105, in upload_block_blob
    response = client.upload(
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/core/tracing/decorator.py", line 94, in wrapper_use_tracer
    return func(*args, **kwargs)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/storage/blob/_generated/operations/_block_blob_operations.py", line 848, in upload
    pipeline_response: PipelineResponse = self._client._pipeline.run(  # pylint: disable=protected-access
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/core/pipeline/_base.py", line 229, in run
    return first_node.send(pipeline_request)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/core/pipeline/_base.py", line 86, in send
    response = self.next.send(request)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/core/pipeline/_base.py", line 86, in send
    response = self.next.send(request)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/core/pipeline/_base.py", line 86, in send
    response = self.next.send(request)
  [Previous line repeated 2 more times]
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/core/pipeline/policies/_redirect.py", line 197, in send
    response = self.next.send(request)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/core/pipeline/_base.py", line 86, in send
    response = self.next.send(request)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/storage/blob/_shared/policies.py", line 556, in send
    raise err
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/storage/blob/_shared/policies.py", line 528, in send
    response = self.next.send(request)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/core/pipeline/_base.py", line 86, in send
    response = self.next.send(request)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/core/pipeline/_base.py", line 86, in send
    response = self.next.send(request)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/core/pipeline/_base.py", line 86, in send
    response = self.next.send(request)
  [Previous line repeated 1 more time]
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/storage/blob/_shared/policies.py", line 301, in send
    response = self.next.send(request)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/core/pipeline/_base.py", line 86, in send
    response = self.next.send(request)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/core/pipeline/_base.py", line 86, in send
    response = self.next.send(request)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/core/pipeline/_base.py", line 118, in send
    self._sender.send(request.http_request, **request.context.options),
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/storage/blob/_shared/base_client.py", line 350, in send
    return self._transport.send(request, **kwargs)
  File "/home/janpette/.elementary_env/lib/python3.10/site-packages/azure/core/pipeline/transport/_requests_basic.py", line 401, in send
    raise error
azure.core.exceptions.ServiceResponseError: ('Connection aborted.', TimeoutError('The write operation timed out'))
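
A possible variation on the test script above, in case the timeout is tied to the size of the upload: the traceback ends in a single `upload` request (`_block_blob_operations.py`), while the test script sent only a small YAML file. The sketch below is not an edr option and not how edr configures its client. `client_options` and `upload_report` are hypothetical helper names, and the chunk size and timeout values are arbitrary assumptions. It relies on the `max_single_put_size`, `max_block_size`, `connection_timeout`, and `read_timeout` keyword arguments that azure-storage-blob documents for its clients.

```python
from pathlib import Path

MIB = 1024 * 1024


def client_options(chunk_mib: int = 4,
                   connect_timeout_s: int = 60,
                   read_timeout_s: int = 300) -> dict:
    """Keyword arguments for BlobServiceClient that favour small, patient requests.

    All default values here are assumptions chosen for illustration.
    """
    return {
        # Files larger than this are split into blocks instead of one large PUT.
        "max_single_put_size": chunk_mib * MIB,
        # Size of each block when the upload is chunked.
        "max_block_size": chunk_mib * MIB,
        # Seconds to wait when establishing the connection.
        "connection_timeout": connect_timeout_s,
        # Seconds to wait on the socket between bytes from the service.
        "read_timeout": read_timeout_s,
    }


def upload_report(connection_string: str, container: str,
                  blob_name: str, path: Path) -> None:
    """Upload one HTML file using the options above (hypothetical helper)."""
    # Imported lazily so client_options() can be inspected without the SDK installed.
    from azure.storage.blob import BlobServiceClient, ContentSettings

    service = BlobServiceClient.from_connection_string(
        connection_string, **client_options()
    )
    blob = service.get_blob_client(container=container, blob=blob_name)
    with path.open("rb") as data:
        blob.upload_blob(
            data,
            content_settings=ContentSettings(content_type="text/html"),
            overwrite=True,
        )
```

Running `upload_report` against the generated report file, rather than a small test file, would show whether a chunked upload with longer timeouts behaves differently from the single request seen in the traceback. If it still times out, the cause is more likely outside the SDK settings.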

To Reproduce

Steps to reproduce the behavior:

  1. Run the edr send-report command listed in the description

Expected behavior

The files are uploaded to Azure blob storage.


janpetterholmberg commented 4 weeks ago

This turned out to be a network issue, so I am closing the bug.
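
For anyone landing here with the same traceback, a first network check can be done with the standard library alone. This is a minimal sketch under stated assumptions: the helper names (`parse_connection_string`, `blob_host`, `can_reach`) are hypothetical, the host is assumed to follow the `<AccountName>.blob.<EndpointSuffix>` pattern, and connection strings that carry an explicit `BlobEndpoint` are not handled. The thread does not say what the network problem was, so this may not detect the same fault.

```python
import socket
import ssl


def parse_connection_string(conn_str: str) -> dict:
    """Split 'Key=Value;Key=Value' pairs; values such as AccountKey may contain '='."""
    parts = {}
    for item in conn_str.split(";"):
        if item.strip():
            # partition() splits on the first '=' only, so base64 padding survives.
            key, _, value = item.partition("=")
            parts[key.strip()] = value.strip()
    return parts


def blob_host(conn_str: str) -> str:
    """Derive the blob endpoint host name (assumed naming pattern)."""
    parts = parse_connection_string(conn_str)
    suffix = parts.get("EndpointSuffix", "core.windows.net")
    return f"{parts['AccountName']}.blob.{suffix}"


def can_reach(host: str, port: int = 443, timeout_s: float = 10.0) -> bool:
    """Return True if a TLS handshake with host:port completes within timeout_s."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout_s) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True
    except OSError:
        # Covers DNS failures, refused connections, timeouts, and TLS errors.
        return False
```

A call such as `can_reach(blob_host(CONNECT_STR))` only confirms that the endpoint answers a handshake. In this issue a small upload succeeded from the same environment, so a passing check does not rule out a problem that affects only larger or longer-running writes.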