republicofdata-io / damn

The DAMN (Data Assets Metric Navigation) tool extracts and reports metrics about your data assets
https://pypi.org/project/damn-tool/

Adding IO manager metrics for AWS #18

Closed olivierdupuis closed 1 year ago

olivierdupuis commented 1 year ago

This PR adds metrics from the IO manager. It is currently limited to the S3 storage service and reports only the size metric for a non-partitioned asset. The goal is to get a minimum working solution first and then expand.

Config is defined in the ~/.damn/connectors.yml file:

io-manager:
  aws:
    credentials:
      access_key_id: "{{ env('AWS_ACCESS_KEY_ID') }}"
      secret_access_key: "{{ env('AWS_SECRET_ACCESS_KEY') }}"
      region: "us-east-1"
    bucket_name: "discursus-io"
    key_prefix: "platform"

Credentials for AWS are set in environment variables.
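
As a rough illustration of how the `{{ env('...') }}` placeholders in the config above could be resolved from environment variables, here is a minimal sketch. The helper name `resolve_env` and the regex-based approach are assumptions made for illustration; the tool may well use a real template engine instead.

```python
import os
import re

# Matches placeholders such as {{ env('AWS_ACCESS_KEY_ID') }} (single or double quotes).
_ENV_CALL = re.compile(
    r"\{\{\s*env\(\s*['\"]([A-Za-z_][A-Za-z0-9_]*)['\"]\s*\)\s*\}\}"
)


def resolve_env(value, environ=None):
    """Hypothetical helper: replace env('NAME') placeholders with their values."""
    environ = os.environ if environ is None else environ

    def _substitute(match):
        name = match.group(1)
        if name not in environ:
            # Fail loudly rather than silently sending empty credentials.
            raise KeyError(f"environment variable {name} is not set")
        return environ[name]

    return _ENV_CALL.sub(_substitute, value)
```

Values without a placeholder, such as the region, pass through unchanged, so the same helper could be applied to every string in the credentials block.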

Once configured, running the metrics command for an asset should also return the stored file size metric.

Command: damn metrics gdelt/gdelt_gkg_articles

Results:

[Screenshot of the command output, 2023-07-10 at 14:55:33]
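
For readers who want a feel for how a size metric for a non-partitioned asset could be read from S3, here is a minimal sketch using boto3's `head_object` call. The function names and, in particular, the object key layout (`key_prefix` joined with the asset key) are assumptions for illustration; the actual layout depends on how the IO manager writes its files and is not confirmed by this PR.

```python
def build_object_key(key_prefix, asset_key):
    """Join prefix and asset key with '/' (assumed layout, not confirmed)."""
    parts = [p for p in (key_prefix.strip("/"), asset_key.strip("/")) if p]
    return "/".join(parts)


def stored_size_bytes(s3_client, bucket_name, key_prefix, asset_key):
    """Return the size in bytes of the single object backing an asset."""
    response = s3_client.head_object(
        Bucket=bucket_name,
        Key=build_object_key(key_prefix, asset_key),
    )
    # head_object reports the object size in bytes as ContentLength.
    return response["ContentLength"]


def make_s3_client(access_key_id, secret_access_key, region):
    """Build a boto3 S3 client from the credentials in connectors.yml."""
    import boto3  # imported lazily so the helpers above work without boto3

    return boto3.client(
        "s3",
        aws_access_key_id=access_key_id,
        aws_secret_access_key=secret_access_key,
        region_name=region,
    )
```

With the config shown earlier, a call might look like `stored_size_bytes(client, "discursus-io", "platform", "gdelt/gdelt_gkg_articles")`. A partitioned asset would likely need a listing call over a key prefix instead, which this sketch does not cover.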
olivierdupuis commented 1 year ago

10 - Extract metrics from S3