awslabs / aws-solutions-constructs

The AWS Solutions Constructs Library is an open-source extension of the AWS Cloud Development Kit (AWS CDK) that provides multi-service, well-architected patterns for quickly defining solutions
https://docs.aws.amazon.com/solutions/latest/constructs/
Apache License 2.0

proposal: service logs querying construct #399

Open naseemkullah opened 2 years ago

naseemkullah commented 2 years ago

https://docs.aws.amazon.com/athena/latest/ug/querying-AWS-service-logs.html

Setting the above up involves a lot of config that would be nice to abstract into a construct. I was wondering if the maintainers agree.

Use Case

To set up https://docs.aws.amazon.com/athena/latest/ug/application-load-balancer-logs.html and https://docs.aws.amazon.com/athena/latest/ug/cloudfront-logs.html via CDK.

Proposed Solution

Create a service-logs-querying construct.

Other


This is a :rocket: Feature Request

naseemkullah commented 2 years ago

@biffgaut thoughts?

biffgaut commented 2 years ago

While that sounds like an interesting construct, it doesn't sound like one that we would publish. We're completely focused on small building blocks that can be combined into larger architectures. None of our constructs on its own is a solution, but they can be combined to create countless solutions. Check out our DESIGN_GUIDELINES.md file. If, while you're implementing this, you identify small building blocks within it, there might be opportunities there.

This sounds more like a Solution, and we have another team that focuses on solutions. Your idea might also be an interesting PR for the Centralized Logging solution.

naseemkullah commented 2 years ago

I guess my question is: if you deploy an aws-cloudfront-s3 Solutions Construct with logging enabled, how do you query your distribution's logs? The idea would be a CloudFrontLogsTable construct that can be used as a building block.

If you are still not convinced, feel free to close. Thanks.

biffgaut commented 2 years ago

So if we translate your vision to the Solutions Constructs paradigm, are you thinking something like aws-athena-s3?

This could set up Athena in front of an S3 bucket used for logging (we would expose the logging buckets as properties of the constructs that create them). Defining the schema and queries would be the responsibility of the client: they would vary based on which service was creating the log, and we would consider them business logic. While it would be tempting to try to provide them, the last time we yielded to that temptation was aws-apigateway-dynamodb, and that ended up causing problems that forced us to go back and add functionality that bypassed our attempts at business logic.

naseemkullah commented 2 years ago

I would imagine setting up the schemas from the docs (https://docs.aws.amazon.com/athena/latest/ug/cloudfront-logs.html) via CDK. I'd argue that it isn't business logic but rather an operational concern (ingress logs).

For my org I have created an Athena DB, an Athena workgroup, and a query results bucket [these 3 should be user provided], as well as the Glue table with the schema + TableInput settings as per the above doc (this is the part that I would like to turn into a construct and distribute, as it took a while to do for such a basic necessity [queryable ingress logs]).

biffgaut commented 2 years ago

If we created the construct, it would be for Athena to connect to any data on S3, not just CloudFront logs. Hardcoding a specific format in the construct would lead to separate constructs for reading ALB logs, VPC flow logs, etc., and would not give customers a way to read their own custom data.

Once we had the general case where users provided their own schema, we could entertain the idea of a library of predefined schemas they could choose from.

biffgaut commented 2 years ago

A user-provided schema, with the ability to get that schema from a pre-populated list of schemas in the construct, is an interesting idea. Let me toss it around with the team.

I'm thinking props like:

dataSchema?: schemaType
predefinedSchema?: enum    // eg CloudFront | FlowLogs | ApplicationLoadBalancer

Exactly one must be provided; providing both is an error.
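To make the "exactly one of the two props" rule concrete, here is a minimal TypeScript sketch. All names in it (`Column`, `PredefinedSchema`, `LogTableProps`, `PREDEFINED_COLUMNS`, `resolveSchema`) are hypothetical and are not part of Solutions Constructs, and the column lists are abbreviated placeholders; the full lists would come from the Athena docs linked earlier in this thread.

```typescript
// Hypothetical types for illustration; not an existing Solutions Constructs API.
type Column = { name: string; type: string };

enum PredefinedSchema {
  CloudFront = "CloudFront",
  FlowLogs = "FlowLogs",
  ApplicationLoadBalancer = "ApplicationLoadBalancer",
}

interface LogTableProps {
  // User-supplied schema for arbitrary data on S3.
  dataSchema?: Column[];
  // Shortcut that selects a schema shipped with the construct.
  predefinedSchema?: PredefinedSchema;
}

// Abbreviated placeholder columns (assumption); take the complete
// column lists from the Athena documentation for each log type.
const PREDEFINED_COLUMNS: Record<PredefinedSchema, Column[]> = {
  [PredefinedSchema.CloudFront]: [
    { name: "date", type: "date" },
    { name: "time", type: "string" },
  ],
  [PredefinedSchema.FlowLogs]: [
    { name: "version", type: "int" },
    { name: "account_id", type: "string" },
  ],
  [PredefinedSchema.ApplicationLoadBalancer]: [
    { name: "type", type: "string" },
    { name: "time", type: "string" },
  ],
};

// Returns the columns to use, enforcing that exactly one prop is set.
function resolveSchema(props: LogTableProps): Column[] {
  if (props.dataSchema !== undefined && props.predefinedSchema !== undefined) {
    throw new Error("Provide either dataSchema or predefinedSchema, not both");
  }
  if (props.dataSchema !== undefined) {
    return props.dataSchema;
  }
  if (props.predefinedSchema !== undefined) {
    return PREDEFINED_COLUMNS[props.predefinedSchema];
  }
  throw new Error("One of dataSchema or predefinedSchema must be provided");
}
```

Keeping the check in a plain function means the rule can be unit tested without synthesizing a stack, and a new predefined schema only needs a new enum member and a new map entry.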

naseemkullah commented 2 years ago

Ah, that sounds interesting. Bear in mind that, outside of the schema, you'll need to allow for property overriding as follows:

// escape hatch for missing config option in glue.Table
// ref.: https://github.com/aws/aws-cdk/issues/16660
propertyOverrides.forEach(({ propertyPath, value }) => {
  (table.node.defaultChild as glue.CfnTable).addPropertyOverride(
    propertyPath,
    value
  );
});
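The snippet above relies on `propertyOverrides`, `table`, and `glue` being in scope. As a self-contained sketch of the same idea, the helper below applies a list of overrides to anything exposing `addPropertyOverride(propertyPath, value)`, which is the method the snippet calls on `glue.CfnTable`. The `PropertyOverride` and `Overridable` types and the `applyPropertyOverrides` name are assumptions made for illustration, and the property path in the usage example is only a sample value.

```typescript
// Hypothetical shape of one override entry, mirroring the destructuring above.
interface PropertyOverride {
  propertyPath: string;
  value: unknown;
}

// Minimal structural type (assumption): anything with an
// addPropertyOverride method, such as the CfnTable in the snippet above.
interface Overridable {
  addPropertyOverride(propertyPath: string, value: unknown): void;
}

// Applies each override in order to the given target.
function applyPropertyOverrides(
  target: Overridable,
  overrides: PropertyOverride[]
): void {
  for (const { propertyPath, value } of overrides) {
    target.addPropertyOverride(propertyPath, value);
  }
}

// Usage with a stand-in target that records calls instead of touching CloudFormation.
const recorded: PropertyOverride[] = [];
const fakeTable: Overridable = {
  addPropertyOverride: (propertyPath, value) => {
    recorded.push({ propertyPath, value });
  },
};

applyPropertyOverrides(fakeTable, [
  // Sample path for illustration only.
  { propertyPath: "TableInput.StorageDescriptor.Compressed", value: false },
]);
```

In a real stack the target would be `table.node.defaultChild as glue.CfnTable`, as in the original snippet; typing the helper against a small interface keeps it testable without a CDK app.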