mitodl / ol-infrastructure

Infrastructure automation code for use by MIT Open Learning
BSD 3-Clause "New" or "Revised" License

Add Fastly logs as a catalog in Trino #1410

Open blarghmatey opened 1 year ago

blarghmatey commented 1 year ago

User Story

Description/Context

We have started collecting the Fastly logs as JSON files in S3. We would like to standardize the JSON schema of those logs and expose it as a table definition in our Trino infrastructure. This will allow us to use the user traffic data to enrich the data and reports that we build for products that use Fastly for front-end caching.

Acceptance Criteria

blarghmatey commented 1 year ago

We can do this in a fairly straightforward manner by piping the log data through Airbyte to expose it as a table. The main prerequisite is to thoroughly define the log schema so that it is consistent and includes all of the information that we would like to report on.
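
As a rough illustration of what "consistent" could mean here, the sketch below coerces each JSON log line to a fixed set of typed fields, so that every record ends up with the same columns before it is loaded as a table. The field names, types, and defaults in `LOG_SCHEMA` are hypothetical placeholders, not the actual Fastly log format, and `normalize_record` is an illustrative helper rather than anything that exists in this repository.

```python
import json
from typing import Any

# Hypothetical schema: field name -> (expected type, default when missing).
# These names are illustrative assumptions, not the real Fastly log fields.
LOG_SCHEMA: dict[str, tuple[type, Any]] = {
    "timestamp": (str, None),
    "client_ip": (str, None),
    "request_method": (str, None),
    "url": (str, None),
    "status": (int, None),
    "response_bytes": (int, 0),
    "cache_status": (str, "UNKNOWN"),
}


def normalize_record(raw_line: str) -> dict[str, Any]:
    """Parse one JSON log line and coerce it to the fixed schema.

    Unknown fields are dropped, missing fields take the schema default, and
    values are cast to the expected type so every record has the same columns.
    """
    record = json.loads(raw_line)
    normalized: dict[str, Any] = {}
    for field, (expected_type, default) in LOG_SCHEMA.items():
        value = record.get(field, default)
        if value is not None and not isinstance(value, expected_type):
            # Cast mismatched values, e.g. "200" -> 200; raises ValueError on bad input.
            value = expected_type(value)
        normalized[field] = value
    return normalized
```

Agreeing on the schema in one place like this would also give us a single definition to translate into the table's column types, whichever tool ends up doing the loading.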

Ardiea commented 8 months ago