Closed kikuomax closed 2 years ago
I think we need specific goals for the access analysis.
The columns of the fact table for access logs:

- `datetime`: TIMESTAMP (source: `date` + `time`)
- `seq_num`: INT
- `edge_location`: INT → `edge_location` dimension table (source: `x-edge-location`)
- `sc_bytes`: BIGINT (source: `sc-bytes`)
- `cs_method`: VARCHAR (source: `cs-method`)
- `page`: INT → `page` dimension table (source: `cs-uri-stem`)
- `status`: SMALLINT (source: `sc-status`)
- `referer`: BIGINT, DISTKEY → `referer` dimension table (source: `cs(Referer)`)
- `user_agent`: BIGINT → `user_agent` dimension table (source: `cs(User-Agent)`)
- `cs_protocol`: VARCHAR (source: `cs-protocol`)
- `cs_bytes`: BIGINT (source: `cs-bytes`)
- `time_taken`: FLOAT4 (source: `time-taken`)
- `edge_response_result_type`: INT → `result_type` dimension table (source: `x-edge-response-result-type`)
- `time_to_first_byte`: FLOAT4 (source: `time-to-first-byte`)
- SORTKEY: `datetime`, `seq_num`
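
To make the field-to-column mapping concrete, here is a minimal Python sketch that turns the text of a CloudFront standard log file into rows shaped like the fact table. It is only an illustration under a few assumptions: the function name `parse_cloudfront_log` is hypothetical, `seq_num` is assumed to be a 1-based counter over the data lines of one file (the issue does not define it), `-` is treated as a missing value, and the dimension columns still hold the raw strings rather than IDs.

```python
from datetime import datetime

# CloudFront log field -> fact-table column, following the list above.
FIELD_TO_COLUMN = {
    "x-edge-location": "edge_location",
    "sc-bytes": "sc_bytes",
    "cs-method": "cs_method",
    "cs-uri-stem": "page",
    "sc-status": "status",
    "cs(Referer)": "referer",
    "cs(User-Agent)": "user_agent",
    "cs-protocol": "cs_protocol",
    "cs-bytes": "cs_bytes",
    "time-taken": "time_taken",
    "x-edge-response-result-type": "edge_response_result_type",
    "time-to-first-byte": "time_to_first_byte",
}
INT_COLUMNS = {"sc_bytes", "status", "cs_bytes"}
FLOAT_COLUMNS = {"time_taken", "time_to_first_byte"}


def _convert(column, raw):
    """Convert a raw log value to the Python type matching the column."""
    if raw == "-":  # assumption: "-" marks a missing value
        return None
    if column in INT_COLUMNS:
        return int(raw)
    if column in FLOAT_COLUMNS:
        return float(raw)
    return raw


def parse_cloudfront_log(text):
    """Parse the text of one log file into a list of fact-row dicts."""
    fields = None
    rows = []
    for line in text.splitlines():
        if line.startswith("#Fields:"):
            # The header names the tab-separated fields in order.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#"):
            continue
        if fields is None:
            raise ValueError("data line found before the #Fields header")
        values = dict(zip(fields, line.split("\t")))
        row = {
            # datetime = date + time
            "datetime": datetime.strptime(
                values["date"] + " " + values["time"], "%Y-%m-%d %H:%M:%S"
            ),
            # assumption: position of the data line within this file
            "seq_num": len(rows) + 1,
        }
        for field, column in FIELD_TO_COLUMN.items():
            row[column] = _convert(column, values[field])
        rows.append(row)
    return rows
```

Reading the field order from the `#Fields:` header, rather than hard-coding positions, keeps the sketch working if the log contains more fields than the ones used here.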
The columns of the `edge_location` dimension table:

- `id`: INT
- `code`: VARCHAR, SORTKEY, UNIQUE (source: `x-edge-location`)
The columns of the `page` dimension table:

- `id`: INT
- `path`: VARCHAR(2048), SORTKEY, UNIQUE (source: `cs-uri-stem`)
The columns of the `referer` dimension table:

- `id`: BIGINT
- `url`: VARCHAR(2048), SORTKEY, UNIQUE (source: `cs(Referer)`)
The columns of the `user_agent` dimension table:

- `id`: BIGINT
- `user_agent`: VARCHAR(2048), SORTKEY, UNIQUE (source: `cs(User-Agent)`)
The columns of the `result_type` dimension table:

- `id`: INT
- `result_type`: VARCHAR, SORTKEY, UNIQUE (source: `x-edge-response-result-type`)
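
As a rough illustration of how the fact table would reference these dimension tables, the following Python sketch assigns a surrogate `id` to each distinct value and replaces the raw string in a fact row with that ID. Everything here is an assumption for illustration: the `Dimension` class and `to_fact_row` function are hypothetical in-memory stand-ins, and IDs are simply assigned sequentially from 1; the actual loading into Redshift may work quite differently.

```python
class Dimension:
    """In-memory stand-in for one dimension table (hypothetical helper)."""

    def __init__(self):
        self._ids = {}  # unique value -> surrogate id

    def id_for(self, value):
        """Return the id already assigned to value, or assign the next one."""
        if value not in self._ids:
            self._ids[value] = len(self._ids) + 1
        return self._ids[value]

    def rows(self):
        """Return (id, value) pairs in id order, as they would be loaded."""
        return sorted((i, v) for v, i in self._ids.items())


# Fact-table column -> dimension table it references.
DIMENSION_COLUMNS = {
    "edge_location": "edge_location",
    "page": "page",
    "referer": "referer",
    "user_agent": "user_agent",
    "edge_response_result_type": "result_type",
}


def to_fact_row(raw_row, dimensions):
    """Copy raw_row, replacing each dimension value with its surrogate id."""
    fact = dict(raw_row)  # leave the caller's row untouched
    for column, table in DIMENSION_COLUMNS.items():
        fact[column] = dimensions[table].id_for(raw_row[column])
    return fact
```

Because each dimension value is UNIQUE, a repeated page path or user agent maps to the same ID, so the long VARCHAR(2048) strings are stored once and the fact table carries only integers.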
I have decided to dedicate this issue to developing the basic infrastructure. I will develop the analysis tools on top of that infrastructure in a separate issue.
I would like to analyze CloudFront logs to understand the traffic to my site. How about trying Amazon Redshift Serverless?