Open ensean opened 2 years ago
Hi @ensean, from the docs:
CloudFront saves them in a log file for which the file name includes the date and time of the period in which the requests occurred, not the date and time when the file was delivered.
If this differs from what you have observed, can you please follow up with an AWS support request?
The partitioning pattern can also be applied to other use cases. Depending on the case, the partition values can differ from some or all of the timestamp columns in the data being queried, in which case you need to adjust your queries to look at more partitions. I've explained it in this video.
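For the "look at more partitions" case, here is a minimal sketch in Python of how a query window could be widened. The helper names `partitions_for_window` and `where_clause` and the one-hour default margin are assumptions for illustration only; they are not part of this project or of any AWS API.

```python
from datetime import datetime, timedelta


def partitions_for_window(start, end, margin_hours=1):
    """Return (year, month, day, hour) partition values covering [start, end],
    widened by margin_hours on each side (the margin is an assumption)."""
    # Truncate both ends to the hour, then widen the window by the margin.
    first = start.replace(minute=0, second=0, microsecond=0) - timedelta(hours=margin_hours)
    last = end.replace(minute=0, second=0, microsecond=0) + timedelta(hours=margin_hours)

    parts = []
    current = first
    while current <= last:
        parts.append((f"{current:%Y}", f"{current:%m}", f"{current:%d}", f"{current:%H}"))
        current += timedelta(hours=1)
    return parts


def where_clause(parts):
    """Build a partition predicate in the year/month/day/hour style used in this thread."""
    return " OR ".join(
        f"(year='{y}' AND month='{m}' AND day='{d}' AND hour='{h}')"
        for y, m, d, h in parts
    )


# Hypothetical usage: requests between 08:00:00 and 08:59:59 on 2022-01-20.
window = partitions_for_window(datetime(2022, 1, 20, 8, 0, 0), datetime(2022, 1, 20, 8, 59, 59))
print(where_clause(window))
```

The partition predicate only limits which files are scanned. The query would still need a filter on the timestamp columns of the log rows themselves to return exactly the requested period.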
Since the access log files are delivered to S3 asynchronously, a log file named `E271AZ5HG504X.2022-01-20-07.2bd0b06.gz` may contain access log entries starting from `2022-01-20 08:00:00`. If the log file is moved to the partition `year=2022/month=01/day=20/hour=07`, is it possible that a SQL query with the where clause `where year='2022' and month='01' and day='20' and hour='08'` would miss this part of the data?
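
To make the mapping in the question concrete, here is a rough sketch in Python of how a partition prefix could be derived from a log file name. It assumes the name has the form `<distribution-id>.YYYY-MM-DD-HH.<unique-id>.gz`, as in the example above. The function `partition_prefix` and the regular expression are illustrative assumptions, not the code this project actually uses.

```python
import re

# Assumed file name layout: <distribution-id>.YYYY-MM-DD-HH.<unique-id>.gz
LOG_NAME = re.compile(
    r"^(?P<dist>[^.]+)\."
    r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})-(?P<hour>\d{2})\."
    r"(?P<uid>[^.]+)\.gz$"
)


def partition_prefix(file_name):
    """Map a log file name to a year=/month=/day=/hour= prefix.

    The partition is taken from the date and hour in the file name only,
    not from the timestamps of the entries inside the file.
    """
    match = LOG_NAME.match(file_name)
    if match is None:
        raise ValueError(f"unexpected log file name: {file_name}")
    return (
        f"year={match['year']}/month={match['month']}/"
        f"day={match['day']}/hour={match['hour']}"
    )


print(partition_prefix("E271AZ5HG504X.2022-01-20-07.2bd0b06.gz"))
```

Because the prefix depends only on the file name, any entry in that file whose timestamp falls in a different hour would end up in the `hour=07` partition, which is the situation the question is asking about.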