The previously proposed solution was applied, although it still wasn't effective enough in terms of total download/processing time.
Current log file download times can exceed 5 minutes and processing time can go up to 10 minutes.
Each ticker iteration currently defaults to 30 minutes, but unfortunately the ticker for the next iteration doesn't start until the current processing is completed, so in this case a full cycle is (30 + 5 + 10) = 45 minutes.
Keep accumulating that extra ~15 minutes per cycle over the course of a day and within 24 hours you can easily end up with a 4 to 5 hour delay in processing, instead of a more reasonable 30 minute lag. These delays will only grow as traffic increases and the log files grow in size.
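For illustration, here's a stripped-down Go sketch of the blocking loop described above. The helper names and durations are made-up placeholders (scaled down so it runs quickly), not the actual beat code:

```go
package main

import (
	"fmt"
	"time"
)

// downloadLogs and processLogs are hypothetical stand-ins for the real
// download and processing steps; the sleeps mirror the ~5 and ~10 minute
// figures above, scaled down to seconds.
func downloadLogs() []string {
	time.Sleep(5 * time.Second)
	return []string{"access.log.gz"}
}

func processLogs(files []string) {
	time.Sleep(10 * time.Second)
	fmt.Println("processed:", files)
}

func main() {
	period := 30 * time.Second // stands in for the 30 minute ticker period

	for {
		time.Sleep(period)
		// Download and processing run inline, so the next wait doesn't
		// start until they complete: each full cycle takes
		// period + download time + processing time (30 + 5 + 10 minutes
		// with the real numbers above), and the slip accumulates.
		processLogs(downloadLogs())
	}
}
```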
Once the period has passed for the beat's ticker, two functions would be called asynchronously to perform the following (sketched below):

- The first downloads the log files and pushes them onto the `log_files_ready` channel once completed.
- The second reads from the `log_files_ready` channel as they are ready and publishes the resulting events with `PublishEvent`.
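As a rough sketch of that split: the helper functions, the `Publisher` interface and the file names below are placeholders I've made up for illustration; only the `log_files_ready` channel and the `PublishEvent` call come from the proposal above. The two goroutines could look something like this:

```go
package main

import (
	"log"
	"sync"
)

// Event and Publisher are simplified stand-ins so the sketch is
// self-contained; in the real beat, events would go through libbeat's
// client and its PublishEvent call.
type Event map[string]interface{}

type Publisher interface {
	PublishEvent(e Event) bool
}

// listRemoteLogs, downloadFile and eventsFromFile are hypothetical helpers,
// not part of the actual beat.
func listRemoteLogs() []string           { return []string{"els-00.log.gz", "els-01.log.gz"} }
func downloadFile(name string) string    { return "/tmp/" + name } // fetch and save locally
func eventsFromFile(path string) []Event { return []Event{{"source": path}} }

func runIteration(pub Publisher) {
	logFilesReady := make(chan string)
	var wg sync.WaitGroup

	// Downloader: fetches each remote file and pushes its local path onto
	// the log_files_ready channel as soon as that download has completed,
	// then closes the channel when all downloads are done.
	wg.Add(1)
	go func() {
		defer wg.Done()
		defer close(logFilesReady)
		for _, name := range listRemoteLogs() {
			logFilesReady <- downloadFile(name)
		}
	}()

	// Processor: reads paths from log_files_ready as they become ready and
	// publishes the resulting events, instead of waiting for every
	// download to finish first.
	wg.Add(1)
	go func() {
		defer wg.Done()
		for path := range logFilesReady {
			for _, ev := range eventsFromFile(path) {
				if !pub.PublishEvent(ev) {
					log.Printf("failed to publish event from %s", path)
				}
			}
		}
	}()

	wg.Wait()
}

// stdoutPublisher is a toy publisher used only to make the sketch runnable.
type stdoutPublisher struct{}

func (stdoutPublisher) PublishEvent(e Event) bool {
	log.Printf("publish: %v", e)
	return true
}

func main() { runIteration(stdoutPublisher{}) }
```

The point of the channel is that publishing can start as soon as the first file lands on disk, rather than after the entire download phase.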
This may not be the absolute best solution, but it should be more effective than the current one. If further optimizations are needed later, I'll deal with them then.
Although the ELS API currently does allow a `count` of items to be specified along with a timestamp start and end range, it does not return any header indicating how many log items in total fall within the given time range. Because of this, a large number of logs may be downloaded, which can become quite heavy for in-memory processing. As a solution, the logs should initially be saved to a gzip file and then read back from this file in smaller chunks.
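A minimal sketch of that save-then-stream approach using the standard `compress/gzip` and `bufio` packages; the file path, chunk size and `handle` callback are illustrative assumptions, not the actual beat code:

```go
package main

import (
	"bufio"
	"compress/gzip"
	"io"
	"log"
	"os"
	"strings"
)

// saveLogsToGzip streams the API response body straight into a gzip file on
// disk, so the full log set never has to sit in memory at once.
func saveLogsToGzip(body io.Reader, path string) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	gz := gzip.NewWriter(f)
	defer gz.Close()

	_, err = io.Copy(gz, body)
	return err
}

// processInChunks re-reads the gzip file line by line and hands the lines to
// handle() in chunks of chunkSize, so only a small slice of the file is held
// in memory at any one time.
func processInChunks(path string, chunkSize int, handle func([]string) error) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	gz, err := gzip.NewReader(f)
	if err != nil {
		return err
	}
	defer gz.Close()

	chunk := make([]string, 0, chunkSize)
	scanner := bufio.NewScanner(gz)
	for scanner.Scan() {
		chunk = append(chunk, scanner.Text())
		if len(chunk) == chunkSize {
			if err := handle(chunk); err != nil {
				return err
			}
			chunk = chunk[:0]
		}
	}
	if len(chunk) > 0 {
		if err := handle(chunk); err != nil {
			return err
		}
	}
	return scanner.Err()
}

func main() {
	// Toy end-to-end usage: "download" a few fake log lines, save them to a
	// gzip file, then re-read them two lines at a time.
	fakeBody := strings.NewReader("line one\nline two\nline three\n")
	path := "/tmp/els-logs.gz" // placeholder path

	if err := saveLogsToGzip(fakeBody, path); err != nil {
		log.Fatal(err)
	}
	err := processInChunks(path, 2, func(lines []string) error {
		log.Printf("handling %d log lines", len(lines))
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
}
```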