carribeiro / vdeli

Video Delivery Network

Implement a Logfile model to store information about daily CDNServer's logfiles #51

Open carribeiro opened 13 years ago

carribeiro commented 13 years ago

We decided to add a new model, called Logfile, to manage the local copies of the cdnserver's logfiles.

Logfiles in each cdnserver are stored and rotated daily. The cdnmanager will retrieve and filter them in order to generate traffic & utilization reports. It will retrieve each logfile at 00:15 (server localtime), and keep track of the processing in the Logfile entry.

We have to deal with timezone issues. The logfile itself will always use UTC, but each server will work with its own timezone. There are two main reasons for keeping the servers on local time: first, to avoid retrieving all log files at the same time; and second, to always retrieve logfiles at a moment of low traffic.
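To make the timezone point concrete, here is a minimal sketch of when a server's 00:15 local retrieval falls in UTC. It assumes the server's timezone can be reduced to a whole-hour offset from UTC; the function name `retrieval_time_utc` and its parameters are hypothetical and only for illustration.

```python
from datetime import date, datetime, time, timedelta, timezone


def retrieval_time_utc(local_day, utc_offset_hours):
    """Return the UTC instant of 00:15 server-local time on local_day.

    utc_offset_hours is an assumed whole-hour offset from UTC
    (for example -3 or +9); daylight saving time is ignored.
    """
    server_tz = timezone(timedelta(hours=utc_offset_hours))
    local_moment = datetime.combine(local_day, time(0, 15), tzinfo=server_tz)
    return local_moment.astimezone(timezone.utc)


# Two servers in different offsets are retrieved at different UTC hours.
print(retrieval_time_utc(date(2012, 5, 10), -3))
print(retrieval_time_utc(date(2012, 5, 10), 9))
```

Because the UTC instant differs per offset, retrievals are spread across the day instead of all landing at the same moment.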

carribeiro commented 13 years ago

Implementation plan:

  1. CDN servers are configured to run in their local timezone. I prefer it this way because it makes more sense when you log in to the server (you see the time of the place where it is running), and because we can schedule the logrotate cron job for a better hour. However, the logfile entries must always be written with UTC-based timestamps.
    • To understand why the server has to run localtime, imagine that you have a lot of servers - over one hundred, for example. If all servers are configured to run UTC, then all the logfiles will be rotated (and copied) at the same time.
    • For some servers, 00:00 UTC will be a good time for rotation. But for others it may be a very busy hour.
    • Having the servers run localtime makes it a little bit easier to manage and understand them, and sync their clocks to something that makes more sense for the location where they are installed.
  2. In the CDN Manager, each server will be assigned a timezone, which is an integer from -12 to +12. This is not the actual timezone from the OS (a much more complex setting that includes daylight saving time), but just a simple offset from UTC.
  3. The CDN Manager will have a periodic task that will be run every hour, at 00:15, 01:15, 02:15, etc. At each hour the task will do the following:
    • Create new logfile entries for the files that have to be queued for copy in that hour.
    • Select logfile entries that have not been copied yet, and one by one, try to copy them.
    • We can do this sequentially, or (better) schedule more tasks to queue the file copies so they can run in parallel (if we have more workers).
    • If the copy fails, try to copy the same file again in the next hour. Try at most 3 times (we have to add a counter to the logfile model).
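The hourly task in step 3 could be sketched roughly as follows. This is plain Python rather than the real model, and everything named here is an assumption for illustration: the `Logfile` dataclass and its fields (`attempts`, `copied`, `log_date`), the `offsets` mapping of server name to UTC offset, and the `copy` callable that raises `OSError` on failure. It also assumes that the file rotated at local midnight covers the previous local day, and it runs the copies sequentially.

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta, timezone

MAX_ATTEMPTS = 3  # from the plan: try at most 3 times


@dataclass
class Logfile:
    # Hypothetical stand-in for the Logfile model discussed in this issue.
    server: str
    log_date: date
    attempts: int = 0
    copied: bool = False


def servers_due(now_utc, offsets):
    """Names of servers whose local clock is in hour 00 at now_utc."""
    return [name for name, off in offsets.items()
            if (now_utc.hour + off) % 24 == 0]


def run_hourly(now_utc, offsets, queue, copy):
    """One run of the periodic task (expected at HH:15 UTC)."""
    # Create entries for the files that have to be queued in this hour.
    for name in servers_due(now_utc, offsets):
        local_now = now_utc + timedelta(hours=offsets[name])
        # Assumption: the rotated file holds the previous local day.
        queue.append(Logfile(name, local_now.date() - timedelta(days=1)))
    # Try every entry that is not copied yet and still has attempts left.
    for entry in queue:
        if entry.copied or entry.attempts >= MAX_ATTEMPTS:
            continue
        entry.attempts += 1
        try:
            copy(entry)
            entry.copied = True
        except OSError:
            pass  # left in the queue; retried on the next hourly run


def always_fails(entry):
    raise OSError("server unreachable")


offsets = {"sao-paulo": -3, "tokyo": 9}
queue = []
start = datetime(2012, 5, 10, 3, 15, tzinfo=timezone.utc)
for hour in range(4):  # 03:15, 04:15, 05:15, 06:15 UTC
    run_hourly(start + timedelta(hours=hour), offsets, queue, always_fails)
print(queue)
```

A real task would also need to guard against creating the same entry twice if a run is repeated within the same hour, and the parallel variant would hand each pending entry to a worker instead of looping over the queue.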