Bug: Excessive bandwidth usage on viewing graphs for non-active instances

Neither the libvirt plugin nor vms-collectd-plugins collects (or can collect) data on non-active instances. The current behaviour is to treat the data at these times as null and ignore those values, so only the last valid data from when the instance was active is displayed.

The problem arises when instances have been shut down for a long time. Since the canary REST API only supports querying for metrics with a "from_time", large numbers of entries such as [1377195290, "AVERAGE", null] are returned, on the order of a few MB per day the VM was shut down. Requesting this per metric, per update interval (usually seconds), causes large amounts of network traffic and, with only a few graphs displayed, can easily reach gigabytes per hour. This bandwidth usage is excessive and slows down the page dramatically.
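As a rough back-of-envelope check of the claim above, the hourly traffic can be estimated from the payload size and the refresh rate. The figures used here (2 MB per day of downtime, 7 days down, 4 graphs, a 10-second refresh) are assumptions for illustration, not measurements, and `hourly_traffic_bytes` is a hypothetical helper.

```python
def hourly_traffic_bytes(bytes_per_day_down, days_down, metrics, refresh_seconds):
    """Estimate bytes transferred per hour for one page of graphs."""
    # One response for one metric grows with how long the VM has been down.
    payload = bytes_per_day_down * days_down
    # Every metric is re-requested once per refresh interval.
    requests_per_hour = 3600 / refresh_seconds * metrics
    return payload * requests_per_hour


# Assumed figures: 2 MB per day down, 7 days down, 4 graphs, 10 s refresh.
estimate = hourly_traffic_bytes(2 * 10**6, 7, 4, 10)
print("%.2f GB per hour" % (estimate / 10**9))
```

Under these assumptions the estimate lands in the tens of gigabytes per hour, which is consistent with the "gigabytes per hour" observation.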

Proposed solutions:

1) Show only active instances (some backend work or extra nova queries will be needed, since canary list queries return only host_name and instance_id)
2) Treat null as 0, so VMs that are shut down show flat 0 graphs for every metric (this would require minimal work)
3) Add an "end_time" parameter to the REST API.
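A minimal sketch of what options 2 and 3 would mean for the returned data, assuming each entry has the shape [timestamp, consolidation function, value] as in the example entry earlier in this report. The helper names `null_to_zero` and `clip` are hypothetical and are not part of canary; the sample values are made up.

```python
def null_to_zero(points):
    """Option 2: replace null (None) values with 0 so graphs go flat."""
    return [[ts, cf, 0 if value is None else value] for ts, cf, value in points]


def clip(points, from_time, end_time=None):
    """Option 3: keep only entries between from_time and end_time (inclusive).

    With end_time=None this behaves like a from_time-only query.
    """
    return [
        p for p in points
        if p[0] >= from_time and (end_time is None or p[0] <= end_time)
    ]


# Made-up sample: two valid samples, then the instance is shut down.
points = [
    [1377195230, "AVERAGE", 0.5],
    [1377195260, "AVERAGE", 0.7],
    [1377195290, "AVERAGE", None],
    [1377195320, "AVERAGE", None],
]

flat = null_to_zero(points)                       # nulls become 0
window = clip(points, 1377195230, 1377195260)     # trailing nulls excluded
```

Note that option 2 alone does not reduce the payload size; only a bounded window (option 3) would avoid transferring the trailing run of nulls.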

EDIT:

This issue also exists when viewing metrics that no longer get updated (e.g. networks or disks that have been deleted but still have metrics from when they existed). Unlike shut-down instances, which may be restarted, these devices may not come back, so displaying 0 may be misleading. Thus the options boil down to:

A) Show the last available statistics (add an end_time parameter to support this)
B) Strictly show data for the last n minutes (if all recent data is null, display "No data"; unsure how the graph would look if data stops in the middle)
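The difference between options A and B can be sketched as follows, again assuming entries of the form [timestamp, consolidation function, value]. Both functions are hypothetical illustrations of the selection logic only, and the sample data is made up.

```python
def last_available(points, n_seconds):
    """Option A: window of n_seconds ending at the newest non-null sample."""
    valid = [ts for ts, _cf, value in points if value is not None]
    if not valid:
        return []
    end_time = max(valid)
    return [p for p in points if end_time - n_seconds <= p[0] <= end_time]


def recent_or_no_data(points, now, n_seconds):
    """Option B: strictly the last n_seconds, or "No data" if all are null."""
    recent = [p for p in points if now - n_seconds <= p[0] <= now]
    if all(p[2] is None for p in recent):
        return "No data"
    return recent


# Made-up sample: a disk that stopped reporting after the second sample.
samples = [
    [1377195230, "AVERAGE", 0.5],
    [1377195260, "AVERAGE", 0.7],
    [1377195290, "AVERAGE", None],
    [1377195320, "AVERAGE", None],
]

option_a = last_available(samples, 60)                  # the two valid samples
option_b = recent_or_no_data(samples, 1377195320, 30)   # "No data"
```

If the window in option B straddles the point where data stops, `recent_or_no_data` returns a mix of values and nulls, which is the "data stops in the middle" case that remains open.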
