Closed siso closed 9 years ago
Upside:
The instance-id file is populated by cloud-init when the server is built so it would be an extra source of truth before we fallback to xenstore-read.
Downside:
Cloud-Init doesn't seem to be used in our centos 5.11 or ubuntu 10.04 images
Also thinking about the cache, i'm wondering how reliable it is to check the system uptime. If we somehow write an uptime value of 1 (or another low value) to the cache (perhaps we had a really quick boot one time), we will never re-read from a source of truth.
Perhaps we can use /proc/sys/kernel/random/boot_id and check that against a cached value to see if we've rebooted
@Teddy-Schmitz
instance-id is not a source of truth for the actual instance ID as reported by xenstore-read. Instance-id as created by cloud-init can be used as a unique ID for a new server; however as suggested cloud-init does not run on all platforms.
I'm not a huge fan of the current method to determine if the cache file should be used
if int(cache_content[0]) < int(server_uptime):
This will use the cache file in a scenario for example where the cache file was written 20 seconds after boot, the server is then images and takes 30 seconds to boot the new instance. The file will be invalid.
it should be something like this
boot_timestamp = get_last_boot_time()
cache_mod_timestamp = get_cache_file_last_modified()
if boot_timestamp > cache_mod_timestamp:
# read xen store
else:
# read cache
Jon introduced uuid package in latest code update and after reviewing its documentation I think we can do more with it. We retired dmidecode and started using /proc/uptime and now we are thinking of instance-id; this could be a bottleneck in future if we want rax-autoscaler to support other OS in the future.
UUID package generates a unique id identified by its MAC address of any active network interface which we can use as cache filename to store returned value generated by xenstore-read. This way we don’t have to rely on hard coded path '/proc/uptime' or '/var/lib/cloud/data/instance-id' and number of xenstore-read call can be reduced.
@jon-walton @sura8257 yep looks good, I'm very happy with that :)
alright, so that would look like
if unique uuid file exists:
read cache
else:
read xenstore
write cache
quick testing locally has always returned the same MAC address when multiple adaptors are installed but the documentation does leave it open to return a random one if it can't get a mac address and it also doesn't state which mac address is used.
The worst case scenario would be that we read from xenstore each time because we get a random mac address back from uuid.getnode()
Maybe I'm overthinking this but I would prefer not to have the possibility of creating a new cache file every time (and depleting the xenstore read limit). Another unique identifier for *nix systems is /var/lib/dbus/machine-id
, this is available to any system running dbus and will not change if the hardware changes (http://0pointer.de/blog/projects/ids.html). If we're going to support multiple operating systems (I'm assuming you mean Windows?), we'll need branches in code anyway as xenstore-read isn't available and we'll need to use WMI http://support.citrix.com/article/CTX136422
Unfortunately depleting xenstore requests is not an option. Many Openstack functions depend on xenstore being functional, eg. Password resets, network stack changes, etc
@tuxeh When I tested like you, I also got static MAC address but documentation does mention that it could use any active interface. The worse it could get is we will have cache file for each active interface and xenstore-name will end up doing that many times of query. I think it's far better than what it (rax-autoscaler) was initially doing and relying on hard coded static path. Code will also look bit simpler than what we have now.
Good point, we need to start thinking of portability.
uptime is easy to get and portable on other platforms, the worst case scenario is having to use subprocess.
I think UUID approach is more robust though. Why don't we simply use metadata? If a server does not have UUID attribute, it generates a new UUID, sets its key/value attribute, and look for duplicates in the scaling group?
Hey @tuxeh, are you a RAX developer? ;)
I guess we could always use a file in /tmp or like /dev/shm that way it will be regenerated every boot since the cache file will go away :)
If xenstore-read turns out to be unreliable and if we are still not happy with uptime, instance_id or uuid package, another alternative could be using output of "last reboot | head -1" to cache uuid.
Fixed in a9e3f5a25ffd5edf0faad5aca33eb76944795eb1
@jon-walton @tedd "were just digging around cloud-init and found this /var/lib/cloud/data/instance-id"
Is there any enhancement with this approach?