boxidau / rax-autoscaler

Apache License 2.0
5 stars 2 forks source link

use instance ID from '/var/lib/cloud/data/instance-id' #90

Closed siso closed 9 years ago

siso commented 9 years ago

@jon-walton @tedd "were just digging around cloud-init and found this /var/lib/cloud/data/instance-id"

Is there any enhancement with this approach?

jon-walton commented 9 years ago

Upside:

The instance-id file is populated by cloud-init when the server is built so it would be an extra source of truth before we fallback to xenstore-read.

Downside:

Cloud-Init doesn't seem to be used in our centos 5.11 or ubuntu 10.04 images

Also thinking about the cache, i'm wondering how reliable it is to check the system uptime. If we somehow write an uptime value of 1 (or another low value) to the cache (perhaps we had a really quick boot one time), we will never re-read from a source of truth.

Perhaps we can use /proc/sys/kernel/random/boot_id and check that against a cached value to see if we've rebooted

@Teddy-Schmitz

boxidau commented 9 years ago

instance-id is not a source of truth for the actual instance ID as reported by xenstore-read. Instance-id as created by cloud-init can be used as a unique ID for a new server; however as suggested cloud-init does not run on all platforms.

I'm not a huge fan of the current method to determine if the cache file should be used

 if int(cache_content[0]) < int(server_uptime):

This will use the cache file in a scenario for example where the cache file was written 20 seconds after boot, the server is then images and takes 30 seconds to boot the new instance. The file will be invalid.

it should be something like this

boot_timestamp = get_last_boot_time()
cache_mod_timestamp = get_cache_file_last_modified()

if boot_timestamp > cache_mod_timestamp:
  # read xen store
else:
  # read cache
sthapa-ping commented 9 years ago

Jon introduced uuid package in latest code update and after reviewing its documentation I think we can do more with it. We retired dmidecode and started using /proc/uptime and now we are thinking of instance-id; this could be a bottleneck in future if we want rax-autoscaler to support other OS in the future.

UUID package generates a unique id identified by its MAC address of any active network interface which we can use as cache filename to store returned value generated by xenstore-read. This way we don’t have to rely on hard coded path '/proc/uptime' or '/var/lib/cloud/data/instance-id' and number of xenstore-read call can be reduced.

boxidau commented 9 years ago

@jon-walton @sura8257 yep looks good, I'm very happy with that :)

tuxeh commented 9 years ago

alright, so that would look like

if unique uuid file exists:
    read cache
else:
    read xenstore
    write cache

quick testing locally has always returned the same MAC address when multiple adaptors are installed but the documentation does leave it open to return a random one if it can't get a mac address and it also doesn't state which mac address is used.

The worst case scenario would be that we read from xenstore each time because we get a random mac address back from uuid.getnode()

Maybe I'm overthinking this but I would prefer not to have the possibility of creating a new cache file every time (and depleting the xenstore read limit). Another unique identifier for *nix systems is /var/lib/dbus/machine-id , this is available to any system running dbus and will not change if the hardware changes (http://0pointer.de/blog/projects/ids.html). If we're going to support multiple operating systems (I'm assuming you mean Windows?), we'll need branches in code anyway as xenstore-read isn't available and we'll need to use WMI http://support.citrix.com/article/CTX136422

boxidau commented 9 years ago

Unfortunately depleting xenstore requests is not an option. Many Openstack functions depend on xenstore being functional, eg. Password resets, network stack changes, etc

sthapa-ping commented 9 years ago

@tuxeh When I tested like you, I also got static MAC address but documentation does mention that it could use any active interface. The worse it could get is we will have cache file for each active interface and xenstore-name will end up doing that many times of query. I think it's far better than what it (rax-autoscaler) was initially doing and relying on hard coded static path. Code will also look bit simpler than what we have now.

siso commented 9 years ago

Good point, we need to start thinking of portability.

uptime is easy to get and portable on other platforms, the worst case scenario is having to use subprocess.

I think UUID approach is more robust though. Why don't we simply use metadata? If a server does not have UUID attribute, it generates a new UUID, sets its key/value attribute, and look for duplicates in the scaling group?

siso commented 9 years ago

Hey @tuxeh, are you a RAX developer? ;)

boxidau commented 9 years ago

I guess we could always use a file in /tmp or like /dev/shm that way it will be regenerated every boot since the cache file will go away :)

ghost commented 9 years ago

If xenstore-read turns out to be unreliable and if we are still not happy with uptime, instance_id or uuid package, another alternative could be using output of "last reboot | head -1" to cache uuid.

jon-walton commented 9 years ago

Fixed in a9e3f5a25ffd5edf0faad5aca33eb76944795eb1