vmware / open-vm-tools

Official repository of VMware open-vm-tools project
http://sourceforge.net/projects/open-vm-tools/

libtimeSync syncs time from hypervisor to vm, even when disabled #302

Open Gunni opened 5 years ago

Gunni commented 5 years ago

Hey,

About four weeks ago I was setting up a VM on a new hypervisor in a different country. Its clock kept going wrong: all my systems run on UTC+0, and it kept jumping to the host's local time (which was also wrong, since the host was unsynchronized, as I discovered later). Even ntpd was exiting on its panic threshold (a deliberate exit, not a crash).

After a while of debugging and watching the journal, we found that the time jumped whenever vmtoolsd was left running or restarted. So I verified that time syncing was indeed disabled in the system settings.

But ... apparently libtimeSync doesn't care. It syncs on start and then regularly after that, overriding the good work of ntpd and setting the clock wrong.

Oct 03 19:52:41 x vmtoolsd[17119]: [Oct 03 18:55:19.393] [   debug] [timeSync] Using BDOOR_CMD_GETTIMEFULL_WITH_LAG
Oct 03 19:52:41 x vmtoolsd[17119]: [Oct 03 18:55:19.393] [   debug] [timeSync] Using BDOOR_CMD_GETTIMEFULL_WITH_LAG
Oct 03 19:52:41 x vmtoolsd[17119]: [Oct 03 18:55:19.393] [   debug] [timeSync] One time synchronization: stepping time.

Obviously the hypervisor clock should be correct, but to me that's irrelevant: ntpd syncs the time, and every server I know of runs it. Why force-enable a feature THAT'S DISABLED IN THE UI?

Now I was onto something, and confirmed that it was affecting not only that VM but all VMs, heck, even the ntpd anycast VMs servicing the friggin hypervisors. In those cases the jumps were small, but noticeable when looking into the data.

My solution: Delete libtimeSync.so, confirm that the clock is unmolested, add to ansible role, deploy to all systems.

Oct 03 19:20:16 x vmtoolsd[19517]: [Oct 03 19:20:11.531] [   debug] [vmsvc] RpcChannel: Unknown Command 'Time_Synchronize': Handler not registered.

Result: Offsets are all zero, or as close as they are going to get.

Suggestion: Respect the sync time flag from the VM settings page (I have not found a SINGLE system with it checked!)

Alternate solution: Don't compare the guest's time against the hypervisor at all. Just notify ntpd that a possibly time-altering event (suspend/resume, snapshot, disk, etc.) has occurred, so that it can verify its time and jump to the correct time if required. For example, a signal to ntpd...

PaTHml commented 5 years ago

Can you provide version information for the products and guest OS involved?

Guest OS: Linux distro version and kernel version
open-vm-tools: vmware-toolbox-cmd -v or vmtoolsd -v
Hypervisor: vmware -v and uname -a
Is there a vCenter present? If so, versions for that too.

If you can specify "which UI" was used for the configuration (ESX, VI client, vCenter), that will help.

From the guest OS:

Get time sync status: vmware-toolbox-cmd timesync status

Disable time sync: vmware-toolbox-cmd timesync disable

(or "enable" to enable time sync)

Have a look at: https://kb.vmware.com/s/article/1189

If you can provide the relevant .vmx settings mentioned in the article, we can check those too.

Gunni commented 5 years ago

The feature causing me issues is time.synchronize.tools.startup which defaults to 1 in vCenter:

RpcIn: received 42 bytes, content:"Set_Option time.synchronize.tools.enable 1"
RpcIn: received 42 bytes, content:"Set_Option time.synchronize.guest.resync 0"
RpcIn: received 50 bytes, content:"Set_Option time.synchronize.guest.resync.timeout 0"
RpcIn: received 52 bytes, content:"Set_Option time.synchronize.tools.startup.backward 0"
RpcIn: received 43 bytes, content:"Set_Option time.synchronize.tools.startup 1"

It needs to be possible to configure these on the VM itself.
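
To see at a glance which of these options the host is pushing to the guest, the log lines can be parsed. This is a minimal sketch in Python, assuming only the `Set_Option <name> <value>` payload format visible in the log above. The function names `parse_set_options` and `enabled_time_sync_options` are made up for illustration and are not part of open-vm-tools.

```python
import re

# Matches the payload format seen in the RpcIn log lines above:
#   content:"Set_Option <option.name> <value>"
SET_OPTION_RE = re.compile(r'content:"Set_Option (\S+) (\S+)"')


def parse_set_options(log_text):
    """Return {option_name: value} for every Set_Option line in log_text.

    If an option appears more than once, the last value wins.
    """
    options = {}
    for line in log_text.splitlines():
        match = SET_OPTION_RE.search(line)
        if match:
            name, value = match.groups()
            options[name] = value
    return options


def enabled_time_sync_options(options):
    """Return sorted names of time.synchronize.* options whose value is not "0"."""
    return sorted(
        name
        for name, value in options.items()
        if name.startswith("time.synchronize.") and value != "0"
    )
```

Fed the five log lines above, this would report time.synchronize.tools.enable and time.synchronize.tools.startup as the options still switched on.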

Can I override these in tools.conf?

# cat /etc/os-release | grep -i pretty
PRETTY_NAME="CentOS Linux 7 (Core)"
# vmware-toolbox-cmd -v
10.2.5.3619 (build-8068406)
# vmtoolsd -v
VMware Tools daemon, version 10.2.5.3619 (build-8068406)
# vmware-toolbox-cmd timesync status
Disabled

stanguturi commented 5 years ago

> Can I override these in tools.conf?

There is no way to do this in tools.conf

> vmware-toolbox-cmd timesync status
> Disabled

Check the following URLs: (Copied from https://github.com/vmware/open-vm-tools/issues/277#issuecomment-510982557)

All the options mentioned in the above KB articles need to be set to FALSE to completely disable the time synchronization inside the guest VM. Please reopen if you encounter the issue even after ALL the settings are set to FALSE.
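
For anyone who does have access to the VM's .vmx file, a quick check that every relevant option is explicitly off might look like the sketch below. It assumes the usual `key = "value"` .vmx line format and treats `0` and `FALSE` as disabled. The function name `find_not_disabled` is hypothetical, and the option names to check are supplied by the caller; the authoritative list is the one in the KB article.

```python
def find_not_disabled(vmx_text, option_names):
    """Return {option: value or None} for options not explicitly disabled.

    A value of None means the option is absent from the file, in which
    case whatever default the platform uses applies.
    """
    settings = {}
    for line in vmx_text.splitlines():
        line = line.strip()
        # Skip blank lines, comment lines, and anything that is not key = value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')

    problems = {}
    for name in option_names:
        value = settings.get(name)
        # Assumption: "0" and "FALSE" (any case) both mean disabled.
        if value is None or value.lower() not in ("0", "false"):
            problems[name] = value
    return problems
```

An empty result means every listed option is explicitly disabled. Options that are missing from the file are reported too, because an unset option may default to enabled, as time.synchronize.tools.startup did in this thread.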

Gunni commented 5 years ago

So if I understand you right, if I can't change the vCenter settings, then it just sucks to be me?

And the client has no way to override that, except by DELETING a part of the program, or by not running it in the first place?

If that's what you think, here's what I'm doing, for use by future annoyed sysadmins like me.

- name: Install VMware tools
  yum:
    name: open-vm-tools
    state: present
  when:
  - ansible_virtualization_role is defined
  - ansible_virtualization_role == "guest"
  - ansible_virtualization_type == "VMware"
  tags:
  - vm

- name: Ensure VMware timeSync will never happen
  file:
    path: /usr/lib64/open-vm-tools/plugins/vmsvc/libtimeSync.so
    state: absent
  notify:
  - restart vmtoolsd
  when:
  - ansible_virtualization_role is defined
  - ansible_virtualization_role == "guest"
  - ansible_virtualization_type == "VMware"
  tags:
  - latest  # to ensure this task runs even when the package is updated
  - vm

- name: Start and enable VMware tools
  service:
    name: vmtoolsd
    state: started
    enabled: true
  when:
  - ansible_virtualization_role is defined
  - ansible_virtualization_role == "guest"
  - ansible_virtualization_type == "VMware"
  tags:
  - vm

stanguturi commented 5 years ago

Had a discussion with other team members about this issue. Logged an internal bug for tracking this issue. Reopening the issue.

Firefishy commented 4 years ago

@stanguturi Any progress?

Firefishy commented 4 years ago

Here is how to completely disable this from the Guest in chef: https://github.com/openstreetmap/chef/blob/ce7189b3fd44901074b45fe76e143ce548393a75/cookbooks/hardware/recipes/default.rb#L97-L108

stanguturi commented 4 years ago

> @stanguturi Any progress?

We are working internally with the respective team. We will update this issue when we have a specific update about the release.

sbueringer commented 4 years ago

@stanguturi Any updates?

stanguturi commented 4 years ago

> @stanguturi Any updates?

No update yet. Will keep you posted. Thanks.

Gunni commented 1 year ago

Over 4 years since I reported this issue, over 2 years since your last response.

How is it going @stanguturi?

Gunni commented 1 week ago

> Had a discussion with other team members about this issue. Logged an internal bug for tracking this issue. Reopening the issue.

Hey @stanguturi how is this issue going? Any update on the internal ticket? This issue is starting school soon, since it's reaching 6 years old...

jonathanvmw commented 6 days ago

@Gunni - Thanks for the reminder. We pinged the responsible development team and it looks like this will be getting some attention. Will post here when there is further information.