vmware / govmomi

Go library for the VMware vSphere API
Apache License 2.0

[BUG] ReadNextEvents method gets stuck #3450

Open tender-barbarian opened 1 month ago

tender-barbarian commented 1 month ago

Describe the bug

ReadNextEvents method of EventHistoryCollector managed object gets stuck.

It looks like this (simplification):

Why the gap? I assume it is because the API user does not have access to all inventory objects, so events coming from inventory objects the user cannot read are not returned.

To Reproduce

This might be a bit hard to reproduce: I'm not sure it can be simulated using vcsim, since there is no user authorisation there.

Steps to reproduce the behavior:

  1. Create a user without access to the full inventory
  2. Log in to the API
  3. Get a collector via CreateCollectorForEvents
  4. Attempt to read events via the ReadNextEvents method, with the filter set to start some time in the past so there are enough events to pull
  5. Observe the behaviour: ReadNextEvents returns a varying number of events. For example, given a query size of 100 events, it returns fewer, or gets completely stuck if the gap in events is bigger than the provided query size.

Expected behavior

The collector should only load events for which the given user has read rights. Currently it looks like it loads everything and then determines which events can be returned based on the user's access rights.
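To make the suspected mechanism concrete, here is a minimal, self-contained Go sketch. It does not call govmomi or vCenter; `fakeCollector`, `readNext`, `drain` and `buildEvents` are hypothetical names, and the filtering logic is only a model of the assumption described above (the server advances over `maxCount` stored events first and drops unreadable ones afterwards). It also assumes a client that stops at the first empty page.

```go
package main

import "fmt"

// event is a stand-in for a vSphere event. Visible marks whether the
// hypothetical limited user may read it.
type event struct {
	ID      int
	Visible bool
}

// fakeCollector models the ASSUMED behaviour from this report: every read
// advances the cursor over maxCount stored events and only then filters out
// the ones the user may not see. This is not documented vCenter behaviour.
type fakeCollector struct {
	events []event
	pos    int
}

// readNext returns the visible events among the next maxCount stored events.
func (c *fakeCollector) readNext(maxCount int) []event {
	end := c.pos + maxCount
	if end > len(c.events) {
		end = len(c.events)
	}
	var page []event
	for _, e := range c.events[c.pos:end] {
		if e.Visible {
			page = append(page, e)
		}
	}
	c.pos = end
	return page
}

// drain keeps reading until a page comes back empty and returns the number
// of events received. Stopping on an empty page is an assumed client policy.
func drain(c *fakeCollector, maxCount int) int {
	total := 0
	for {
		page := c.readNext(maxCount)
		if len(page) == 0 {
			return total
		}
		total += len(page)
	}
}

// buildEvents returns `hidden` unreadable events followed by `visible`
// readable ones, i.e. a single gap at the start of the time range.
func buildEvents(hidden, visible int) []event {
	evs := make([]event, 0, hidden+visible)
	for i := 0; i < hidden+visible; i++ {
		evs = append(evs, event{ID: i, Visible: i >= hidden})
	}
	return evs
}

func main() {
	// Gap of 150 unreadable events, then 12 readable ones.
	c := &fakeCollector{events: buildEvents(150, 12)}
	fmt.Println("page size 100:", drain(c, 100)) // gap > page size: prints 0

	c = &fakeCollector{events: buildEvents(150, 12)}
	fmt.Println("page size 200:", drain(c, 200)) // gap < page size: prints 12
}
```

In this model the reader only gets past a gap when the page size exceeds the gap length, which matches the symptom described, but it says nothing about what vCenter actually does internally.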

Affected version

Govmomi: latest master

VMware vCenter: all versions as far as I can tell

Additional context

Yes, I know there is a simple solution to this: always use a user that has full inventory read rights. But for obvious reasons this is not ideal, or even possible, in some environments.

For example, in my case, we share a vCenter with another entity, which does not want us accessing their events.

paveljanda commented 1 month ago

👍

tgeek77 commented 1 month ago

👍

TomasFlam commented 1 month ago

👍

dougm commented 3 weeks ago

Hi folks, I'm not able to reproduce this using:

% govc about | grep FullName
FullName:     VMware vCenter Server 8.0.2 build-23319993

And with the example here: https://github.com/vmware/govmomi/tree/main/examples/events

Created a user with limited permissions like so:

#!/bin/bash -e

pass=$(govc env GOVC_PASSWORD) # using same password as Administrator

create() {
  id="$1"

  # create a user
  if ! govc sso.user.id "$id" 2>/dev/null ; then
    govc sso.user.create -p "$pass" "$id"
  fi

  # create a role with limited permissions
  if ! govc role.ls "$id" 2>/dev/null ; then
    govc role.create "$id" $(govc role.ls Admin | grep VirtualMachine)
  fi

  # create a vm folder (relative to $GOVC_DATACENTER)
  folder="vm/$id"
  if ! govc object.collect "$folder" 2>/dev/null ; then
    govc folder.create "$folder"
  fi

  # grant user limited permissions for the folder
  govc permissions.set -principal "$id@vsphere.local" -role "$id" "$folder"
}

create limited

With a few dozen VMs in $folder that the limited user can read, and a few VMs outside of it, I used govc vm.power on/off to generate events. Run the example as Administrator:

% export GOVMOMI_URL="Administrator@vsphere.local:$password@$vcenter_ip" GOVMOMI_INSECURE=true
% go run main.go -b 48h | wc -l
3097

Then as limited user:

% export GOVMOMI_URL="limited@vsphere.local:$password@$vcenter_ip" GOVMOMI_INSECURE=true
% go run main.go -b 48h | wc -l
2533

No hanging, but I do see a few of these (expected):

... [EventEx] The user does not have permission to view the entity associated with this event

Are you able to reproduce the issue with the same example or other self-contained program you can share? Please also share your build number from govc about.

tender-barbarian commented 3 weeks ago

@dougm my guess is that it doesn't get stuck because there are not enough entities producing events to cause a big enough gap between events for the collector to get stuck. But I will try your example 👍

Our version: VMware vCenter Server 7.0.3 build-20990077

tender-barbarian commented 3 weeks ago

@dougm OK, so I tried your script. Then I moved a single VM to the limited folder and powered it on/off several times to generate events.

Results

Admin user

% export GOVMOMI_URL="$admin_username@$domain:$password@$vcenter_ip" GOVMOMI_INSECURE=true
% go run main.go -b 48h | wc -l                                                                
    1090

Limited user

% export GOVMOMI_URL="limited@$domain:$password@$vcenter_ip" GOVMOMI_INSECURE=true 
% go run main.go -b 48h | wc -l                                                               
       0

Limited user again with shorter time

% export GOVMOMI_URL="limited@$domain:$password@$vcenter_ip" GOVMOMI_INSECURE=true 
% go run main.go -b 1h | wc -l                                                               
      12

And just to confirm: those 12 events are the power on/offs from the VM in the limited folder.

Conclusion

So as you can see, with a 48h timeframe the collector does not return anything. I assume it's because it cannot skip more than 1000 events, but there were 1088 events to go through to get to the events allowed for the limited user. Once the gap was shortened by using a smaller timeframe, it started to work again.

All of this is just my assumption... I might be mistaken (I hope). So please tell me if I'm doing anything wrong.

I tried all of this on an older vCenter in our test environment (VMware vCenter Server 6.7.0 build-15976728), as I didn't have user creation rights in our prod vCenter. But I previously observed the same behaviour on a newer vCenter.

Try changing the maxCount param in collector.ReadNextEvents(ctx, 100), for example to 10, and see if the collector gets stuck.