ManageIQ / manageiq

ManageIQ Open-Source Management Platform
https://manageiq.org
Apache License 2.0

Resource <Host> ... is not an eligible resource for this provisioning instance #22599

Open jbarson47 opened 1 year ago

jbarson47 commented 1 year ago

We are seeing an issue in one deployment where two users with identical roles and group memberships, provisioning a VM on the same network, get different placement results. One user succeeds while the other gets the error in the title of this issue.

We are also seeing some oddities with group assignments in this region. However, we've verified that these users have matching groups and the same current group (and this group should have access to all hosts on this provider).

evm.log from unsuccessful user:

[----] I, [2023-07-05T15:26:36.640960 #257726:46dcdc] INFO -- evm: Q-task_id([r855708_miq_provision_526501]) MIQ(ManageIQ::Providers::Vmware::InfraManager::ProvisionWorkflow#get_ems_metadata_tree) EMS metadata collection completed in [0.277068139] seconds
[----] I, [2023-07-05T15:26:37.107397 #257726:46dcdc] INFO -- evm: Q-task_id([r855708_miq_provision_526501]) MIQ(ManageIQ::Providers::Vmware::InfraManager::ProvisionWorkflow#allowed_hosts_obj) allowed_hosts_obj returned [1] objects in [0.466067749] seconds
[----] I, [2023-07-05T15:26:37.107569 #257726:46dcdc] INFO -- evm: Q-task_id([r855708_miq_provision_526501]) MIQ(ManageIQ::Providers::Vmware::InfraManager::ProvisionWorkflow#filter_hosts_by_vlan_name) Filtering hosts with the following network:
[----] I, [2023-07-05T15:26:37.129601 #257726:46dcdc] INFO -- evm: Q-task_id([r855708_miq_provision_526501]) MIQ(ManageIQ::Providers::Vmware::InfraManager::Provision#eligible_resources) returning :<>
[----] E, [2023-07-05T15:26:37.730247 #257726:93a8] ERROR -- evm: Q-task_id([r855708_miq_provision_526501]) MIQ(MiqAeEngine.deliver) Error delivering {"request"=>"vm_provision"} for object [ManageIQ::Providers::Vmware::InfraManager::Provision.526501] with state [] to Automate:

evm.log from successful user:

[----] I, [2023-07-05T02:36:35.015729 #257728:50938] INFO -- evm: Q-task_id([r855634_miq_provision_526421]) MIQ(ManageIQ::Providers::Vmware::InfraManager::ProvisionWorkflow#get_ems_metadata_tree) EMS metadata collection completed in [0.287639572] seconds
[----] I, [2023-07-05T02:36:35.723541 #257728:50938] INFO -- evm: Q-task_id([r855634_miq_provision_526421]) MIQ(ManageIQ::Providers::Vmware::InfraManager::ProvisionWorkflow#allowed_hosts_obj) allowed_hosts_obj returned [20] objects in [0.707461981] seconds
[----] I, [2023-07-05T02:36:35.723718 #257728:50938] INFO -- evm: Q-task_id([r855634_miq_provision_526421]) MIQ(ManageIQ::Providers::Vmware::InfraManager::ProvisionWorkflow#filter_hosts_by_vlan_name) Filtering hosts with the following network:
[----] I, [2023-07-05T02:36:36.010054 #257728:50938] INFO -- evm: Q-task_id([r855634_miq_provision_526421]) MIQ(ManageIQ::Providers::Vmware::InfraManager::Provision#eligible_resources) returning :<159:hostname-esx04.domain.com, 160:hostname-esx02.domain.com,...
[----] I, [2023-07-05T02:36:36.010330 #257728:50938] INFO -- evm: Q-task_id([r855634_miq_provision_526421]) MIQ(ManageIQ::Providers::Vmware::InfraManager::Provision#set_resource) option being set to <[220, "hostname-esx03.domain.com"]>
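
The key difference between the two excerpts above is the host count reported by allowed_hosts_obj (1 for the failing user, 20 for the successful one). For anyone comparing more requests, here is a rough Python sketch that pulls that count per task out of evm.log text. The regular expression is only inferred from the log lines quoted in this issue, and the function name is made up for illustration.

```python
import re

# Pattern inferred from the log lines quoted in this issue, e.g.
#   "... Q-task_id([r855708_miq_provision_526501]) ... allowed_hosts_obj returned [1] objects ..."
HOSTS_RE = re.compile(
    r"Q-task_id\(\[(?P<task>[^\]]+)\]\).*allowed_hosts_obj returned \[(?P<count>\d+)\] objects"
)


def allowed_host_counts(log_text):
    """Return {task_id: host_count} for every allowed_hosts_obj line found.

    Hypothetical helper, not part of ManageIQ.
    """
    counts = {}
    for line in log_text.splitlines():
        match = HOSTS_RE.search(line)
        if match:
            counts[match.group("task")] = int(match.group("count"))
    return counts


# Two lines copied from the excerpts in this issue.
sample = """\
[----] I, [2023-07-05T15:26:37.107397 #257726:46dcdc] INFO -- evm: Q-task_id([r855708_miq_provision_526501]) MIQ(ManageIQ::Providers::Vmware::InfraManager::ProvisionWorkflow#allowed_hosts_obj) allowed_hosts_obj returned [1] objects in [0.466067749] seconds
[----] I, [2023-07-05T02:36:35.723541 #257728:50938] INFO -- evm: Q-task_id([r855634_miq_provision_526421]) MIQ(ManageIQ::Providers::Vmware::InfraManager::ProvisionWorkflow#allowed_hosts_obj) allowed_hosts_obj returned [20] objects in [0.707461981] seconds
"""

print(allowed_host_counts(sample))
```

A task whose count is far below the number of hosts on the provider is a candidate for the same visibility problem.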

Fryguy commented 1 year ago

@jbarson47 Can you give more details on the users and roles?

jbarson47 commented 1 year ago

These users have the same basic end user role as other users, but their groups appear abnormal. They are members of two end user groups, group T and group G, which should be mutually exclusive. The current group for the failing user is group G, which is the one with access to the resources in question.

When doing an LDAP search on our end we see both these users in only one LDAP group that's set up for ManageIQ as expected (group G). However, when authentication occurs, we see an additional, different group (group T) being passed in as an "incoming group" to https://github.com/ManageIQ/manageiq/blob/6482c010c393876184fb83f657919c84f6bd7bf0/app/models/authenticator/base.rb#L139.

This causes the users to be part of both group T and group G in ManageIQ when they should only be in group G, and we see VM provisioning errors for some of these users as well (as mentioned, the eligible-resource errors only seem to occur for some users).

For context, when tailing the log during authentication for a user in group T, we see the following:

[----] D, [2023-07-05T15:29:29.742324 #272515:b3ce0] DEBUG -- evm: MIQ(Authenticator::Httpd#match_groups) External Group: group_t
[----] D, [2023-07-05T15:29:29.742454 #272515:b3ce0] DEBUG -- evm: MIQ(Authenticator::Httpd#match_groups) External Group: group_t 

And for a user in group G:

[----] D, [2023-07-05T15:26:13.482927 #272526:abd88] DEBUG -- evm: MIQ(Authenticator::Httpd#match_groups) External Group: group_t
[----] D, [2023-07-05T15:26:13.483033 #272526:abd88] DEBUG -- evm: MIQ(Authenticator::Httpd#match_groups) External Group: group_g

It almost seems like group T is being assigned by default when users authenticate, regardless of their LDAP group (causing ALL users who log in to have group T membership), but I'm having trouble determining how this could be happening. This is also a new issue since the CFME v5.11 -> ManageIQ Najdorf transition, in case that is helpful.
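
To illustrate why an extra incoming group would matter, here is a minimal Python sketch. It assumes (this is not confirmed against the ManageIQ source) that group matching amounts to keeping every incoming external group name that also exists as a ManageIQ group. The function and variable names are hypothetical.

```python
def match_groups(external_groups, miq_group_names):
    """Keep incoming external groups that also exist in ManageIQ.

    Hypothetical stand-in for the real matching logic; assumes a plain
    name intersection that preserves order and drops duplicates.
    """
    known = set(miq_group_names)
    matched = []
    for name in external_groups:
        if name in known and name not in matched:
            matched.append(name)
    return matched


miq_groups = ["group_t", "group_g"]

# Incoming groups as seen in the debug lines above.
print(match_groups(["group_t", "group_t"], miq_groups))  # user in group T
print(match_groups(["group_t", "group_g"], miq_groups))  # user in group G
```

Under that assumption, the group G user ends up in both groups purely because group_t arrives in the incoming list, which matches what we see in ManageIQ.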

jbarson47 commented 1 year ago

Still not 100% certain it's related, but we have isolated the auth issues to the HTTP request headers containing Group T for every user regardless of group membership in LDAP.

From a custom debug statement: DEBUG -- evm: MIQ(Authenticator::Httpd#user_details_from_headers) User headers for jbarson: ..."X-REMOTE-USER-GROUPS"=>"group_t:other_group_1...:group_t:other_group_2...:other_group_3..."}
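
A quick way to spot the padding in a header value like the one above is to split it and count repeats. This Python sketch assumes the groups are colon-separated, as they appear in the debug output. The group names other than group_t are placeholders, because the real names are elided above.

```python
from collections import Counter


def split_group_header(value, delimiter=":"):
    """Split an X-REMOTE-USER-GROUPS style value into group names.

    The colon delimiter is an assumption based on the debug output above.
    """
    return [name for name in value.split(delimiter) if name]


def duplicated_groups(groups):
    """Return {group_name: occurrences} for names that appear more than once."""
    return {name: n for name, n in Counter(groups).items() if n > 1}


# Placeholder value shaped like the header above; the other_group_* names
# stand in for the elided ones.
header_value = "group_t:other_group_1:group_t:other_group_2:other_group_3"

groups = split_group_header(header_value)
print(groups)
print(duplicated_groups(groups))
```

Here group_t shows up twice, which is consistent with it being injected on top of the user's real memberships.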

Executing this on a MIQ UI node for myself:

dbus-send --print-reply --system --dest=org.freedesktop.sssd.infopipe /org/freedesktop/sssd/infopipe org.freedesktop.sssd.infopipe.GetUserGroups string:jbarson

method return time=1688670909.922464 sender=:1.1821 -> destination=:1.1944 serial=20 reply_serial=2
   array [
      string "group_t"
      string "...1"
      string "group_t"
      string "...2"
      string "...3"
   ]

And for a user in only group G experiencing this issue:

dbus-send --print-reply --system --dest=org.freedesktop.sssd.infopipe /org/freedesktop/sssd/infopipe org.freedesktop.sssd.infopipe.GetUserAttr string:ssvetosl array:string:memberof
method return time=1688671788.175504 sender=:1.1821 -> destination=:1.1957 serial=46 reply_serial=2
   array [
      dict entry(
         string "memberof"
         variant             array [
               string "name=group_g@domain.com,cn=groups,cn=domain.com,cn=sysdb"
            ]
      )
   ] 
dbus-send --print-reply --system --dest=org.freedesktop.sssd.infopipe /org/freedesktop/sssd/infopipe org.freedesktop.sssd.infopipe.GetUserGroups string:ssvetosl
method return time=1688671791.193340 sender=:1.1821 -> destination=:1.1958 serial=48 reply_serial=2
   array [
      string "group_t"
      string "group_g"
   ] 

Investigating why these returns are being "padded" with group_t each time regardless of membership.
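
To compare the two replies across more users, the dbus-send text output can be diffed with a small script. This is a rough Python sketch that parses the `string "..."` lines from output shaped like the replies above and reports groups returned by GetUserGroups that are not in memberof. The helper names are made up, and the parsing only covers the output format shown in this issue.

```python
import re

STRING_RE = re.compile(r'string "([^"]*)"')
# memberof entries look like "name=group_g@domain.com,cn=groups,..."
NAME_RE = re.compile(r"name=([^@,]+)")


def dbus_strings(output):
    """Return every quoted string value in dbus-send --print-reply output."""
    return STRING_RE.findall(output)


def memberof_groups(output):
    """Extract short group names from a GetUserAttr memberof reply."""
    names = set()
    for value in dbus_strings(output):
        match = NAME_RE.match(value)
        if match:  # skips the "memberof" dict key itself
            names.add(match.group(1))
    return names


memberof_reply = '''
   array [
      dict entry(
         string "memberof"
         variant             array [
               string "name=group_g@domain.com,cn=groups,cn=domain.com,cn=sysdb"
            ]
      )
   ]
'''

get_user_groups_reply = '''
   array [
      string "group_t"
      string "group_g"
   ]
'''

unexpected = set(dbus_strings(get_user_groups_reply)) - memberof_groups(memberof_reply)
print(sorted(unexpected))
```

For the user above this reports only group_t, the group that GetUserGroups returns but memberof does not.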

miq-bot commented 9 months ago

This issue has been automatically marked as stale because it has not been updated for at least 3 months.

If you can still reproduce this issue on the current release or on master, please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions! More information about the ManageIQ triage process can be found in the triage process documentation.
