theforeman / foreman_fog_proxmox

Foreman plugin to add Proxmox compute resource based on fog-proxmox gem
GNU General Public License v3.0

Authentication fails after time without a reboot of Foreman #165

Closed amanshu closed 3 years ago

amanshu commented 3 years ago

Describe the bug

We seem to be having regular authentication failures between Foreman and our Proxmox servers (one live and one test). After setup they authenticate correctly; however, after some time they fail to authenticate. A reboot of Foreman fixes the error.

To Reproduce

Steps to reproduce the behavior:

  1. Go to 'Infrastructure -> Compute Resource'
  2. Click on 'Create Compute Resource'
  3. Enter Name, Provider (Proxmox), Url, Username, Password
  4. Click 'Test Connection' to confirm the connection and then 'Submit' to save it.
  5. Click the new Compute Resource to see the info page.
  6. Wait 12 hours
  7. Click the new Compute Resource and the below error message is shown.
  8. Log onto the server and run 'service foreman restart'
  9. When it's restarted, reload the page and click the new Compute Resource.
  10. The info page is shown again.

Error message

Oops, we're sorry but something went wrong

ERF42-4318 [Foreman::Exception]: Failed retrieving proxmox identity client caused by Expected([200, 204]) <=> Actual(401 Unauthorized)
excon.error.response
  :body => "{\"data\":null}"
  :cookies => [ ]
  :headers => {
    "Cache-Control" => "max-age=0"
    "Connection" => "close"
    "Content-Length" => "13"
    "Content-Type" => "application/json;charset=UTF-8"
    "Date" => "Tue, 22 Sep 2020 16:25:36 GMT"
    "Expires" => "Tue, 22 Sep 2020 16:25:36 GMT"
    "Pragma" => "no-cache"
    "Server" => "pve-api-daemon/3.0"
  }
  :host => "X.X.X.X"
  :local_address => "Y.Y.Y.Y"
  :local_port => 59192
  :path => "/api2/json/access/ticket"
  :port => 8006
  :reason_phrase => "authentication failure"
  :remote_ip => "X.X.X.X"
  :status => 401
  :status_line => "HTTP/1.1 401 authentication failure\r\n"

production.log

neomilium commented 3 years ago

Same problem here, with a delay that is surely less than 12 hours. I will continue to investigate, but it's hard to fix because of this delay.

At first sight, it may be related to ticket renewal.

@tristanrobert Could you explain to us how ticket renewal is expected to work? Why is the lifetime set to 2h? Is it a setting in Proxmox?

tristanrobert commented 3 years ago

Proxmox authentication tickets have a 2-hour lifetime, as written here: https://pve.proxmox.com/wiki/Proxmox_VE_API

Token expiration is computed here: https://github.com/fog/fog-proxmox/blob/b7f69d0ae9e4d3427fdc11efc708cf812509fd97/lib/fog/proxmox.rb#L61
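As a rough illustration of a deadline-based expiry check, here is a minimal sketch. It assumes only the 2-hour lifetime stated above; `TICKET_LIFETIME`, `ticket_deadline` and `ticket_expired?` are hypothetical names and are not taken from fog-proxmox.

```ruby
# Assumption: tickets are valid for 2 hours, as stated in the comment above.
TICKET_LIFETIME = 2 * 60 * 60 # seconds

# Hypothetical helper: the deadline is the issue time plus the lifetime.
def ticket_deadline(issued_at)
  issued_at + TICKET_LIFETIME
end

# Hypothetical helper: the ticket counts as expired once the deadline has passed.
def ticket_expired?(issued_at, now = Time.now)
  ticket_deadline(issued_at) < now
end

issued = Time.at(0)
ticket_expired?(issued, Time.at(7199)) # => false (one second before the deadline)
ticket_expired?(issued, Time.at(7201)) # => true  (one second after the deadline)
```

A check like this is only as good as the cached deadline: if the stored issue time does not match the ticket actually in use, the check can report "valid" for a ticket that Proxmox already rejects.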

neomilium commented 3 years ago

Thank you @tristanrobert. I dug into the problem a bit, read the Proxmox documentation, and found that API tokens have been implemented in Proxmox. So I wrote up a feature suggestion here.

neomilium commented 3 years ago

I have a test suite that runs against Foreman and makes many requests to it. These requests involve Proxmox operations.

When running this test suite, I get the authentication failures reported by @amanshu.

I put a little hack in the lib/fog/proxmox.rb you mentioned to force ticket renewal:

def self.credentials_has_expired?
-      authenticated? && @credentials[:deadline] < Time.now
+      authenticated?
end

And :tada: I do not have authentication failures anymore.

As Foreman uses Puma with workers, it may be caused by a race condition: @credentials is not shared between workers/threads.

I'll continue debugging to confirm or rule out this hypothesis.
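To make the worker hypothesis concrete, here is a minimal sketch showing that state cached in a module is per-process: a renewal performed in one forked process is not visible in another. `TicketCache` is a hypothetical stand-in, not the fog-proxmox code, and the example relies on `fork`, so it assumes a Unix-like platform.

```ruby
# Hypothetical stand-in for a module that caches credentials in process memory.
# This is not the fog-proxmox implementation.
module TicketCache
  @credentials = nil

  def self.store(ticket)
    @credentials = { ticket: ticket }
  end

  def self.ticket
    @credentials && @credentials[:ticket]
  end
end

# Renew the ticket inside a forked child (standing in for another worker
# process) and return the ticket the parent sees afterwards.
def ticket_seen_by_parent_after_child_renewal
  TicketCache.store('ticket-A')
  pid = fork do
    TicketCache.store('ticket-B') # the renewal happens only in the child's memory
    exit!(0)                      # leave the child without running at_exit handlers
  end
  Process.wait(pid)
  TicketCache.ticket
end

puts ticket_seen_by_parent_after_child_renewal # prints "ticket-A"
```

This only demonstrates isolation between processes. Threads inside a single worker do share module-level state, so whether this matches what happens in Foreman is still an open question at this point in the thread.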

mkbrown commented 3 years ago

I'll just add that I'm also seeing the issue with authentication and timeouts. 'foreman-maintain service restart' gets it working for a short time as well (rather than rebooting the Foreman server). I haven't tried @neomilium's hack yet, but I'm seriously considering it... The Foreman GUI has refresh issues when the timeout happens, especially on the hosts page if it's configured to show power status.

mkbrown commented 3 years ago

As an FYI, I applied @neomilium's hack and still have the authentication timeout issues. The Foreman GUI refresh issue seems to be independent of the authentication issue, as it happened after a foreman-maintain service restart.

IncredibleRichie commented 3 years ago

We've got the same problem. It seems to occur when more than one compute resource is associated with Foreman. If you try often, sometimes it works and sometimes it doesn't. Maybe there is a problem with the tokens?

tristanrobert commented 3 years ago

Another way to prevent this cached token deadline issue would be to not save the deadline in the cache at all and instead make a request to the Proxmox API to check whether the token is still valid. The drawback is that we could no longer show the token's lifetime to users.
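A minimal sketch of that idea, under stated assumptions: `probe` and `renew` are hypothetical callables supplied by the caller (`probe` performs some cheap authenticated request and returns the HTTP status code, `renew` obtains a new ticket). Neither is part of fog-proxmox or the Proxmox API.

```ruby
# Hypothetical: decide validity from a live response instead of a cached deadline.
# `probe` must return the HTTP status code of an authenticated request as an Integer.
def ticket_still_valid?(probe)
  (200..299).cover?(probe.call)
end

# Hypothetical: renew only when the probe shows the ticket is rejected.
# Returns :kept or :renewed so the caller can see what happened.
def ensure_valid_ticket(probe, renew)
  return :kept if ticket_still_valid?(probe)

  renew.call
  :renewed
end

# Usage with stubbed callables in place of real HTTP requests:
ensure_valid_ticket(-> { 200 }, -> { nil }) # => :kept
ensure_valid_ticket(-> { 401 }, -> { nil }) # => :renewed
```

The trade-off is one extra request per check, and, as noted above, the remaining lifetime is no longer known locally.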

mkbrown commented 3 years ago

As an additional data point regarding @IncredibleRichie's last post: I also have two compute resources defined, a Libvirt host and the Proxmox host, in case that's a contributing factor. Thanks!

KoffeinKaio commented 3 years ago

I've got only one compute resource, which is Proxmox, and I have this problem too. Also, neomilium's hack doesn't work for me; I have to reboot/restart Foreman every time I want to interact with Proxmox :/

tristanrobert commented 3 years ago

Did you try checking the "Renew expired token?" box in the compute resource and pushing the Test Connection button? It forces a ticket renewal.

tristanrobert commented 3 years ago

If the ticket is lost, it cannot be renewed, because Proxmox requires the existing ticket for renewal. You can always delete the compute resource and create a new one; you will get a fresh ticket. Then you just have to attach the servers to be managed by Foreman again.

tristanrobert commented 3 years ago

Maybe the "Renew expired token?" checkbox should always fetch a new ticket (expired or not) and be renamed "Refresh ticket?". That could be a better way than deleting the compute resource and creating a new one. If you agree, add :+1:

KoffeinKaio commented 3 years ago

> Did you try checking the "Renew expired token?" box in the compute resource and pushing the Test Connection button? It forces a ticket renewal.

Doesn't work. I tried it with and without, and it doesn't seem to make a difference.

> If the ticket is lost, it cannot be renewed, because Proxmox requires the existing ticket for renewal. You can always delete the compute resource and create a new one; you will get a fresh ticket. Then you just have to attach the servers to be managed by Foreman again.

I would have to re-add the resource daily; that is not an option.

tristanrobert commented 3 years ago

When you push the Test Connection button while "Renew expired token?" is unchecked, it fetches a new ticket.

RedChops commented 3 years ago

The test fails with or without "Renew expired token?" checked on my install. It seems like the only thing that will allow the connection to work again is restarting Foreman.

claneys commented 3 years ago

I agree. Unchecking the box and then clicking Test Connection makes the connection OK, but it isn't sufficient. :/

Nascire commented 3 years ago

> The test fails with or without "Renew expired token?" checked on my install. It seems like the only thing that will allow the connection to work again is restarting Foreman.

Same for me.