ddelnano / packer-plugin-xenserver

A builder plugin for Packer.IO to support building XenServer images.
Mozilla Public License 2.0

Problems running packer against deployments with a large number of hosts #12

Closed (fabiorauber closed this issue 3 years ago)

fabiorauber commented 3 years ago

I'm trying to use packer-builder-xenserver with XCP-ng 7.6 and 8.0, using the provided Ubuntu 20.04 example (I had to update the JSON file to use Ubuntu 20.04.2 instead of 20.04.1).

Unfortunately, on both XCP-ng versions the build fails with an error when trying to open the VNC session, as shown below. I'm using Packer 1.6.6, tested on both Ubuntu 18.04 and macOS Big Sur.

Debug mode enabled. Builds will not be parallelized.
xenserver-iso: output will be in this color.

==> xenserver-iso: XAPI client session established
==> xenserver-iso: Starting HTTP server on port 8000
==> xenserver-iso: Step: Create Instance
==> xenserver-iso: Using the following SR for the VM: OpaqueRef:eac24214-9929-4fbc-b146-3e006d5e3b0f
==> xenserver-iso: Created instance '5e5b9288-490e-afec-1246-8bdf0435829a'
==> xenserver-iso: Step: Start VM Paused
==> xenserver-iso: Step: Set SSH address to VM host IP
==> xenserver-iso: Set host SSH address to '172.31.16.93'.
==> xenserver-iso: Unpausing VM 5e5b9288-490e-afec-1246-8bdf0435829a
==> xenserver-iso: Waiting 10s for boot...
==> xenserver-iso: Connecting to the VM console VNC over xapi
==> xenserver-iso: Making HTTP request to initiate VNC connection: CONNECT /console?uuid=0f1ca5fb-8ee1-2050-cc05-aaf8f436b789 HTTP/1.0
==> xenserver-iso: Cookie: session_id=OpaqueRef:d6ceda20-01cc-479a-b699-03136f2b59ed
==> xenserver-iso:
==> xenserver-iso:
==> xenserver-iso: Received response: HTTP/1.0 500 Internal Error
==> xenserver-iso: content-length: 246
==> xenserver-iso: content-type:text/html
==> xenserver-iso: connection:close
==> xenserver-iso: cache-control:no-cache, no-store
==> xenserver-iso:
==> xenserver-iso: <html><body><h1>HTTP 500 internal server error</h1>An unexpected error occurred; please wait a while and try again. If the problem persists, please contact your support representative.<h1> Additional information </h1>Console.Failure</body></html>
==> xenserver-iso: Error establishing VNC session: EOF
==> xenserver-iso: Deleting output directory...
Build 'xenserver-iso' errored after 20 seconds 240 milliseconds: Error establishing VNC session: EOF

==> Wait completed after 20 seconds 240 milliseconds

==> Some builds didn't complete successfully and had errors:
--> xenserver-iso: Error establishing VNC session: EOF

==> Builds finished but no artifacts were created.

XCP-ng console shows the following error in xensource.log:

Mar  1 11:11:43 server2 xapi: [error|server2|12127597 INET :::80|Connection to VM console R:d1a466e33c00|console] VM OpaqueRef:f5710702-48a5-4f8c-9075-41dcf11d8cc6 (Console OpaqueRef:c4e0a5bd-d49e-442b-ad72-f52e741da145) has resident_on = OpaqueRef:206f7cf6-9d76-43b3-92ce-eeda190b5728 <> localhost
Mar  1 11:11:43 server2 xapi: [error|server2|12127597 INET :::80||backtrace] Connection to VM console R:d1a466e33c00 failed with exception Console.Failure
Mar  1 11:11:43 server2 xapi: [error|server2|12127597 INET :::80||backtrace] Raised Console.Failure
Mar  1 11:11:43 server2 xapi: [error|server2|12127597 INET :::80||backtrace] 1/1 xapi @ server2 Raised at file (Thread 12127597 has no backtrace table. Was with_backtraces called?, line 0

fabiorauber commented 3 years ago

Just tested with Packer 1.6.5: same result.

ddelnano commented 3 years ago

@fabiorauber sorry for responding late on this. I'm not sure why but I missed all the recent notifications for this repo.

I appreciate the detailed bug report and I'll need to spend some time looking into this error.

ebrainte commented 3 years ago

I'm getting exactly the same error.

@ddelnano I saw that you tested the last release on your local env. Are you using XenServer or XCP-ng on your hosts?

ddelnano commented 3 years ago

I'm using XenServer, but I believe I have tested this against XCP-ng, albeit not recently. Let me give that a try today and see if I can reproduce your issue.

ddelnano commented 3 years ago

I just tested this against an XCP-ng 8.2.0 host and the VNC connection worked fine:

==> xenserver-iso: Step: Create Instance
==> xenserver-iso: Using the following SR for the VM: OpaqueRef:fee129f3-e01c-47b5-be9c-6ec836b6bfe9
==> xenserver-iso: Created instance '3ad8e6c2-0bf9-6986-524e-c578c53ff8ae'
==> xenserver-iso: Step: Start VM Paused
==> xenserver-iso: Step: Set SSH address to VM host IP
==> xenserver-iso: Set host SSH address to '172.16.210.11'.
==> xenserver-iso: Unpausing VM 3ad8e6c2-0bf9-6986-524e-c578c53ff8ae
==> xenserver-iso: Waiting 10s for boot...
==> xenserver-iso: Connecting to the VM console VNC over xapi
==> xenserver-iso: Making HTTP request to initiate VNC connection: CONNECT /console?uuid=51c6c07f-622c-53f9-2649-b4b582db51a6 HTTP/1.0
==> xenserver-iso: Cookie: session_id=OpaqueRef:c5e70e94-ba57-4e2a-947c-e614c1ea98a2
==> xenserver-iso:
==> xenserver-iso:
==> xenserver-iso: Received response: HTTP/1.1 200 OK
==> xenserver-iso: Connection: keep-alive
==> xenserver-iso: Cache-Control: no-cache, no-store
==> xenserver-iso:
==> xenserver-iso:

[Screenshots: Screenshot_20210311_183651, Screenshot_20210311_183610]

@ebrainte @fabiorauber can you confirm that the "console" tab in the Xen Orchestra UI works? As in the view you can see in my second screenshot.

The packer builder connects to the Xen console in the same way, so if the console itself were broken I would expect both to fail.

@ebrainte what version of XCP-ng are you using?

ebrainte commented 3 years ago

Yes, the VNC console works fine on XOA. What I didn't try is a pool with only 1 server (my pool has 18 servers in it).

I'm using XCP-ng 8.1.0.

I've checked the XOA source code and, as you said, the method to connect is pretty much the same, but they add a "Host" header to the connection: https://github.com/vatesfr/xen-orchestra/blob/master/packages/xo-server/src/proxy-console.js line 34

I've tried adding it to this module and the same thing happens. I don't know what else to try.
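For reference, here is a minimal sketch of what the CONNECT request from the log above looks like with a "Host" header added. It is written in Python purely for illustration and is not the module's code; `build_console_connect` and its parameters are hypothetical names. As noted above, adding the header alone did not resolve the error.

```python
def build_console_connect(console_uuid: str, session_ref: str, host: str = "") -> bytes:
    """Build the raw console CONNECT request seen in the packer log.

    Hypothetical helper: `host` is optional and, when given, is sent as a
    "Host" header before the session cookie.
    """
    lines = [f"CONNECT /console?uuid={console_uuid} HTTP/1.0"]
    if host:
        lines.append(f"Host: {host}")
    lines.append(f"Cookie: session_id={session_ref}")
    # Header lines end with CRLF; an empty line terminates the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


# Placeholder values, not taken from a real deployment.
print(build_console_connect("console-uuid", "OpaqueRef:session", host="10.0.0.2").decode())
```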

ddelnano commented 3 years ago

@ebrainte I would be surprised if the size of the pool mattered. While my XenServer test environment is a single host, the XCP-ng deployment I used is a pool with 3 nodes.

ebrainte commented 3 years ago

@ddelnano finally, the issue was having too many hosts, hehe ;D

I've found the issue. The module tries to create a TCP connection to the master host instead of connecting directly to the host where the created VM is running. That's why it worked when you tried it (probably because your VM booted on the master host).

I ran packer until the instance booted on the master host, and then it worked; that's how I figured it out.
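The diagnosis above can be sketched as a small address-selection step: connect to the host the VM is resident on, and only fall back to the pool master when that host is not known. This is an illustrative Python sketch, not the code from the eventual PR; `console_host_address`, the dictionary of host addresses, and the example refs and IPs are all hypothetical. The only details taken from XAPI are the `resident_on` field (visible in the xensource.log excerpt) and the `OpaqueRef:NULL` null reference.

```python
from typing import Mapping

NULL_REF = "OpaqueRef:NULL"  # XAPI's null reference value


def console_host_address(
    vm_resident_on: str,
    host_addresses: Mapping[str, str],
    master_address: str,
) -> str:
    """Pick the address to open the console connection against.

    Hypothetical helper: prefer the host the VM is resident on and fall
    back to the pool master only when that host cannot be resolved.
    """
    if vm_resident_on and vm_resident_on != NULL_REF:
        address = host_addresses.get(vm_resident_on)
        if address:
            return address
    return master_address


# Made-up pool: one master and one member host.
hosts = {
    "OpaqueRef:host-master": "10.0.0.1",
    "OpaqueRef:host-member": "10.0.0.2",
}
print(console_host_address("OpaqueRef:host-member", hosts, "10.0.0.1"))
print(console_host_address(NULL_REF, hosts, "10.0.0.1"))
```

With a single-host pool the resident host and the master are the same machine, which would explain why the problem never shows up there.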

I will create a PR with the fix, but I want to keep testing some more.

fabiorauber commented 3 years ago

That explains a lot @ebrainte. The XCP-ng 7.6 Pool that was used for my tests has 11 servers, and the 8.0 one has 14. The odds of the machine starting on the Pool master in this situation are quite small.
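As a rough illustration of those odds, the sketch below assumes, purely for the sake of the arithmetic, that every host in the pool is equally likely to receive the VM. Real placement depends on host load and free memory, so these numbers are only indicative. The function names are hypothetical.

```python
def p_master(hosts: int) -> float:
    """Chance that a single VM start lands on the pool master,
    assuming (hypothetically) every host is equally likely."""
    return 1.0 / hosts


def p_master_within(hosts: int, attempts: int) -> float:
    """Chance that at least one of `attempts` independent starts
    lands on the master, under the same assumption."""
    return 1.0 - (1.0 - p_master(hosts)) ** attempts


for hosts in (11, 14, 18):  # pool sizes mentioned in this thread
    print(f"{hosts} hosts: {p_master(hosts):.1%} per build, "
          f"{p_master_within(hosts, 10):.1%} within 10 builds")
```

Under that assumption a single build succeeds by chance less than one time in ten on any of these pools, while on a single-host pool it always succeeds.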

@ddelnano, my Xen Orchestra console works fine as well. I will try to run XCP-ng in a VM on my laptop to compare the results.

4censord commented 3 years ago

You guys just fixed a problem I didn't know I had... Everything was working on my dev system with 1 host, but everything was failing on some bigger pools.

ddelnano commented 3 years ago

haha I stand corrected, and now in hindsight it makes a lot of sense 😄

ebrainte commented 3 years ago

I've created a PR for this issue: https://github.com/ddelnano/packer-plugin-xenserver/pull/15

ddelnano commented 3 years ago

Thanks to @ebrainte, the fix will be released in v0.3.1!