vmware / vic-ui

vSphere Integrated Containers Plug-In for vSphere Client provides information about your VIC setup and allows you to deploy VCHs directly from the vSphere Client.

VIC UI plugin cannot find vic-machine server at endpoint https://null:8443 #185

Open andrewtchin opened 6 years ago

andrewtchin commented 6 years ago

@gigawhitlocks commented on Tue Dec 05 2017

Performed a VIC UI plugin install as such:

root@sc2-rdops-vm01-dhcp-32-18 [ ~/vic/ui/VCSA ]# ./install.sh 
-------------------------------------------------------------
This script will install vSphere Integrated Containers plugin
for vSphere Client (HTML) and vSphere Web Client (Flex).

Please provide connection information to the vCenter Server.
-------------------------------------------------------------
Enter FQDN or IP to target vCenter Server: vsphere.theknown.net
Enter your vCenter Administrator Username: Administrator@vsphere.local
Enter your vCenter Administrator Password: 

SHA-1 key fingerprint of host 'vsphere.theknown.net' is 'EB:CA:89:8C:92:5F:FB:F4:68:0B:50:F1:4F:AE:7B:DB:56:D5:8E:70'
Are you sure you trust the authenticity of this host (yes/no)? yes

-------------------------------------------------------------
Checking existing plugins...
-------------------------------------------------------------
No VIC Engine UI plugin was detected. Continuing to install the plugins.

-------------------------------------------------------------
Preparing to register vCenter Extension vSphere Integrated Containers-FlexClient...
-------------------------------------------------------------
Dec  4 2017 21:37:45.155Z INFO  ### Installing UI Plugin ####
Dec  4 2017 21:37:45.252Z INFO  Installing plugin
Dec  4 2017 21:37:48.516Z INFO  Installed UI plugin

-------------------------------------------------------------
Preparing to register vCenter Extension vSphere Integrated Containers-H5Client...
-------------------------------------------------------------
Dec  4 2017 21:37:48.597Z INFO  ### Installing UI Plugin ####
Dec  4 2017 21:37:48.703Z INFO  Installing plugin
Dec  4 2017 21:37:51.728Z INFO  Installed UI plugin
Dec  4 2017 21:37:52.356Z INFO  Found 1 VM(s) tagged as OVA

Dec  4 2017 22:21:22.312Z INFO  Attempting to configure ManagedByInfo
Dec  4 2017 22:21:23.826Z INFO  Successfully configured ManagedByInfo

--------------------------------------------------------------
VIC Engine UI installer exited successfully

After install, this is seen: [screenshot: 2017-12-05 at 12:07:50 pm]

The error provides a link to "fix" the issue, but that opens https://null:8443 which clearly doesn't work 😆


@gigawhitlocks commented on Tue Dec 05 2017

It's possible this issue is not related to the FQDN, but it may need to be noted that my VIC OVA also has an FQDN in this deployment, vch.theknown.net.


@gigawhitlocks commented on Tue Dec 05 2017

I have tried this again and provided an IP instead of an FQDN and got the same error; have updated the title of the issue. Will try a full reinstall of VIC OVA w/o a FQDN provided and I will see if I get a different result.

andrewtchin commented 6 years ago

The result of this is that the user cannot use the H5 plugin. Marking as high; reset priority after triage.

jooskim commented 6 years ago

Seems like this has to do with OVA network. The UI is getting the IP address from the vim25 API as shown below:

https://github.com/vmware/vic-ui/blob/master/h5c/vic-service/src/main/java/com/vmware/vic/PropFetcher.java#L430
https://github.com/vmware/vic-ui/blob/master/h5c/vic-service/src/main/java/com/vmware/vic/model/VicApplianceVm.java

jak-atx commented 6 years ago

@andrewtchin does this belong in vic-product? It looks like it to me.

andrewtchin commented 6 years ago

What's the root cause? Are you saying the OVA does not have an IP displayed in vCenter at the time this is run?

jak-atx commented 6 years ago

I believe that's what is happening. Somehow it's not finding an IP. @jooskim ^^

andrewtchin commented 6 years ago

I think the UI installer or something should retry or fail if it sees null as the appliance IP. I'm not clear on where the IP is coming from and at what stage this is being queried. When the appliance is first booted the IP doesn't show up in vSphere for a few seconds, but that's the only time I can think that would be the case since the toolbox reports the IP before you can even access the webserver on the appliance.

jak-atx commented 6 years ago

@gigawhitlocks can you try this again with our latest OVA? I'm not quite sure how to reproduce this.

andrewtchin commented 6 years ago

Where is the IP that is showing up as null coming from?

jak-atx commented 6 years ago

It's the appliance IP. It's that whole IP lookup mechanism that checks for the vic tagged vm that @jooskim and @jzt worked on. That's why I was thinking the OVA seeing this issue maybe didn't have all of the pieces for that in place at that point.

andrewtchin commented 6 years ago

But doesn't that mean that the UI plugin is checking vSphere for the VM that has that tag? If so, that code should retry if it receives a null value and fail if it keeps getting null.
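The retry-then-fail behavior suggested here could look roughly like the sketch below. It is not the plugin's or installer's actual code: `lookupFunc`, `retryLookup`, and `errNoIP` are hypothetical names, and an empty string stands in for the "null" address seen in the error.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNoIP is returned when every attempt came back empty (hypothetical name).
var errNoIP = errors.New("appliance IP still empty after all retries")

// lookupFunc is a hypothetical stand-in for whatever queries vSphere for the
// tagged appliance VM's IP. An empty string models the "null" from the issue.
type lookupFunc func() (string, error)

// retryLookup calls lookup up to attempts times, sleeping delay between
// tries. It returns errNoIP instead of handing back an empty address.
func retryLookup(lookup lookupFunc, attempts int, delay time.Duration) (string, error) {
	for i := 0; i < attempts; i++ {
		ip, err := lookup()
		if err != nil {
			return "", err
		}
		if ip != "" {
			return ip, nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return "", errNoIP
}

func main() {
	calls := 0
	// Simulated lookup: empty twice (appliance still booting), then an IP.
	lookup := func() (string, error) {
		calls++
		if calls < 3 {
			return "", nil
		}
		return "192.0.2.10", nil
	}
	ip, err := retryLookup(lookup, 5, 10*time.Millisecond)
	fmt.Println(ip, err, calls)
}
```

Bounding the number of attempts means a persistent null surfaces as an explicit error rather than as a link to https://null:8443.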

jzt commented 6 years ago

From https://github.com/vmware/vic/blob/master/lib/install/ova/configure.go#L87:

func getOvaVMByTag(ctx context.Context, sess *session.Session, u string) (*vm.VirtualMachine, error) {
        ovaURL, err := url.Parse(u)
        if err != nil {
                return nil, err
        }

        host := ovaURL.Hostname()

        log.Debugf("Looking up host %s", host)
        ips, err := net.LookupIP(host)
        if err != nil {
                return nil, errors.Errorf("IP lookup failed: %s", err)
        }

        log.Debugf("found %d IP(s) from hostname lookup on %s:", len(ips), host)
        var ip string
        for _, i := range ips {
                log.Debugf(i.String())
                if i.To4() != nil {
                        ip = i.String()
                }
        }

        if ip == "" {
                return nil, errors.Errorf("IPV6 support not yet implemented")
        }

        vms, err := admiral.DefaultDiscovery.Discover(ctx, sess)
        if err != nil {
                return nil, errors.Errorf("failed to discover OVA vm(s): %s", err)
        }

        log.Infof("Found %d VM(s) tagged as OVA", len(vms))
        for i, v := range vms {
                log.Debugf("Checking IP for %s", v.Reference().Value)
                vmIP, err := v.WaitForIP(ctx)
                if err != nil && i == len(vms)-1 {
                        return nil, errors.Errorf("failed to get VM IP: %s", err)
                }

                // verify the tagged vm has the IP we expect
                if vmIP == ip {
                        log.Debugf("Found OVA with matching IP: %s", ip)
                        return v, nil
                }
        }

        return nil, errors.Errorf("no VM(s) found with OVA tag")
}

It looks like it's sitting there hanging in the call to WaitForIP, which lives down in the govmomi layer. That it occurs both with IP and FQDN suggests to me that it could be something on the VC side (that particular VC?). In all my tests, I have never seen the WaitForIP step take more than a second or so.
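If the call really can block, one way to keep it from hanging indefinitely is to give it a deadline. The sketch below assumes the wait honors context cancellation. `ipWaiter`, `stuckVM`, `readyVM`, and `waitForIPWithTimeout` are hypothetical names that only mirror the call shape `v.WaitForIP(ctx)` in the snippet above; they are not the vic or govmomi types.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// ipWaiter is a hypothetical interface with the same call shape as
// v.WaitForIP(ctx) in the quoted function.
type ipWaiter interface {
	WaitForIP(ctx context.Context) (string, error)
}

// stuckVM simulates a VM that never reports an IP: it blocks until the
// context is cancelled or its deadline passes.
type stuckVM struct{}

func (stuckVM) WaitForIP(ctx context.Context) (string, error) {
	<-ctx.Done()
	return "", ctx.Err()
}

// readyVM simulates a VM whose IP is already known.
type readyVM struct{ ip string }

func (v readyVM) WaitForIP(ctx context.Context) (string, error) {
	return v.ip, nil
}

// waitForIPWithTimeout bounds the wait so a VM that never reports an IP
// produces an error instead of an indefinite hang.
func waitForIPWithTimeout(w ipWaiter, timeout time.Duration) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	ip, err := w.WaitForIP(ctx)
	if err != nil {
		return "", fmt.Errorf("no IP reported within %s: %w", timeout, err)
	}
	return ip, nil
}

func main() {
	fmt.Println(waitForIPWithTimeout(readyVM{ip: "192.0.2.10"}, time.Second))
	fmt.Println(waitForIPWithTimeout(stuckVM{}, 50*time.Millisecond))
}
```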

jzt commented 6 years ago

We should probably change the logging in there to Info or possibly add a few more lines to inform the user what is going on.

EDIT: Specifically changing the log.Debugf("Checking IP for %s", v.Reference().Value) to Info level.

jak-atx commented 6 years ago

@andrewtchin if we don't have the IP by the time the user interacts with the plugin something went awry and there is no point in continually retrying from plugin side. That's not something we do anyway typically because it's continual network requests and generally slows the browser way down. As @jzt stated it seems to be further upstream and could use some better error handling.

Can you figure out who this should be assigned to @andrewtchin?

jak-atx commented 6 years ago

Added https://github.com/vmware/vic-ui/issues/213 to track handling getting a null value on the client side and displaying appropriate error messaging.

jak-atx commented 6 years ago

@gigawhitlocks can you confirm you were not seeing this after RC3? If it's still an issue it needs to be investigated on OVA side. All we can do on client is intercept the null and display an error.
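As an illustration of intercepting the null before it turns into a link, here is a minimal sketch. It is written in Go to match the snippet quoted earlier, although the plugin itself is not written in Go; `vicMachineEndpoint` and `errNoApplianceHost` are hypothetical names, and the port 8443 is taken from the error in the issue title.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"strings"
)

// errNoApplianceHost signals that no usable appliance address is known
// (hypothetical name).
var errNoApplianceHost = errors.New("appliance address is not known yet")

// vicMachineEndpoint builds the vic-machine server URL, refusing to build
// one from an empty host or from the literal string "null".
func vicMachineEndpoint(host string) (string, error) {
	h := strings.TrimSpace(host)
	if h == "" || strings.EqualFold(h, "null") {
		return "", errNoApplianceHost
	}
	return "https://" + net.JoinHostPort(h, "8443"), nil
}

func main() {
	for _, h := range []string{"192.0.2.10", "null", ""} {
		url, err := vicMachineEndpoint(h)
		if err != nil {
			fmt.Printf("%q: %v\n", h, err)
			continue
		}
		fmt.Printf("%q: %s\n", h, url)
	}
}
```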

andrewtchin commented 6 years ago

We will add a log for this, but the env is gone and we haven't been able to repro it. If we see it in the future we should check to see if the appliance reports an IP in vCenter from toolbox.

gigawhitlocks commented 6 years ago

In RC3 this error does not happen. Instead I get the normal failure for being unable to verify the cert (because the FQDN of the VCH is not used, and the cert is signed for the FQDN, not the IP) and I have to click through to accept the certificate, despite using a signed certificate. That said, it finds the IP correctly and I can accept the cert and click 'refresh' and that all works as it should.

This downgrades the issue from being a show-stopper to one that is a minor annoyance.

andrewtchin commented 6 years ago

And the user is prompted so that they know to click to accept the cert? Also, as a note, the cause of that could be that someone doesn't trust the LE root or that we're not performing cert validation correctly.

gigawhitlocks commented 6 years ago

Yes, it alerts the user.

And no, the reason that the cert shows as untrusted is because the UI provides a link to access it as https://IP_ADDRESS and it should be https://DOMAIN because that's what the certificate is issued for. The certificate doesn't contain IP SANs because the VCH is assigned a dynamic IP, and so the certificate is only valid for the FQDN and not the IP of the server.
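The mismatch described here can be reproduced in isolation with Go's standard library. The sketch below builds a throwaway self-signed certificate whose only SAN is a DNS name, then checks it against that name and against an IP address. The host name, the IP, and the helper `selfSignedCert` are made up for illustration; this is not the deployment's certificate.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// selfSignedCert returns a short-lived self-signed certificate whose only
// subject alternative names are the given DNS names (no IP SANs).
func selfSignedCert(dnsNames ...string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "demo"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
		DNSNames:     dnsNames,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	cert, err := selfSignedCert("vch.example.test")
	if err != nil {
		panic(err)
	}
	// The same server is reachable under both names, but only the DNS name
	// is covered by the certificate.
	for _, host := range []string{"vch.example.test", "192.0.2.10"} {
		if err := cert.VerifyHostname(host); err != nil {
			fmt.Printf("%s: rejected (%v)\n", host, err)
		} else {
			fmt.Printf("%s: accepted\n", host)
		}
	}
}
```

`VerifyHostname` checks IP literals against the certificate's IP SANs only, so a certificate issued for the FQDN alone fails for the IP form of the URL.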

gigawhitlocks commented 6 years ago

[screenshot: 2017-12-11 at 4:27:01 pm]

This is the same installation. Notice that beautiful green lock and no little warning saying I've saved this certificate as trusted. That's because I'm accessing via the domain name.

Now accessing via the IP address, the warning symbol is there, to indicate that I overrode the warning:

[screenshot: 2017-12-11 at 4:28:21 pm]

gigawhitlocks commented 6 years ago

The IP and FQDN for a deployment are not interchangeable and the product shouldn't treat them as though they are. The biggest issue that all of this brings up is that in some places FQDNs get translated into IPs and then those IPs are stored and used instead of FQDNs later on. This will cause problems with access if, e.g., a customer provides an FQDN because the IP of the component being accessed may change.
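One way to avoid that substitution is to store what the user entered and use a resolved address only as a fallback. A minimal sketch, with a hypothetical `appliance` record that is not the product's actual data model:

```go
package main

import (
	"errors"
	"fmt"
)

// appliance is a hypothetical record of how a component can be addressed.
type appliance struct {
	FQDN string // as entered by the user; may be empty
	IP   string // most recently observed address; may change under DHCP
}

// host returns the name to use in URLs: the FQDN if one was provided,
// otherwise the last known IP, otherwise an error.
func (a appliance) host() (string, error) {
	if a.FQDN != "" {
		return a.FQDN, nil
	}
	if a.IP != "" {
		return a.IP, nil
	}
	return "", errors.New("appliance has neither an FQDN nor an IP")
}

func main() {
	a := appliance{FQDN: "vch.example.test", IP: "192.0.2.10"}
	h, err := a.host()
	fmt.Println(h, err)
}
```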

omniproc commented 6 years ago

I got the same issue, but in my case it's due to the fact that the requirement to open port 8443 for the Cloud Admin is not documented anywhere.

From what the error says, it seems that 8443 should be reachable from the machine the Cloud Admin is using to access the vSphere web client.

Suggested fix(es):

1.) document the requirement for port 8443
2.) build a proxy function into the vSphere plugin so the connection to 8443 is established from vCenter (static) rather than from the Cloud Admin machine (variable)
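To tell a blocked port apart from a missing IP, a quick TCP reachability check can be run from the machine that opens the vSphere client. This is a generic sketch, not part of VIC; `checkPort` is a hypothetical helper, and the demo dials a local throwaway listener rather than a real appliance.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// checkPort tries to open a TCP connection to host:port within timeout and
// reports why it failed, if it did.
func checkPort(host, port string, timeout time.Duration) error {
	addr := net.JoinHostPort(host, port)
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return fmt.Errorf("cannot reach %s: %w", addr, err)
	}
	return conn.Close()
}

func main() {
	// Demo against a local listener so the sketch runs anywhere. In practice
	// the arguments would be the appliance address and "8443".
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	defer ln.Close()

	host, port, err := net.SplitHostPort(ln.Addr().String())
	if err != nil {
		panic(err)
	}
	fmt.Println("local listener:", checkPort(host, port, time.Second))
}
```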

andrewtchin commented 6 years ago

@m451 Thanks for your report. I opened https://github.com/vmware/vic-product/issues/1517 for updating the diagram. You can also see the networking requirements here: https://vmware.github.io/vic-product/assets/files/html/1.3/vic_vsphere_admin/security_reference.html

zjs commented 6 years ago

In my case, I discovered that the root cause of my error was having an old, powered-off OVA as part of my vSphere inventory following upgrade. It seems unlikely that this is always the cause of this error message, but it is certainly something we should handle better.
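Handling that case could mean ignoring tagged VMs that are powered off or have no address, and failing loudly when the choice is ambiguous. A minimal sketch, with a hypothetical `vmInfo` type in place of the real inventory objects:

```go
package main

import (
	"errors"
	"fmt"
)

// vmInfo is a hypothetical, simplified view of a VM tagged as a VIC appliance.
type vmInfo struct {
	Name      string
	PoweredOn bool
	IP        string
}

// pickAppliance returns the single tagged VM that is powered on and has an
// IP. Stale powered-off appliances are skipped; zero or several live
// candidates is reported as an error instead of yielding an empty address.
func pickAppliance(tagged []vmInfo) (vmInfo, error) {
	var live []vmInfo
	for _, v := range tagged {
		if v.PoweredOn && v.IP != "" {
			live = append(live, v)
		}
	}
	switch len(live) {
	case 0:
		return vmInfo{}, errors.New("no powered-on appliance VM with an IP was found")
	case 1:
		return live[0], nil
	default:
		return vmInfo{}, fmt.Errorf("%d powered-on appliance VMs found; cannot choose one", len(live))
	}
}

func main() {
	tagged := []vmInfo{
		{Name: "vic-old", PoweredOn: false},
		{Name: "vic-new", PoweredOn: true, IP: "192.0.2.10"},
	}
	v, err := pickAppliance(tagged)
	fmt.Println(v.Name, v.IP, err)
}
```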