apigee / devrel

Common solutions and tools developed for Apigee
Apache License 2.0

Errors and connection problem in hybrid-quickstart for INGRESS_TYPE=internal #483

Closed kurtkanaskie closed 2 years ago

kurtkanaskie commented 2 years ago

I saw a couple of errors when running initialize-runtime-gke.sh with INGRESS_TYPE=internal. In addition, I'm not able to connect to a proxy from a local VM using the generated EnvGroup hostname test-10-200-0-2.nip.io.

Errors in steps.sh configure_network() Line 321

gcloud compute addresses create apigee-ingress-ip --region "$REGION" --network "$NETWORK" --subnet "$SUBNET" --purpose SHARED_LOADBALANCER_VIP
ERROR: (gcloud.compute.addresses.create) arguments not allowed simultaneously: --network, --subnet

Changed to remove --network:

gcloud compute addresses create apigee-ingress-ip --region "$REGION" --subnet "$SUBNET" --purpose SHARED_LOADBALANCER_VIP
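
The error above says that --network and --subnet cannot be passed together. A minimal sketch of how the flags could be chosen per ingress type follows. The helper name address_create_flags is hypothetical, and the --global branch for the non-internal case is an assumption, not the actual steps.sh logic. The sketch only prints the command (dry run) rather than calling gcloud.

```shell
#!/bin/sh
# Hypothetical helper (not part of steps.sh): print the flags for
# `gcloud compute addresses create` depending on the ingress type.
address_create_flags() {
  ingress_type=$1 region=$2 subnet=$3
  case "$ingress_type" in
    internal)
      # A subnet belongs to exactly one network, so --network is left out;
      # gcloud rejects --network and --subnet together.
      printf '%s\n' "--region $region --subnet $subnet --purpose SHARED_LOADBALANCER_VIP"
      ;;
    *)
      # Assumption: any other ingress type uses a global address.
      printf '%s\n' "--global"
      ;;
  esac
}

# Dry run: show the command instead of executing it.
echo "gcloud compute addresses create apigee-ingress-ip $(address_create_flags internal us-east1 default)"
```

Keeping the flag selection in one place makes it harder to reintroduce the conflicting pair when the command is edited later.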

Line 340-ish

gcloud dns managed-zones create apigee-dns-zone --dns-name="$DNS_NAME" --description=apigee-dns-zone --visibility="private" --networks="default"

Changed --networks to use "$NETWORK" instead of the hardcoded "default":

gcloud dns managed-zones create apigee-dns-zone --dns-name="$DNS_NAME" --description=apigee-dns-zone --visibility="private" --networks="$NETWORK"

Once I fixed those, the install completed and I see:

$ gcloud compute addresses list
NAME                                    ADDRESS/RANGE   TYPE      PURPOSE                  NETWORK  REGION    SUBNET   STATUS
apigee-ingress-ip                       10.200.0.2      INTERNAL  SHARED_LOADBALANCER_VIP           us-east1  default  RESERVED

$ kubectl get svc -n istio-system
NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                 AGE
istio-ingressgateway   ClusterIP   10.180.11.67    <none>        15021/TCP,443/TCP                       2d23h
istiod                 ClusterIP   10.180.2.85     <none>        15010/TCP,15012/TCP,443/TCP,15014/TCP   2d23h
istiod-asm-198-6       ClusterIP   10.180.10.211   <none>        15010/TCP,15012/TCP,443/TCP,15014/TCP   2d23h

$ kubectl get pods -n istio-system
NAME                                    READY   STATUS    RESTARTS   AGE
istio-ingressgateway-69588996d6-sk4l2   1/1     Running   0          2d23h
istio-ingressgateway-69588996d6-xf85q   1/1     Running   0          2d23h
istiod-asm-198-6-6b55d99697-rm4dd       1/1     Running   0          2d23h
istiod-asm-198-6-6b55d99697-x5wps       1/1     Running   0          2d23h

$ kubectl get svc -n apigee
NAME                                               TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                       AGE
apigee-cassandra-default                           ClusterIP   None            <none>        9042/TCP,7199/TCP             2d22h
apigee-connect-agent-apigee-hybrid-i-8b12655       ClusterIP   10.180.11.119   <none>        443/TCP                       2d22h
apigee-mart-apigee-hybrid-i-8b12655                ClusterIP   10.180.2.194    <none>        8843/TCP                      2d22h
apigee-metrics-apigee-telemetry-app                ClusterIP   10.180.5.58     <none>        9090/TCP,9091/TCP             2d22h
apigee-metrics-apigee-telemetry-proxy              ClusterIP   10.180.3.182    <none>        9090/TCP,9091/TCP,19090/TCP   2d22h
apigee-redis-default                               ClusterIP   None            <none>        6379/TCP                      2d22h
apigee-redis-envoy-default                         ClusterIP   10.180.7.240    <none>        6379/TCP                      2d22h
apigee-runtime-apigee-hybrid-i-test-a8bf05f        ClusterIP   10.180.3.58     <none>        8443/TCP                      2d22h
apigee-synchronizer-apigee-hybrid-i-test-a8bf05f   ClusterIP   10.180.7.228    <none>        8843/TCP                      2d22h
apigee-udca-apigee-hybrid-i-test-a8bf05f           ClusterIP   10.180.2.163    <none>        20001/TCP                     2d22h
apigee-watcher-apigee-hybrid-i-8b12655             ClusterIP   10.180.9.234    <none>        8843/TCP                      2d22h

I can open a bash prompt on one of the ingressgateways and connect to a proxy on the runtime using:

$ kubectl -n istio-system exec -it istio-ingressgateway-69588996d6-sk4l2 -- bash
curl -k -v https://10.180.3.58:8443/notarget
*   Trying 10.180.3.58...
* TCP_NODELAY set
* Connected to 10.180.3.58 (10.180.3.58) port 8443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: CN=apigee-hybrid-internal
*  start date: Mar 18 14:40:31 2022 GMT
*  expire date: Mar 15 14:40:31 2032 GMT
*  issuer: CN=apigeehybrid
*  SSL certificate verify result: self signed certificate in certificate chain (19), continuing anyway.
> GET /notarget HTTP/1.1
> Host: 10.180.3.58:8443
> User-Agent: curl/7.58.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Content-Type: application/json
< user-agent: curl/7.58.0
< X-UpperCase: Camel Case Value
< X-Apigee-proxy: /organizations/apigee-hybrid-internal/environments/test/apiproxies/notarget/revisions/1
< X-Apigee-proxy-basepath: /notarget
< Content-Length: 957
< 

{
      "APIGEE_DPCOLOR":"165",
       "APIGEE_REGION":"us-east1",
           "messageid":"1eef502f-3e3d-4643-bf62-adbc434fd2956",
         "system.uuid":"fe7dee3e-f7f4-4b77-afa4-6c9beb675aab",
  "system.region.name":"us-east1",
   "organization.name":"apigee-hybrid-internal",
    "environment.name":"test",
    "virtualhost.name":"default",
     "x-forwarded-for":"",
         "client.host":"10.176.1.7",
           "client.ip":"10.176.1.7",
     "proxy.client.ip":"10.176.1.7",
       "apiproxy.name":"notarget",
   "apiproxy.revision":"1",
         "request.uri":"/notarget",
      "proxy.basepath":"/notarget",
    "proxy.pathsuffix":"",
     "request.version":"1.1",
 "request.transportid":"http",
             "request":"GET http://10.180.3.58:8443/notarget",
                "time":"Mon, 21 Mar 2022 13:17:11 GMT",
           "timestamp":"1647868631016",
     "time_now_utc_ms":"2022-03-21 13-17-11",
  "time_utc_ala_carte":"2022-3-21T13:17:11Z"
}

However, from a local VM on the same network, I'm not able to access the proxy through the load balancer. The request just hangs until it times out:

$ curl -v https://test-10-200-0-2.nip.io/notarget
*   Trying 10.200.0.2...
* TCP_NODELAY set
* connect to 10.200.0.2 port 443 failed: Connection timed out
* Failed to connect to test-10-200-0-2.nip.io port 443: Connection timed out
* Closing connection 0
curl: (7) Failed to connect to test-10-200-0-2.nip.io port 443: Connection timed out

What steps can I take to debug / fix this? Should I be able to connect to the istio-ingressgateway ClusterIP 10.180.11.67?
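
As a possible starting point for debugging, here is a minimal sketch. The hostname test-10-200-0-2.nip.io resolves to 10.200.0.2 (as the curl output shows) because nip.io maps the dash-separated IP embedded in the name. The helper names nip_ip and check_ingress_path are hypothetical, and the checks in check_ingress_path are only suggestions; that function is defined but not run here because it needs gcloud and kubectl access to the project. Note that a ClusterIP is normally only reachable from inside the cluster, so a VM on the VPC would need a load balancer frontend listening on the reserved IP.

```shell
#!/bin/sh
# Hypothetical helper: extract the IPv4 address embedded in a dash-style
# nip.io hostname that has a prefix, e.g. test-10-200-0-2.nip.io.
# Prints nothing when the name does not match that shape.
nip_ip() {
  printf '%s\n' "$1" | sed -n 's/^.*-\([0-9]\{1,3\}\)-\([0-9]\{1,3\}\)-\([0-9]\{1,3\}\)-\([0-9]\{1,3\}\)\.nip\.io$/\1.\2.\3.\4/p'
}

# Suggested checks only; defined but not invoked in this sketch.
check_ingress_path() {
  ip=$(nip_ip "$1")
  # Is any forwarding rule (load balancer frontend) bound to that IP?
  gcloud compute forwarding-rules list --filter="IPAddress=$ip"
  # Which Ingress objects and Service types exist for the gateway?
  kubectl get ingress --all-namespaces
  kubectl get svc -n istio-system
}

nip_ip test-10-200-0-2.nip.io
```

If no forwarding rule is listed for the address, the reserved IP has nothing listening on it, which would be consistent with the connection timeout above.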

danistrebel commented 2 years ago

Thanks for the report Kurt! There are multiple things in this issue that I included in the PR above:

kurtkanaskie commented 2 years ago

Thanks Dan, I knew that about the ILB, now that you mentioned it :) This gave me a chance to test destroying.

I ran ./destroy-runtime-gke.sh and saw some issues:

  1. There's no ask_confirm as I was expecting; that should certainly be there.
  2. The first ERROR is OK, that's for the global address.
  3. The second ERROR is because the load balancer is not deleted and therefore the cert can't be deleted.
  4. Not sure about the last 2 ERRORs.

$ ./destroy-runtime-gke.sh
πŸ“ Setting Config Parameters (Provide your own or defaults will be applied)
🔧 Configuring GCP Project
- Project ID apigee-hybrid-internal
Updated property [core/project].
- Analytics Region us-east1
Updated property [compute/region].
Updated property [compute/zone].
- Compute Location us-east1/us-east1-b
- Network apigee-hybrid/default

🔧 Apigee hybrid Configuration:
- Ingress type internal
- TLS Certificate google-managed
- GKE Node Type e2-standard-4
- Apigeectl version 1.6.5
- kpt version v0.34.0
- Cert Manager version v1.2.0
- ASM version 1.9
- 🍏 Using macOS binaries

🔧 Derived config parameters
- GCP Project apigee-hybrid-internal
- Workload Pool apigee-hybrid-internal.svc.id.goog
- Mesh ID proj-304474196495
- Ingress IP 10.200.0.2
- Nameserver ns-gcp-private.googledomains.com.
- Script root from: /Users/kurtkanaskie/work/APIGEEX/apigee-hybrid-internal/devrel/tools/hybrid-quickstart
πŸ—‘οΈ Delete Apigee hybrid cluster
The following clusters will be deleted.
 - [apigee-hybrid] in [us-east1]

Do you want to continue (Y/n)?  
Deleting cluster apigee-hybrid...
..........done.
Deleted [https://container.googleapis.com/v1/projects/apigee-hybrid-internal/zones/us-east1/clusters/apigee-hybrid].
Deleted [https://www.googleapis.com/compute/v1/projects/apigee-hybrid-internal/zones/us-east1-b/disks/gke-apigee-hybrid-748c-pvc-992ab737-d6af-4188-b606-d0f60a9d26d3].
✅ Apigee hybrid cluster deleted
🗑️ Clean up Networking
Deleted [https://www.googleapis.com/compute/v1/projects/apigee-hybrid-internal/regions/us-east1/addresses/apigee-ingress-ip].
ERROR: (gcloud.compute.addresses.delete) Could not fetch resource:
 - The resource 'projects/apigee-hybrid-internal/global/addresses/apigee-ingress-ip' was not found

No global IP address
Imported record-sets from [empty-file] into managed-zone [apigee-dns-zone].
Created [https://dns.googleapis.com/dns/v1/projects/apigee-hybrid-internal/managedZones/apigee-dns-zone/changes/2].
ID  START_TIME                STATUS
2   2022-03-22T09:53:47.994Z  pending
Deleted [https://dns.googleapis.com/dns/v1/projects/apigee-hybrid-internal/managedZones/apigee-dns-zone].
ERROR: (gcloud.compute.ssl-certificates.delete) Could not fetch resource:
 - The ssl_certificate resource 'projects/apigee-hybrid-internal/global/sslCertificates/mcrt-0e19b49b-f1d6-44fb-ab83-60fca43b8a4a' is already being used by 'projects/apigee-hybrid-internal/global/targetHttpsProxies/k8s2-um-a1m2547j-istio-syste39-target-proxy'

✅ Apigee networking cleaned up
✅ Tooling and Config removed
ERROR: (gcloud.iam.service-accounts.keys.list) NOT_FOUND: Unknown service account
ERROR: (gcloud.iam.service-accounts.keys.list) NOT_FOUND: Unknown service account
✅ SA keys deleted
✅ ✅ ✅ Clean up completed
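
For the first point in the list above, a confirmation prompt before the destructive steps could look roughly like this. The name ask_confirm comes from the comment above, but the body here is a hypothetical stand-in, not the repository's implementation. It treats anything other than an explicit yes as a refusal, including empty input.

```shell
#!/bin/sh
# Hypothetical stand-in for ask_confirm: returns 0 only on an explicit yes.
ask_confirm() {
  # Prompt goes to stderr so it does not mix with piped stdout.
  printf '%s [y/N] ' "${1:-Continue?}" >&2
  # End of input without an answer counts as "no".
  read -r reply || return 1
  case "$reply" in
    y|Y|yes|YES) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: only proceed when the answer is yes (answer piped in for the demo).
if printf 'y\n' | ask_confirm "Delete cluster apigee-hybrid and its networking?"; then
  echo "proceeding with clean up"
fi
```

Defaulting to "no" is a deliberate choice for a destroy script: pressing Enter by accident leaves the cluster untouched.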
kurtkanaskie commented 2 years ago

Thanks for the quick fix, I just tested your PR and it works!