Google Kubernetes Engine Error: Connection Refused

🐛 Describe the bug

I'm attempting to set up a k8s cluster on GKE using the tutorial provided here. However, I found that many of the YAML files needed to be reworked, so I made some modifications and deployed them. I've also verified that torchserve is running inside the pod by curling localhost from within it. Frustratingly, the ports exposed by the LoadBalancer refuse every connection attempt. I don't believe this is caused by any GCP config on my end, because I've been able to get the GKE quickstart deployment serving on the same cluster through a different LoadBalancer.

The external IP is 35.192.6.11. It'll be up for a while as I try to figure this out. Luckily, Google has discounted the c3 instances to near nothing for the public preview.

I've uploaded my torchserve handler and config files and my GKE config files with a series of steps to reproduce.

Thank you very much for taking the time to look over this!!!!

Error logs

curl: (7) Failed to connect to port 8080 after 33 ms: Connection refused
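
For reference, curl exit code 7 means the TCP connection itself could not be established, which is different from a timeout (28) or an empty reply (52). The sketch below probes a list of ports and prints what each exit code means. The helper names `explain_curl_exit` and `probe_ports` are hypothetical, and only a few of curl's documented exit codes are mapped.

```shell
#!/usr/bin/env bash
# Translate a few of curl's documented exit codes into plain language.
explain_curl_exit() {
  case "$1" in
    0)  echo "connected and received a response" ;;
    7)  echo "failed to connect (for example, connection refused)" ;;
    28) echo "timed out" ;;
    52) echo "connected, but the server sent an empty reply" ;;
    *)  echo "other curl error, see EXIT CODES in the curl man page" ;;
  esac
}

# Probe each port on a host and report curl's exit code.
# Usage: probe_ports <host> <port> [<port> ...]
probe_ports() {
  local host="$1" port code
  shift
  for port in "$@"; do
    code=0
    curl --silent --output /dev/null --max-time 5 "http://${host}:${port}/" || code=$?
    echo "${port}: curl exit ${code}: $(explain_curl_exit "$code")"
  done
}

# Example (not run automatically): probe_ports 35.192.6.11 8080 8081 8082
```

Running the same probe from the home network and from the VPN would show whether the two paths fail in the same way on all three ports.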

Installation instructions

Install torchserve from source: no
Using docker: no

However, torchserve is serving fine on localhost.

Model Packaging

https://github.com/samuelzxu/trocr-serving

config.properties

default_workers_per_model=1
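
One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: the TorchServe documentation lists `http://127.0.0.1:8080`, `http://127.0.0.1:8081`, and `http://127.0.0.1:8082` as the default values of `inference_address`, `management_address`, and `metrics_address`. A server bound to loopback answers curl on localhost inside the pod but refuses connections arriving through a Service. The sketch below reports what a given config.properties would bind to; `check_bind_addresses` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Report the bind address configured for each TorchServe API in a
# config.properties file. Assumes the documented default of 127.0.0.1
# applies whenever a key is absent.
check_bind_addresses() {
  local config_file="$1"
  local key value
  for key in inference_address management_address metrics_address; do
    # Use the last uncommented assignment of the key, if there is one.
    value="$(grep -E "^[[:space:]]*${key}[[:space:]]*=" "$config_file" 2>/dev/null \
      | tail -n 1 | cut -d= -f2- | tr -d '[:space:]' || true)"
    if [ -z "$value" ]; then
      echo "${key}: not set (default binds to 127.0.0.1, unreachable from outside the pod)"
    elif printf '%s' "$value" | grep -Eq '//(127\.0\.0\.1|localhost)([:/]|$)'; then
      echo "${key}: ${value} (loopback only)"
    else
      echo "${key}: ${value}"
    fi
  done
}

# Example (not run automatically): check_bind_addresses ./config.properties
```

For the one-line config above, all three keys would be reported as not set. If that turns out to be the cause, setting `inference_address=http://0.0.0.0:8080` (with the management and metrics equivalents on 8081 and 8082) would be the usual thing to try.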

Versions

------------------------------------------------------------------------------------------
Environment headers
------------------------------------------------------------------------------------------
Torchserve branch: 

torchserve==0.7.1b20230208
torch-model-archiver==0.7.1b20230208

Python version: 3.10 (64-bit runtime)
Python executable: /home/ziggy/miniconda3/envs/ts-testing/bin/python

Versions of relevant python libraries:
numpy==1.24.2
psutil==5.9.0
requests==2.28.2
torch==2.0.0
torch-model-archiver==0.7.1b20230208
torch-workflow-archiver==0.2.7b20230208
torchserve==0.7.1b20230208
transformers==4.27.1
wheel==0.38.4
torch==2.0.0
**Warning: torchtext not present ..
**Warning: torchvision not present ..
**Warning: torchaudio not present ..

Java Version:

OS: Ubuntu 22.04.1 LTS
GCC version: (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
Clang version: 14.0.0-1ubuntu1
CMake version: version 3.22.1

Repro instructions

cd serve/kubernetes/GKE
gcloud config set compute/region us-west1
gcloud config set compute/zone us-west1-a

gcloud compute disks create --size=200GB --zone=us-west1-a nfs-disk
gcloud container clusters create torchserve --machine-type c3-standard-4 --num-nodes 2
gcloud container clusters get-credentials openmind
cd GKE
helm install mynfs ./nfs-provisioner/

kubectl get svc -n default mynfs-nfs-provisioner -o jsonpath='{.spec.clusterIP}'
# Copy the IP address over, and then...

kubectl apply -f templates/pv_pvc.yaml -n default
kubectl apply -f templates/pod.yaml

kubectl exec --tty pod/model-store-pod -- mkdir /pv/model-store/
kubectl cp ./trocr-handwritten.mar model-store-pod:/pv/model-store/trocr-handwritten.mar

kubectl exec --tty pod/model-store-pod -- mkdir /pv/config/
kubectl cp ./config.properties model-store-pod:/pv/config/config.properties

kubectl exec --tty pod/model-store-pod -- ls -lR /pv/
kubectl delete po model-store-pod

cd ../Helm
helm install ts .

Possible Solution

nmap output:


$ nmap -Pn 35.192.6.11
Starting Nmap 7.80 ( https://nmap.org ) at 2023-03-20 17:38 EDT
Nmap scan report for 11.6.192.35.bc.googleusercontent.com (35.192.6.11)
Host is up (0.031s latency).
Not shown: 997 filtered ports
PORT     STATE  SERVICE
8080/tcp closed http-proxy
8081/tcp closed blackice-icecap
8082/tcp closed blackice-alerts

Nmap done: 1 IP address (1 host up) scanned in 8.87 seconds


- VPN: curl to port 8080 returns an empty response when I'm connected to a VPN, but connection refused when I'm on my home network.
- VPC Firewall changes: allowing connections to `http-server` and `https-server` on ports 8080-8082 with priority 0.
- Configuration changes: reinstalling, changing region, changing compute resource

I believe the fault might lie in some of the configuration in the .yaml files (perhaps the LoadBalancer service?), since everything else seems to be working fine, but I haven't been able to track it down.
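
A minimal sketch for narrowing down whether the LoadBalancer Service is the problem: it lists the Service's port mappings and selector, the endpoints behind it, and the labels on the running pods. The function name `inspect_service` is hypothetical, and the Service name passed in the example (`torchserve`) is an assumption about what the Helm chart creates; substitute whatever `kubectl get svc` actually shows.

```shell
#!/usr/bin/env bash
# KUBECTL can be overridden (for example KUBECTL=echo) to print the
# commands instead of running them against a cluster.
KUBECTL="${KUBECTL:-kubectl}"

# Usage: inspect_service <service-name> [<namespace>]
inspect_service() {
  local svc="$1" ns="${2:-default}"
  # External IP, port mappings, and the label selector the Service uses.
  "$KUBECTL" get svc "$svc" -n "$ns" -o wide
  # Pod IP:port pairs behind the Service. An empty list means the
  # selector matches no ready pod.
  "$KUBECTL" get endpoints "$svc" -n "$ns"
  # Labels on the running pods, to compare against the selector above.
  "$KUBECTL" get pods -n "$ns" --show-labels
}

# Example (not run automatically): inspect_service torchserve default
```

If the endpoints list is populated and the ports match, the Service wiring is probably fine and the next place to look is what address the server inside the pod is listening on.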
