m-lab / etl-gardener

Gardener provides services for maintaining and reprocessing mlab data.
Apache License 2.0

Confirm obsolete documentation #373

Open stephen-soltesz opened 2 years ago

stephen-soltesz commented 2 years ago

Text below this line was originally copied from the etl-gardener README.md. However, I believe it is out of date and no longer necessary. Removed from

TODO(soltesz): b/c v2 gardener & parsers are co-located in the GKE cluster,
these steps may be overly complex for what is now needed.

Kubernetes Cluster and Network

Gardener provides a Jobs API service to the ETL parsers. So that the parsers
can access this service, we run Gardener in a GKE cluster with a custom internal
network and reserve a static IP address for the Gardener service at 10.100.1.2.

The network and firewall rules are set up manually using:

gcloud --project=mlab-sandbox \
  compute networks create data-processing --subnet-mode=custom \
  --description="Network for communication among backend processing services."

gcloud --project=mlab-sandbox compute firewall-rules create dp-allow-ssh \
  --network=data-processing --allow=tcp:22 --direction=INGRESS \
  --description='Allow SSH from anywhere'

gcloud --project=mlab-sandbox compute firewall-rules create \
  dp-allow-internal --network=data-processing \
  --allow=tcp:0-65535,udp:0-65535,icmp --direction=INGRESS \
  --source-ranges=10.128.0.0/9,10.100.0.0/16 \
  --description='Allow internal traffic from anywhere'

Then create the subnet and reserve the static IP address:

gcloud --project=mlab-sandbox \
  compute networks subnets create dp-gardener \
  --network=data-processing --range=10.100.0.0/16 \
  --enable-private-ip-google-access --region=us-east1 \
  --description="Subnet for gardener,etl,annotation-service. Subnet has the same name and address range across projects, but each is in a distinct (data-processing) VPC network."

gcloud --project=mlab-sandbox compute addresses create etl-gardener \
  --region=us-east1 --subnet=dp-gardener --addresses=10.100.1.2
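
The reserved address only makes sense if it falls inside the dp-gardener subnet range, and the dp-allow-internal rule only covers it if it falls inside one of the listed source ranges. The following is a minimal sketch of a local sanity check for that relationship. It does not call gcloud, and the helper names ip_to_int and ip_in_cidr are hypothetical, introduced here only for illustration.

```sh
#!/bin/sh
# Hypothetical helpers (not part of gardener or gcloud).

# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The body runs in a subshell so the IFS change does not leak.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed (exit status 0) if address $1 lies inside CIDR block $2.
ip_in_cidr() (
  network=${2%/*}
  prefix=${2#*/}
  ip_int=$(ip_to_int "$1")
  net_int=$(ip_to_int "$network")
  # Build the netmask for the prefix length, truncated to 32 bits.
  mask=$(( (4294967295 << (32 - prefix)) & 4294967295 ))
  [ $(( ip_int & mask )) -eq $(( net_int & mask )) ]
)

# Check the values used in the commands above.
if ip_in_cidr 10.100.1.2 10.100.0.0/16; then
  echo "10.100.1.2 is inside 10.100.0.0/16"
else
  echo "10.100.1.2 is outside 10.100.0.0/16"
fi
```

With the values from this document, the check reports that 10.100.1.2 is inside 10.100.0.0/16. The same address is not inside 10.128.0.0/9, so it is the second entry in --source-ranges that admits traffic from the Gardener subnet.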

Then create the cluster itself.

gcloud --project=mlab-sandbox container clusters create data-processing \
  --region=us-east1 --enable-autorepair --enable-autoupgrade \
  --network=data-processing --subnetwork=dp-gardener \
  --scopes=bigquery,taskqueue,compute-rw,storage-ro,service-control,service-management,datastore \
  --num-nodes 2 --image-type=cos --machine-type=n1-standard-4 \
  --node-labels=gardener-node=true --labels=data-processing=true