benofben closed this 7 years ago
Suggestions from Kyle on where to get this info: https://cloud.google.com/deployment-manager/runtime-configurator/ https://cloud.google.com/compute/docs/storing-retrieving-metadata
rallyPrivateDNS - The metadata service doesn't have a way to get the IGM. I'm unclear on how to use the runtime configurator to resolve this issue.
nodePrivateDNS - I'm struggling with metadata service syntax using curl. I'll fiddle with this some more.
I've tried creating a runtime config, but am unsure how to structure the variables in it. I sent an email to Chris asking for help.
Progress. We're creating it now...
Notes from Chris on how to do this:
So in the deployment you have the resource and you set the node count.
In the startup script you should have each VM write its name to a variable. Using gcloud this would look like:
gcloud beta runtime-config configs variables set node-list/nodeA nodea --is-text --config-name [resource config name]
Then the script would read the node count:
gcloud beta runtime-config configs variables get-value node-count --config-name [resource config name]
Then you need to read the list of nodes:
gcloud beta runtime-config configs variables list --filter=node-list --config-name [resource config name]
The last command above gives you the list of all nodes that have registered so far. Just sleep and loop until you have the full list.
Sort and pick the first one.
You may have to do something to install gcloud on the VMs though.
Or you can call the api directly instead of using gcloud.
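The flow Chris describes (register, read the count, poll the list, sort, pick the first) might be sketched roughly as below. The function names and the CONFIG_NAME variable are hypothetical, and the `--format="value(name)"` flag is an assumption added to get one name per line; the other gcloud arguments are copied from the notes above. Treat it as a sketch, not a tested startup script.

```shell
#!/usr/bin/env bash
# Sketch only: function names and CONFIG_NAME are hypothetical placeholders.
CONFIG_NAME="${CONFIG_NAME:-my-config}"

# Register this VM under node-list/<name> (command taken from the notes).
register_node() {
  local name="$1"
  gcloud beta runtime-config configs variables set "node-list/${name}" "${name}" \
    --is-text --config-name "${CONFIG_NAME}"
}

# Read the expected node count that the deployment set.
read_node_count() {
  gcloud beta runtime-config configs variables get-value node-count \
    --config-name "${CONFIG_NAME}"
}

# List the nodes registered so far, one per line.
# Assumption: --format="value(name)" yields bare variable names.
list_registered_nodes() {
  gcloud beta runtime-config configs variables list --filter=node-list \
    --config-name "${CONFIG_NAME}" --format="value(name)"
}

# Sort the names read from stdin and print the first one.
pick_rally_node() {
  LC_ALL=C sort | head -n 1
}

# Sleep and loop until the expected number of nodes has registered,
# then print the rally node (first name in sorted order).
wait_for_rally_node() {
  local expected="$1" interval="${2:-5}" nodes count
  while true; do
    nodes="$(list_registered_nodes)"
    count="$(printf '%s\n' "${nodes}" | grep -c . || true)"
    if [ "${count}" -ge "${expected}" ]; then
      printf '%s\n' "${nodes}" | pick_rally_node
      return 0
    fi
    sleep "${interval}"
  done
}
```

In a startup script this would be used along the lines of `register_node "$(hostname)"` followed by `wait_for_rally_node "$(read_node_count)"`. Because every node sorts the same list, they all arrive at the same rally node without further coordination.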
So, we're setting nodecount now. Let's try reading it from the script...
This relies on gcloud beta. Apparently we can only install this with auth preconfigured on deb 7, so we're using that now. Going to move back to Ubuntu 14 at a later date.
eesh.... we're installing the gcloud beta on deb 7 successfully now. That works as long as it happens in the startup script. However, the beta isn't authenticated, so it can't query the runtime config. I'm not sure this approach is going to work. Just sent an email to Kyle asking for help.
Ok. Apparently the gcloud beta is not the way to go. We're back to trying this with curl. Suggestion from Kyle is:
To bypass the auth/upgrade/configuration issues with gcloud I think it's time to give curl (or python/whatever) a shot and manage the access token directly. If the instances are created with a service account you can actually grab the authorization token from the compute instance metadata:
From there, I think you could use the API instructions in the Runtime Config documentation to make your requests:
It should flow the same as your gcloud commands, but we'll be removing a variable we don't have any control over (how (often) gcloud is authenticating).
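Kyle's suggestion could look roughly like the following. The endpoint shape (`runtimeconfig.googleapis.com/v1beta1/projects/.../configs/.../variables`) is an assumption based on the Runtime Config API docs and should be checked against them; the function names and the project and config arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: the base URL and path layout are assumptions to verify
# against the Runtime Config API documentation.
RUNTIMECONFIG_API="https://runtimeconfig.googleapis.com/v1beta1"

# Build the URL of the variables collection for a project and config.
variables_url() {
  local project="$1" config="$2"
  printf '%s/projects/%s/configs/%s/variables\n' \
    "${RUNTIMECONFIG_API}" "${project}" "${config}"
}

# GET a URL with a bearer token; --fail makes curl return non-zero on HTTP errors.
runtimeconfig_get() {
  local token="$1" url="$2"
  curl -s --fail -H "Authorization: Bearer ${token}" "${url}"
}
```

For example, `runtimeconfig_get "$ACCESS_TOKEN" "$(variables_url my-project my-config)/node-count"` would be the curl counterpart of the `get-value node-count` command, with the token passed explicitly instead of relying on however gcloud authenticates.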
I’m trying to use curl to grab a token. I tried running the commands manually in an ssh session and got:
ben_lackey@ben2-cluster1-group1-instance-0kmg:/var/log$ curl -s -H "Metadata-Flavor:Google" http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token
{"error":"invalid_request","error_description":"Service account not enabled on this instance"}
Given the “service account” error, I was thinking running as a startup script might work. I tried running this from the startup script:
ACCESS_TOKEN=$(curl -s -H "Metadata-Flavor:Google" http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token | awk -F\" '{ print $4 }')
echo ACCESS_TOKEN: $ACCESS_TOKEN
That gave me this: Jun 8 16:37:08 ben2-cluster1-group1-instance-0kmg startupscript: ACCESS_TOKEN: invalid_request
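The awk field split prints the fourth quote-delimited field no matter what the response is, which is why the error body shows up as ACCESS_TOKEN: invalid_request. A slightly more defensive sketch is below. It assumes a successful response is a JSON object with an `access_token` key; the function names are hypothetical, and the curl call is the same one used above.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical.
METADATA_TOKEN_URL="http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"

# Fetch the raw token response from the metadata server.
fetch_token_json() {
  curl -s -H "Metadata-Flavor:Google" "${METADATA_TOKEN_URL}"
}

# Read a JSON response on stdin and print the access_token value.
# Returns non-zero (and logs the response to stderr) if the key is absent,
# for example when the server returns an error object.
extract_access_token() {
  local json token
  json="$(cat)"
  token="$(printf '%s\n' "${json}" \
    | sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')"
  if [ -z "${token}" ]; then
    printf 'no access_token in metadata response: %s\n' "${json}" >&2
    return 1
  fi
  printf '%s\n' "${token}"
}
```

With this, `ACCESS_TOKEN="$(fetch_token_json | extract_access_token)" || exit 1` stops the startup script when the instance has no service account enabled, rather than carrying on with a bogus token.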
I’m not really clear on how these curl commands would be run by a service account in the DM. It looks like that may be the topic of our discussion tomorrow.
Doc on service accounts is here: https://cloud.google.com/compute/docs/access/create-enable-service-accounts-for-instances
This is mostly working. We're getting a 302 on the curl call because of DoS protection.
Seems resolved. Diagnosing the remaining failures as a carry-over of the DDoS protection issue. Closing this...
(1) Determine which IGM our rally point is in
(2) Determine the first node of that IGM
(3) Rally on that
Needs to be handled in the startup script. Likely a REST call.