weaveworks / service

☁️ Images for Weave Cloud (R) (TM) (C) ☁️
https://cloud.weave.works

Update GCP to use new Procurement API - Legacy API shutdown end of Feb 2019 #2367

Closed rndstr closed 5 years ago

rndstr commented 6 years ago

New API can be found at https://cloud.google.com/marketplace/docs/partners/integrated-saas/

Old implementation was based on this https://drive.google.com/open?id=0BzGS0TkuqfyZOGlGSGxzRFlieGtzY2V0RW83WmZSV2p6YUdN

rndstr commented 6 years ago

20181105 Meeting with Google (Tate)

Major differences:

Procurement API

PubSub API

Signup flow/SSO

ngehani commented 5 years ago

How long do you estimate this change will take, @rndstr? Is this impacting people using GCP Marketplace to create Weave Cloud instances only (low priority for us)?

rndstr commented 5 years ago

5d with relatively low complexity/risk. (8 story points)

Is this impacting people using GCP Marketplace to create Weave Cloud instances only

yes. well, not only creating but also using so with regard to the deadline this is high priority.

ngehani commented 5 years ago

The deadline is Feb 2019, so this can wait until after the 1st of the year. There aren't that many people using this, from what SG says. It's in the backlog; we can add it to Next after the 1st of the year.

ngehani commented 5 years ago

@rndstr By using, you mean this will affect existing instances that are using GCP billing?

rndstr commented 5 years ago

By using, you mean this will affect existing instances that are using GCP billing?

yes

guyfedwards commented 5 years ago

I am trying to take a look at this but could do with some initial knowledge transfer with regard to billing. @rndstr or @marccarre, are you able to give me some insight?

rndstr commented 5 years ago

@guyfedwards thanks! wrt billing, usage is uploaded in https://github.com/weaveworks/service/blob/master/billing-uploader/job/usage/gcp.go

then there is the gcp-launcher-webhook service (https://github.com/weaveworks/service/tree/master/gcp-launcher-webhook) that receives account/subscription updates. note that https://github.com/weaveworks/service/tree/master/gcp-launcher-webhook/handler is dead code, written in anticipation of moving to the new API (it uses the term 'entitlement'). it's not required for the migration, so feel free to delete it.

clients for the APIs can be found in https://github.com/weaveworks/service/tree/master/common/gcp

sso through gcp (note there is also the default login one we still want to support and it's in the same file/package) is taken care of in https://github.com/weaveworks/service/blob/master/users/login/google.go

we can also pair on this for an hour or two.

rndstr commented 5 years ago

also, for the actual migration questions we can do meetings with Google (Tate, tmandel@google.com) at any time

rndstr commented 5 years ago

i found this doc: GCP Launcher links, demo, and local development which is inside the GCP Launcher Integration folder

marccarre commented 5 years ago

Sorry, late to the thread, so not sure what question remains outstanding as there were related discussions in Slack, but in any case:

More context, for posterity: ~1.5 years ago, it was decided GCP would have a marketplace, and it was decided Weave Cloud would appear there, so that users could easily subscribe to Weave Cloud via GCP (and be billed through it). "Usual story" applies: this was rushed and Google, at the time, couldn't provide us two environments (one to test, and one to run in production), so we manually tested against our only environment during the first iteration of the GCP integration.

ngehani commented 5 years ago
  • GCP, I suggest to tackle this first, to make further iterations easier & more robust.

We don't plan to do any more enhancements to GCP related work for now other than this issue.

marccarre commented 5 years ago

Sure, but testing this manually, in production is ineffective, risky, and poor on many levels. (The very fact that testing it needs to be explained to new engineers is yet another "smell" of the current state.)

Anyway, this is getting off topic, so maybe this should be discussed in a different issue. 🙂

rade commented 5 years ago

Who is actually working on this? Nobody is assigned :(

Roli's estimate was

5d with relatively low complexity/risk

Looks to me like we are already way above that, and AFAIK we haven't written a single line of code yet. cc @miklosp

Also, @ngehani did you consider shutting down the GCP integration and offering affected users a migration to ordinary WC subscriptions? It seems to me that would be less effort and a better long term solution.

cPu1 commented 5 years ago

@rade: @guyfedwards suggested that I look into this and I spent yesterday trying to test the GCP sign-up flow but it didn't entirely work because the test billing accounts are all closed. I was planning to look into the new Procurement API today and the changes required in code, but if we're planning to shut down the GCP integration, I'm not sure if I should.

We also have a call today for a discussion around this issue.

guyfedwards commented 5 years ago

@cPu1 and I have a call with @rndstr this afternoon (morning SF) to discuss the local billing setup, to try and get something that we can test against. Maybe it would be good to have a more general meeting about options and the current state of play?

@rade @ngehani @miklosp

rndstr commented 5 years ago

from today's meeting, TODO:

@marccarre

@rndstr

@guyfedwards @cPu1

marccarre commented 5 years ago

GCP instances that are tied to a corporate billing account

I've got a few GCP instances remaining, and I've invited you just in case, @rndstr, @guyfedwards, @cPu1, but not sure that will be of any use:

rndstr commented 5 years ago

Some more resources surfaced:

Python client https://github.com/googlecodelabs/gcp-marketplace-integrated-saas/blob/master/impl/step_5_entitlement_cancel/app.py

Step by step tutorial using the python client https://codelabs.developers.google.com/codelabs/gcp-marketplace-integrated-saas/#0

rndstr commented 5 years ago

I think we should now focus on the new API instead of trying to make the old one work.

guyfedwards commented 5 years ago

I have created a billing account called 'billing-test' and linked it to the 'weaveworks-dev' project.

Steps were:

  1. Get company card
  2. Get billing permissions for weave.works org in GCP (I hassled in #engineering, other methods available)
  3. Create the billing account with the card
  4. Go to: 'weaveworks-dev' project > Billing > Link billing account and choose the new billing account just created.
  5. Add other users who might need access to it as 'Billing admins' for that billing account

rndstr commented 5 years ago

I went to the weaveworks-dev weave-cloud solution (where the Procurement API is enabled): https://console.cloud.google.com/marketplace/details/weaveworks-dev/weave-cloud

And then hit subscribe on one of the plans it created an account and forwarded me to POST https://c.w.w/subscribe-via/gcp?gcpAccountId=E-8A9F-4E2B-32D7-AC96 so that's the account i'll hardcode for now to not approve entitlements during signup so we can get that tested/running.

~Once it forwarded into a popup, it returned a 403 (note that this url is handled in the frontend). But opening it by itself in a new tab worked, so I wonder~

Edit: it is a POST request, still not sure why there is a 403 (i would've expected a 404 or 405)

cPu1 commented 5 years ago

b) i thought we are getting a post request with JWT but i need to confirm this from doc

That's correct. The frontend integration doc mentions the sign-up will be a POST request with JWT.

When users choose your solution from GCP Marketplace, they are directed to a sign-up page that you create. In this sign-up page, they create an account in your system.

When users click the link to sign up, Google sends an HTTP POST request to your sign-up page, and sends a JSON Web Token (JWT) in the x-gcp-marketplace-token parameter. The JWT contains the user's procurement account ID, which identifies them as a Google Cloud Platform user. You must use this ID to link the user's Google account to their account in your system.

And then hit subscribe on one of the plans it created an account and forwarded me to GET https://c.w.w/subscribe-via/gcp?gcpAccountId=E-8A9F-4E2B-32D7-AC96 so that's the account i'll hardcode for now to not approve entitlements during signup so we can get that tested/running.

Doesn't this mean that it's somehow still using the old integration?

rndstr commented 5 years ago

Doesn't this mean that it's somehow still using the old integration?

It actually is a POST request, I looked in the wrong place and assumed a refresh would trigger a confirmation dialog (which it didn't).

The full request:

curl 'https://cloud.weave.works/subscribe-via/gcp?gcpAccountId=E-8A9F-4E2B-32D7-AC96' \
  -H 'Origin: https://cloud.weave.works' \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  -H 'Referer: https://console.cloud.google.com/marketplace/createAccount/weaveworks-dev/weave-cloud?flavor=standard&project=weaveworks-dev&folder&organizationId=36144081350' \
  -H 'Cookie:<redacted>' \
  --data 'x-gcp-marketplace-token=eyJhbGciOiJSUzI1NiIsImtpZCI6IjhmZTU5ZmYwOGY0NDE2YjkxOTJjMDJhMzA1NmQ2ODFjZTUxYzZkOGMiLCJ0eXAiOiJKV1QifQ.eyJpYXQiOjE1NTA3NzMzMDAsImV4cCI6MTU1MDc3MzYwMCwiYXVkIjoiY2xvdWQud2VhdmUud29ya3MiLCJzdWIiOiJFLThBOUYtNEUyQi0zMkQ3LUFDOTYiLCJpc3MiOiJodHRwczovL3d3dy5nb29nbGVhcGlzLmNvbS9yb2JvdC92MS9tZXRhZGF0YS94NTA5L2Nsb3VkLWNvbW1lcmNlLXBhcnRuZXJAc3lzdGVtLmdzZXJ2aWNlYWNjb3VudC5jb20ifQ.TPz_dwWueJk1B5Idxfkhtlie-AkfK0v8zmpaskzFmqIGWgHb2PmWsJX-6g5ZcxHjwO6geGgepMitxfJtF7ThtJOKJFHjcoln-DxJfx276pyB1d3kOeWa_SUzDUiA6i31fhvcg3rAxf-he3SRUwduem0kLDMCPFWxPIFqD2Iqd7Wnfq-9VxncTCdWQDrwN5sHYt9t6LVepeGrF-fpxg_L3nO_aVhP7eOThr2PjdOkvpqgYe8A6aRAMClFlpPZbxlxlH1QF7pLIDz6juPp5Hfuqrn3KgUz_zTzN12GEYBELC0rwbJibLFHzho4Tp7am3FKbiPPKKPGH-FFUg9AhFMcvA'

I believe the gcpAccountId is still passed in for backward compatibility, but they recommend using the JWT instead IIRC.

The x-gcp-marketplace-token b64 decodes to

{"alg":"RS256","kid":"8fe59ff08f4416b9192c02a3056d681ce51c6d8c","typ":"JWT"}

rndstr commented 5 years ago

Err, so the token appears to have 3 pieces separated by . (i only showed the first one in the prev comment)

{"alg":"RS256","kid":"8fe59ff08f4416b9192c02a3056d681ce51c6d8c","typ":"JWT"}
{
  "iat":1550773300,
  "exp":1550773600,
  "aud":"cloud.weave.works",
  "sub":"E-8A9F-4E2B-32D7-AC96",
  "iss":"https://www.googleapis.com/robot/v1/metadata/x509/cloud-commerce-partner@system.gserviceaccount.com"
}

and the last one is the signature.

the data can be easily extracted at https://jwt.io

cPu1 commented 5 years ago

The 403 error from POST /subscribe-via/gcp is from ~S3~ authfe. To keep the changes minimal, one approach to implement this is to have ~ui-server~ authfe handle POST requests to /subscribe-via/gcp and transform it into a GET request with the JWT URL-encoded in the query string (URL-encoding is required since some characters in Base64 aren't URL-safe), or the raw request body could be added to the query string directly as the Content-Type is application/x-www-form-urlencoded.

The downside of redirecting to GET is the JWT appearing in the query string and server logs showing up the JWT as part of the URL.

ngehani commented 5 years ago

@rndstr - Did you contact Google to confirm the soft or hard shutdown for the subscriptions API?

rndstr commented 5 years ago

@ngehani i did ask them in https://groups.google.com/a/weave.works/d/msg/gcp-launcher/JorLwtWDc_Q/DbAZ-vuBAwAJ but didn't get any response. i might mention it again today.

Update: they asked us to be done by March 8

rndstr commented 5 years ago

Trying to get the local Google oauth login working, getting this:

ERRO[0003] GET /api/users/logins/google/attach: oauth state value did not match 

Edit: I skipped the CSRF token verification and login works:

diff --git a/users/login/oauth.go b/users/login/oauth.go
index 59606b28..071f3b54 100644
--- a/users/login/oauth.go
+++ b/users/login/oauth.go
@@ -117,9 +117,8 @@ func (a *OAuth) VerifyState(r *http.Request) (map[string]string, bool) {
        if err != nil {
                return nil, false
        }
-       token := state["token"]
        delete(state, "token")
-       return state, nosurf.VerifyToken(csrfToken(r), token)
+       return state, true
 }

ngehani commented 5 years ago

@rade - Sorry, I missed this comment.

Also, @ngehani did you consider shutting down the GCP integration and offering affected users a migration to ordinary WC subscriptions? It seems to me that would be less effort and a better long term solution.

rndstr commented 5 years ago

Migration plan

Preparation

DEV

  1. [x] create pubsub subscription manually if not already existing
  2. turn off autodeploy and lock: gcp-launcher-webhook, service, service-ui, authfe
  3. merge PRs https://github.com/weaveworks/service-ui/pull/3669 and #2512, then wait until the images are built
  4. in service-conf, create PR with cli arg changes and new image tags for deployments https://github.com/weaveworks/service-conf/pull/3022
  5. deploy by merging the service-conf PR

Rollback means reverting the service-conf PR. Note that the DB migrations will not be undone, so GCP communication stays broken after a rollback, which prevents subscribing and uploading usage. (The only migration renames the subscription states; GCP interactions are minimal, so stalling them for a couple of minutes is not a big deal.)

PROD

Otherwise, same as dev.