GoogleCloudPlatform / gsutil

A command line tool for interacting with cloud storage services.
Apache License 2.0

frequent intermittent "Your "OAuth 2.0 Service Account" credentials are invalid" error message since gsutil 4.27 #446

Open ixdy opened 7 years ago

ixdy commented 7 years ago

In our CI test jobs, we activate a service account with gcloud auth activate-service-account --key-file=[file] and then do a bunch of gcloud and gsutil operations.

This usually works, but we've started seeing a fairly high rate of failures from a gsutil -qm cp -r /dir1 /dir2 /file1 /file3 ... gs://a/path call, and the failures seem to correlate with when we updated to gsutil 4.27.

The failures either look like

I0720 18:45:12.859] Your "OAuth 2.0 Service Account" credentials are invalid. Please run
I0720 18:45:12.860]   $ gcloud auth login
I0720 18:45:12.860] Duplicate type [0:0:4]
I0720 18:45:24.339] CommandException: 1 file/object could not be transferred.

or

I0725 20:21:47.240] Your "OAuth 2.0 Service Account" credentials are invalid. Please run
I0725 20:21:47.240]   $ gcloud auth login
I0725 20:21:47.242] Type position out of range
I0725 20:21:47.246] Your "OAuth 2.0 Service Account" credentials are invalid. Please run
I0725 20:21:47.247]   $ gcloud auth login
I0725 20:21:47.247] Type position out of range
I0725 20:21:58.707] CommandException: 2 files/objects could not be transferred.

The Duplicate type [0:0:4] and Type position out of range errors here seem very suspicious.

x-ref https://github.com/kubernetes/kubernetes/issues/49320

ixdy commented 7 years ago

output of gcloud info (this is inside a docker image, hence the weird paths):

Google Cloud SDK [163.0.0]

Platform: [Linux, x86_64] ('Linux', '7e9ec6cde4d1', '4.4.0-83-generic', '#106~14.04.1-Ubuntu SMP Mon Jun 26 18:10:19 UTC 2017', 'x86_64', '')
Python Version: [2.7.9 (default, Jun 29 2016, 13:08:31)  [GCC 4.9.2]]
Python Location: [/usr/bin/python2]
Site Packages: [Disabled]

Installation Root: [/google-cloud-sdk]
Installed Components:
  kubectl: []
  core: [2017.07.17]
  gcloud: []
  beta: [2017.03.24]
  gsutil: [4.27]
  bq: [2.0.24]
  alpha: [2017.03.24]
System PATH: [/google-cloud-sdk/bin:/workspace:/go/bin:/usr/local/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin]
Python PATH: [/google-cloud-sdk/lib/third_party:/google-cloud-sdk/lib:/usr/lib/python2.7/:/usr/lib/python2.7/plat-x86_64-linux-gnu:/usr/lib/python2.7/lib-tk:/usr/lib/python2.7/lib-old:/usr/lib/python2.7/lib-dynl]
Cloud SDK on PATH: [True]
Kubectl on PATH: [/google-cloud-sdk/bin/kubectl]

Installation Properties: [/google-cloud-sdk/properties]
User Config Directory: [/root/.config/gcloud]
Active Configuration Name: [default]
Active Configuration Path: [/root/.config/gcloud/configurations/config_default]

Account: [None]
Project: [None]

Current Properties:
  [core]
    disable_prompts: [1]
    disable_usage_reporting: [True]

Logs Directory: [/root/.config/gcloud/logs]
Last Log File: [/root/.config/gcloud/logs/2017.07.25/07.48.00.717410.log]

git: [git version 2.1.4]
ssh: [OpenSSH_6.7p1 Debian-5+deb8u3, OpenSSL 1.0.1t  3 May 2016]

rmmh commented 7 years ago

Both "Duplicate type" and "Type position out of range" messages come from the pyasn1 library, which was updated to 0.2.3 in 88964576c9bdbd8549d54cd7beb1bd9829752504, which is in v4.27 but not v4.26.

houglum commented 7 years ago

Hmm... it's possible that the update to 0.2.3 could be causing this. Looking at CheckAndGetCredentials() in gcs_json_credentials.py, we end up invoking ServiceAccountCredentials.from_json_keyfile_dict() from oauth2client, which uses a crypt module that relies on pyasn1.

If you run gsutil with -DD, we'll dump stacktrace info from the failed attempt to get credentials. That output may help confirm or rule out pyasn1 as the cause. (If you post any such output here, please remember to strip out any OAuth tokens or other sensitive data.)
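For anyone pasting -DD output, a first-pass scrub could look like the sketch below. The patterns are assumptions about what token-bearing lines commonly look like (Authorization headers and JSON token fields), not a description of gsutil's exact debug format, and `redact` is a hypothetical helper name. Treat it as a first filter and still read the output by hand before posting.

```python
import re

# Each entry is (pattern, replacement). Group 1 keeps the field name and
# separator so the redacted line stays readable.
_PATTERNS = [
    # HTTP Authorization headers, e.g. "authorization: Bearer <token>"
    (re.compile(
        r"(authorization['\"]?\s*[:=]\s*['\"]?(?:Bearer|OAuth)\s+)[^\s'\"]+",
        re.IGNORECASE),
     r"\1REDACTED"),
    # JSON-style secret fields, e.g. "access_token": "<token>"
    (re.compile(
        r"(['\"](?:access_token|refresh_token|id_token|private_key)['\"]"
        r"\s*:\s*['\"])[^'\"]*",
        re.IGNORECASE),
     r"\1REDACTED"),
]


def redact(text):
    """Return text with anything matching the patterns above replaced."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    import sys
    # Example: gsutil -DD ls gs://some-bucket 2>&1 | python redact.py
    sys.stdout.write(redact(sys.stdin.read()))
```

The script form reads the debug output on stdin, so it can sit at the end of a pipe; the file name redact.py and the bucket name in the comment are placeholders.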

rmmh commented 7 years ago

Here's a (slightly) cleaned stacktrace from this failure:

$ /google-cloud-sdk/bin/gsutil -qm cp -r /go/src/k8s.io/kubernetes/_output/gcs-stage/* gs://kubernetes-release-pull/ci/pull-kubernetes-e2e-gce-etcd3/v1.8.0-alpha.2.906+848d415b0a1d31/
Your "OAuth 2.0 Service Account" credentials are invalid. Please run
  $ gcloud auth login
Traceback (most recent call last):
  File "/google-cloud-sdk/platform/gsutil/gslib/gcs_json_credentials.py", line 96, in CheckAndGetCredentials
    service_account_creds = _GetOauth2ServiceAccountCredentials()
  File "/google-cloud-sdk/platform/gsutil/gslib/gcs_json_credentials.py", line 198, in _GetOauth2ServiceAccountCredentials
    json_key_dict, scopes=DEFAULT_SCOPES, token_uri=provider_token_uri)
  File "/google-cloud-sdk/platform/gsutil/third_party/oauth2client/oauth2client/service_account.py", line 264, in from_json_keyfile_dict
    revoke_uri=revoke_uri)
  File "/google-cloud-sdk/platform/gsutil/third_party/oauth2client/oauth2client/service_account.py", line 196, in _from_parsed_json_keyfile
    signer = crypt.Signer.from_string(private_key_pkcs8_pem)
  File "/google-cloud-sdk/platform/gsutil/third_party/oauth2client/oauth2client/_pure_python_crypt.py", line 179, in from_string
    pkey_info = key_info.getComponentByName('privateKey')
  File "/google-cloud-sdk/platform/gsutil/third_party/pyasn1/pyasn1/type/univ.py", line 2004, in getComponentByName
    self._componentType.getPositionByName(name)
  File "/google-cloud-sdk/platform/gsutil/third_party/pyasn1/pyasn1/type/namedtype.py", line 176, in getPositionByName
    nameToPosIdx = self.__getNameToPosIdx()
  File "/google-cloud-sdk/platform/gsutil/third_party/pyasn1/pyasn1/type/namedtype.py", line 171, in __getNameToPosIdx
    raise error.PyAsn1Error('Duplicate name %s' % (n,))
PyAsn1Error: Duplicate name attributes

I used sed -i 's/logger.warn/logger.exception/' /google-cloud-sdk/platform/gsutil/gslib/gcs_json_credentials.py to get this trace.

rmmh commented 7 years ago

I traced the problem down to a non-thread-safe function in pyasn1. This happens for PKCS#8 private keys (the default for service account JSON) when the OpenSSL and pycrypto libraries are missing, and oauth2client falls back to the _pure_python_crypt.py code that uses pyasn1 to parse the key.

We could throw a lock around this code (maybe only in the case that it's using the fallback code?) or initialize credentials earlier (before forking for -m).
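To make the lock option concrete, here is a minimal sketch of guarding a lazily built lookup table so that concurrent first calls cannot see it half-built. This is not pyasn1's or gsutil's code: `LazyNameIndex`, `position_of`, and `lookup_from_many_threads` are hypothetical names, and the field names are only borrowed from the PKCS#8 structure for illustration.

```python
import threading


class LazyNameIndex(object):
    """Hypothetical stand-in for a lazily built name -> position lookup."""

    def __init__(self, names):
        self._names = list(names)
        self._index = None             # built on first lookup
        self._lock = threading.Lock()  # guards the one-time build

    def position_of(self, name):
        # Double-checked locking: take the lock only while the index is unbuilt.
        if self._index is None:
            with self._lock:
                if self._index is None:
                    index = {}
                    for pos, n in enumerate(self._names):
                        if n in index:
                            raise ValueError('Duplicate name %s' % (n,))
                        index[n] = pos
                    # Publish only the finished dict, so no other thread can
                    # observe a partially populated index.
                    self._index = index
        return self._index[name]


def lookup_from_many_threads(index, name, workers=8):
    """Call position_of concurrently and collect every result."""
    results = []

    def worker():
        results.append(index.position_of(name))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == '__main__':
    fields = ['version', 'privateKeyAlgorithm', 'privateKey', 'attributes']
    print(lookup_from_many_threads(LazyNameIndex(fields), 'privateKey'))
```

The same effect could be had without a lock by building the credentials once in the parent before any workers start, which is the second option above.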

houglum commented 7 years ago

Nice find, well done! If a newer version than v0.3.1 (i.e. a version that includes the fix that the pyasn1 maintainers mentioned in the above issue) isn't available by the time we release gsutil 4.28, we'll probably revert to bundling the version we had previously (v0.1.9).

In the meantime, since it looks like gsutil v4.26 was working for you, is it possible for you to pin to that version for your CI test jobs?

rmmh commented 7 years ago

We're using the simpler workaround of installing python-openssl in our CI environment, so the pure-python parsing in pyasn1 isn't used by oauth2client.
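A quick way to check whether an environment is exposed is to see which crypto modules are importable. The sketch below assumes oauth2client prefers pyOpenSSL (module `OpenSSL`), then PyCrypto (module `Crypto`), and only then the pure-Python path; `likely_signer_backend` is a hypothetical helper, and its answer is only meaningful when run with the same interpreter and module path that gsutil itself uses.

```python
def _can_import(name):
    """Return True if the named top-level module imports cleanly."""
    try:
        __import__(name)
        return True
    except ImportError:
        return False


def likely_signer_backend(can_import=_can_import):
    """Guess which signing backend oauth2client would select.

    Assumed preference order: pyOpenSSL, then PyCrypto, then the
    pure-Python fallback that parses keys with pyasn1.
    """
    if can_import('OpenSSL'):
        return 'openssl'
    if can_import('Crypto'):
        return 'pycrypto'
    return 'pure-python'


if __name__ == '__main__':
    print(likely_signer_backend())
```

If this prints pure-python, the environment would take the code path that hit the pyasn1 race.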

ttomsu commented 7 years ago

We've been hitting this exact bug for a few days now, too (great bug report @ixdy, and nice find @rmmh!). Any ETA on the fix?

houglum commented 7 years ago

We just released gsutil 4.28 today, which includes an updated version of pyasn1 that should contain the fix. It won't be available in gcloud-packaged installs until ~next week, but if you're installing from tarball or pip, you can go ahead and grab the newest version.