google-research / FLAN

Apache License 2.0

All attempts to get a Google authentication bearer token failed #40

Closed: Zcchill closed this issue 1 year ago

Zcchill commented 1 year ago

I created a new environment with Python 3.8, installed all of the packages listed in requirements.txt, and ran PYTHONPATH=. python flan/v2/run_example.py. However, the run failed with the following output:

2023-03-17 15:00:13.007874: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory
2023-03-17 15:00:13.007925: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1835] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2023-03-17 15:00:13.008442: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-03-17 15:00:16.014237: W tensorflow/core/platform/cloud/google_auth_provider.cc:184] All attempts to get a Google authentication bearer token failed, returning an empty token. Retrieving token from files failed with "Not found: Could not locate the credentials file.". Retrieving token from GCE failed with "Not found: Error executing an HTTP request: HTTP response code 410 with body '<html>
  <head>
    <meta http-equiv='refresh' content='0; url=http://metadata/' />
  </head>
</html>
'".
2023-03-17 15:01:17.240986: E tensorflow/core/platform/cloud/curl_http_request.cc:614] The transmission  of request 0x558cf720ec00 (URI: https://www.googleapis.com/storage/v1/b/t5-data/o/vocabs%2Fcc_all.32000%2Fsentencepiece.model?fields=size%2Cgeneration%2Cupdated) has been stuck at 0 of 0 bytes for 61 seconds and will be aborted. CURL timing information: lookup time: 0.046775 (No error), connect time: 0 (No error), pre-transfer time: 0 (No error), start-transfer time: 0 (No error)
2023-03-17 15:02:18.712983: E tensorflow/core/platform/cloud/curl_http_request.cc:614] The transmission  of request 0x558cf720ec00 (URI: https://www.googleapis.com/storage/v1/b/t5-data/o/vocabs%2Fcc_all.32000%2Fsentencepiece.model?fields=size%2Cgeneration%2Cupdated) has been stuck at 0 of 0 bytes for 61 seconds and will be aborted. CURL timing information: lookup time: 0.044402 (No error), connect time: 0 (No error), pre-transfer time: 0 (No error), start-transfer time: 0 (No error)
2023-03-17 15:03:20.425956: E tensorflow/core/platform/cloud/curl_http_request.cc:614] The transmission  of request 0x558cf720ec00 (URI: https://www.googleapis.com/storage/v1/b/t5-data/o/vocabs%2Fcc_all.32000%2Fsentencepiece.model?fields=size%2Cgeneration%2Cupdated) has been stuck at 0 of 0 bytes for 61 seconds and will be aborted. CURL timing information: lookup time: 0.032221 (No error), connect time: 0 (No error), pre-transfer time: 0 (No error), start-transfer time: 0 (No error)
Traceback (most recent call last):
  File "flan/v2/run_example.py", line 93, in <module>
    dataset = selected_mixture.get_dataset(
  File "xx/lib/python3.8/site-packages/seqio/dataset_providers.py", line 1278, in get_dataset
    self._check_compatible_features()
  File "xx/lib/python3.8/site-packages/seqio/dataset_providers.py", line 1235, in _check_compatible_features
    if task.output_features[name].vocabulary != feature.vocabulary:
  File "xx/lib/python3.8/site-packages/seqio/vocabularies.py", line 330, in __eq__
    their_md5 = hashlib.md5(other.sp_model).hexdigest()
  File "xx/lib/python3.8/site-packages/seqio/vocabularies.py", line 248, in sp_model
    self._load_model()
  File "xx/lib/python3.8/site-packages/seqio/vocabularies.py", line 216, in _load_model
    self._sp_model = f.read()
  File "xx/lib/python3.8/site-packages/tensorflow/python/lib/io/file_io.py", line 119, in read
    length = self.size() - self.tell()
  File "xx/lib/python3.8/site-packages/tensorflow/python/lib/io/file_io.py", line 98, in size
    return stat(self.__name).length
  File "xx/lib/python3.8/site-packages/tensorflow/python/lib/io/file_io.py", line 871, in stat
    return stat_v2(filename)
  File "xx/lib/python3.8/site-packages/tensorflow/python/lib/io/file_io.py", line 887, in stat_v2
    return _pywrap_file_io.Stat(compat.path_to_str(path))
KeyboardInterrupt

The installed packages are shown below (CUDA 11.6):

Package                  Version
------------------------ ---------------------
absl-py                  0.12.0
astunparse               1.6.3
attrs                    21.2.0
Babel                    2.9.1
cachetools               4.2.2
certifi                  2021.5.30
charset-normalizer       2.0.4
clang                    5.0
click                    8.0.1
colorama                 0.4.4
dill                     0.3.4
editdistance             0.5.3
filelock                 3.0.12
flatbuffers              1.12
frozendict               2.3.5
future                   0.18.2
gast                     0.4.0
gin-config               0.4.0
google-auth              1.35.0
google-auth-oauthlib     0.4.5
google-pasta             0.2.0
googleapis-common-protos 1.53.0
grpcio                   1.39.0
h5py                     3.1.0
huggingface-hub          0.0.12
idna                     3.2
importlib-resources      5.12.0
iniconfig                1.1.1
joblib                   1.0.1
keras                    2.6.0
Keras-Preprocessing      1.1.2
Levenshtein              0.13.0
Markdown                 3.3.4
mesh-tensorflow          0.1.19
nltk                     3.6.2
numpy                    1.19.5
oauthlib                 3.1.1
opt-einsum               3.3.0
packaging                21.0
pandas                   1.3.2
pip                      23.0.1
pluggy                   0.13.1
portalocker              2.3.0
promise                  2.3
protobuf                 3.17.3
py                       1.10.0
pyasn1                   0.4.8
pyasn1-modules           0.2.8
pyparsing                2.4.7
pytest                   6.2.4
python-dateutil          2.8.2
pytz                     2021.1
PyYAML                   5.4.1
regex                    2021.8.3
requests                 2.26.0
requests-oauthlib        1.3.0
rouge-score              0.0.4
rsa                      4.7.2
sacrebleu                2.0.0
sacremoses               0.0.45
scikit-learn             0.24.2
scipy                    1.7.1
sentencepiece            0.1.96
seqio                    0.0.6
setuptools               67.6.0
six                      1.15.0
t5                       0.9.2
tabulate                 0.8.9
tensorboard              2.6.0
tensorboard-data-server  0.6.1
tensorboard-plugin-wit   1.8.0
tensorflow               2.6.0
tensorflow-datasets      4.4.0
tensorflow-estimator     2.6.0
tensorflow-hub           0.12.0
tensorflow-metadata      1.2.0
tensorflow-text          2.6.0
termcolor                1.1.0
tfds-nightly             4.4.0.dev202108200109
threadpoolctl            2.2.0
tokenizers               0.10.3
toml                     0.10.2
torch                    1.9.0
tqdm                     4.62.1
transformers             4.9.2
typing-extensions        3.7.4.3
urllib3                  1.26.6
Werkzeug                 2.0.1
wheel                    0.40.0
wrapt                    1.12.1
yapf                     0.32.0
zipp                     3.15.0

How can I solve this problem? Is it a network error or a problem with my Python environment?

shayne-longpre commented 1 year ago

@Zcchill I'm not sure what would cause this. I'm submitting an issue to TensorFlow Datasets to investigate. You may want to try downloading the prerequisite raw datasets using tfds.load() as suggested in #37, in case the problem is related to that issue.

adarob commented 1 year ago

Strange that it's trying to download that vocabulary. I don't see it referenced in your code.

adarob commented 1 year ago

Ah, it is referenced here: https://github.com/google-research/FLAN/blob/8e6ee0f3c1cc0184f8237997200ccb7bcbc6c40a/flan/v2/constants.py#L31

I'm not sure why that file would not be accessible to you though. We use this same code in many places.
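
One way to separate a network problem from an environment problem is to request the same object metadata URL that appears in the log above from plain Python, outside TensorFlow. The sketch below is only a diagnostic under stated assumptions: the bucket and object names are copied from the logged URI, while the helper names (gcs_metadata_url, probe) and the 10-second timeout are made up for illustration.

```python
import urllib.error
import urllib.parse
import urllib.request

# Bucket and object path copied from the request URI in the log above.
BUCKET = "t5-data"
OBJECT = "vocabs/cc_all.32000/sentencepiece.model"


def gcs_metadata_url(bucket, obj):
    """Build the metadata URL in the same form that TensorFlow logged."""
    encoded_obj = urllib.parse.quote(obj, safe="")  # "/" must become %2F
    query = urllib.parse.urlencode({"fields": "size,generation,updated"})
    return f"https://www.googleapis.com/storage/v1/b/{bucket}/o/{encoded_obj}?{query}"


def probe(url, timeout=10.0):
    """Return a short description of what happened when fetching url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return f"HTTP {resp.status}"
    except urllib.error.HTTPError as err:  # the server answered with an error code
        return f"HTTP {err.code}"
    except (urllib.error.URLError, OSError) as err:  # DNS, TLS, proxy, timeout
        return f"no response: {err}"


if __name__ == "__main__":
    url = gcs_metadata_url(BUCKET, OBJECT)
    print(url)
    print(probe(url))
```

If this prints an HTTP status quickly, the machine can reach Google Cloud Storage and the stall is more likely specific to the TensorFlow/libcurl setup. If it also hangs or reports no response, a firewall, proxy, or regional block is the more likely cause.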

renmengjie7 commented 1 year ago

I ran into the same problem: it is unable to get the metadata.

2023-05-02 20:02:02.843261: W tensorflow/tsl/platform/cloud/google_auth_provider.cc:184] All attempts to get a Google authentication bearer token failed, returning an empty token. Retrieving token from files failed with "NOT_FOUND: Could not locate the credentials file.". Retrieving token from GCE failed with "ABORTED: All 10 retry attempts failed. The last failure: Error executing an HTTP request: HTTP response code 502".

How did you solve it?

gxxu-ml commented 1 year ago

My error: All attempts to get a Google authentication bearer token failed, returning an empty token. Retrieving token from files failed with "Not found: Could not locate the credentials file.". Retrieving token from GCE failed with "Failed precondition: Error executing an HTTP request: libcurl code 6 meaning 'Couldn't resolve host name', error details: Could not resolve host: metadata".

gxxu-ml commented 1 year ago

Running export CURL_CA_BUNDLE=/etc/ssl/certs/ca-bundle.crt solved it.
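
The same workaround can also be applied from Python, as long as the variable is set before TensorFlow makes any GCS request (the simplest way is to set it before importing TensorFlow). This is a hedged sketch rather than a confirmed fix for every setup: only the /etc/ssl/certs/ca-bundle.crt path comes from the comment above, the second candidate path is an assumption about Debian/Ubuntu layouts, and the helper names first_existing and configure_ca_bundle are hypothetical.

```python
import os

# The first path is the one reported to work in this thread; the second is an
# assumed location for Debian/Ubuntu systems and may not apply to yours.
CANDIDATE_BUNDLES = (
    "/etc/ssl/certs/ca-bundle.crt",
    "/etc/ssl/certs/ca-certificates.crt",
)


def first_existing(paths):
    """Return the first path that is an existing file, or None."""
    for path in paths:
        if os.path.isfile(path):
            return path
    return None


def configure_ca_bundle(paths=CANDIDATE_BUNDLES, environ=os.environ):
    """Set CURL_CA_BUNDLE unless it is already set; return its final value."""
    if "CURL_CA_BUNDLE" not in environ:
        bundle = first_existing(paths)
        if bundle is not None:
            environ["CURL_CA_BUNDLE"] = bundle
    return environ.get("CURL_CA_BUNDLE")


if __name__ == "__main__":
    print("CURL_CA_BUNDLE =", configure_ca_bundle())
    # Import TensorFlow or run flan/v2/run_example.py only after this point.
```

The function leaves an existing CURL_CA_BUNDLE value untouched, so it does not override a bundle that was configured deliberately.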

vince62s commented 1 year ago

2023-06-23 15:49:57.853557: W tensorflow/tsl/platform/cloud/google_auth_provider.cc:184] All attempts to get a Google authentication bearer token failed, returning an empty token. Retrieving token from files failed with "NOT_FOUND: Could not locate the credentials file.". Retrieving token from GCE failed with "FAILED_PRECONDITION: Error executing an HTTP request: libcurl code 6 meaning 'Couldn't resolve host name', error details: Could not resolve host: metadata.google.internal".
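
The "Could not resolve host: metadata.google.internal" part of this warning refers to the GCE metadata server, a hostname that normally resolves only inside Google Cloud VMs, so on other machines that particular failure is expected. A quick, hedged check is to compare name resolution for that host with the host that serves the vocabulary file in the request URIs quoted earlier in this thread. The helper name can_resolve is hypothetical.

```python
import socket


def can_resolve(hostname):
    """Return True if the system resolver finds at least one address."""
    try:
        return len(socket.getaddrinfo(hostname, None)) > 0
    except socket.gaierror:
        return False


if __name__ == "__main__":
    # Hostnames taken from the warnings and request URIs quoted in this thread.
    for host in ("metadata.google.internal", "www.googleapis.com"):
        status = "resolves" if can_resolve(host) else "does not resolve"
        print(f"{host}: {status}")
```

If www.googleapis.com resolves while the metadata host does not, the token warning is probably harmless on its own and the actual download problem lies elsewhere, for example in certificates or a proxy.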