tensorflow / tensor2tensor


Can't generate data for problem algorithmic_math_two_variables #1403

Open william-r-s opened 5 years ago

william-r-s commented 5 years ago

Description

Data generation fails for the problem algorithmic_math_two_variables. The download of https://art.wangperawong.com/mathematical_language_understanding_train.tar.gz is rejected with HTTP Error 403: Forbidden (full logs below).
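
To check whether the server itself is refusing the request, independently of tensor2tensor, a minimal sketch along these lines may help. The helper name probe_status and its injectable opener parameter are hypothetical and exist only for illustration; the only real API used is urllib.request.urlopen.

```python
import urllib.error
import urllib.request

URL = "https://art.wangperawong.com/mathematical_language_understanding_train.tar.gz"


def probe_status(url, opener=urllib.request.urlopen):
    """Return the HTTP status code for `url` instead of raising on 4xx/5xx.

    `opener` is a hypothetical injection point so the function can be
    exercised offline; by default it is the standard urlopen.
    """
    try:
        # The response body is never read, so nothing large is downloaded.
        with opener(url) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # HTTPError carries the status code the server sent back.
        return err.code
```

If probe_status(URL) also reports 403 when called from a plain Python session, the problem would seem to lie with the hosting server rather than with the t2t-datagen invocation. Connection-level failures (for example DNS errors) are deliberately not caught here and would still raise.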

Environment information

OS: Ubuntu

$ pip freeze | grep tensor
mesh-tensorflow==0.0.5
-e git+https://github.com/tensorflow/tensor2tensor.git@ee64f6f29884aa66c98130afaf7e4eb182e0ca1f#egg=tensor2tensor
tensorboard==1.12.0
tensorboard-logger==0.1.0
tensorflow-gpu==1.12.0
tensorflow-metadata==0.9.0
tensorflow-probability==0.5.0

$ python -V
Python 3.6.7 :: Anaconda, Inc.

For bugs: reproduction and error logs

# Steps to reproduce:
t2t-datagen --data_dir=MLU/data \
  --output_dir=MLU/output \
  --problem=algorithmic_math_two_variables

# Error logs:
INFO:tensorflow:Generating problems:
    algorithmic:
      * algorithmic_math_two_variables
INFO:tensorflow:Generating data for algorithmic_math_two_variables.
INFO:tensorflow:Downloading https://art.wangperawong.com/mathematical_language_understanding_train.tar.gz to /tmp/t2t_datagen/mathematical_language_understanding_train.tar.gz
Traceback (most recent call last):
  File "/h/william/code/math/tensor2tensor/tensor2tensor/data_generators/generator_utils.py", line 233, in maybe_download
    tf.gfile.Copy(uri, filepath)
  File "/h/william/conda/default/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py", line 397, in copy
    compat.as_bytes(oldpath), compat.as_bytes(newpath), overwrite, status)
  File "/h/william/conda/default/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.UnimplementedError: File system scheme 'https' not implemented (file: 'https://art.wangperawong.com/mathematical_language_understanding_train.tar.gz')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/h/william/conda/default/bin/t2t-datagen", line 7, in <module>
    exec(compile(f.read(), __file__, 'exec'))
  File "/h/william/code/math/tensor2tensor/tensor2tensor/bin/t2t-datagen", line 28, in <module>
    tf.app.run()
  File "/h/william/conda/default/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "/h/william/code/math/tensor2tensor/tensor2tensor/bin/t2t-datagen", line 23, in main
    t2t_datagen.main(argv)
  File "/h/william/code/math/tensor2tensor/tensor2tensor/bin/t2t_datagen.py", line 198, in main
    generate_data_for_registered_problem(problem)
  File "/h/william/code/math/tensor2tensor/tensor2tensor/bin/t2t_datagen.py", line 260, in generate_data_for_registered_problem
    problem.generate_data(data_dir, tmp_dir, task_id)
  File "/h/william/code/math/tensor2tensor/tensor2tensor/data_generators/text_problems.py", line 335, in generate_data
    data_dir, tmp_dir, problem.DatasetSplit.TRAIN), all_paths)
  File "/h/william/code/math/tensor2tensor/tensor2tensor/data_generators/generator_utils.py", line 165, in generate_files
    for case in generator:
  File "/h/william/code/math/tensor2tensor/tensor2tensor/data_generators/text_problems.py", line 653, in text2text_generate_encoded
    for sample in sample_generator:
  File "/h/william/code/math/tensor2tensor/tensor2tensor/data_generators/mathematical_language_understanding.py", line 101, in generate_samples
    tmp_dir, compressed_filename, self.URL)
  File "/h/william/code/math/tensor2tensor/tensor2tensor/data_generators/generator_utils.py", line 238, in maybe_download
    uri, inprogress_filepath, reporthook=download_report_hook)
  File "/h/william/conda/default/lib/python3.6/urllib/request.py", line 248, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/h/william/conda/default/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/h/william/conda/default/lib/python3.6/urllib/request.py", line 532, in open
    response = meth(req, response)
  File "/h/william/conda/default/lib/python3.6/urllib/request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "/h/william/conda/default/lib/python3.6/urllib/request.py", line 570, in error
    return self._call_chain(*args)
  File "/h/william/conda/default/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/h/william/conda/default/lib/python3.6/urllib/request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
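
Reading the two tracebacks together, maybe_download appears to try tf.gfile.Copy first and, when that raises UnimplementedError for the 'https' scheme, fall back to urllib's urlretrieve, which is where the 403 surfaces. The first exception therefore looks like expected noise and the second like the real failure. The sketch below illustrates that try-then-fall-back pattern only; it is not the library's code. The function name, the primary and fallback parameters, the broad except clause, and the ".incomplete" suffix are all assumptions made for illustration.

```python
import os


def download_with_fallback(uri, filepath, primary, fallback):
    """Try `primary(uri, filepath)`; if it raises, use `fallback` instead.

    Both callables are hypothetical stand-ins. In the traceback above they
    would correspond to tf.gfile.Copy and urllib's urlretrieve.
    """
    try:
        primary(uri, filepath)
    except Exception:  # simplification; e.g. UnimplementedError for 'https'
        # Write to a temporary name first so an interrupted transfer never
        # leaves a truncated file at the final path (suffix is an assumption).
        inprogress = filepath + ".incomplete"
        fallback(uri, inprogress)  # an HTTP 403 raised here reaches the caller
        os.replace(inprogress, filepath)
    return filepath
```

Under this reading, any fix has to make the second (urllib) request succeed, for example by pointing the problem at a URL the server is willing to serve.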

artitw commented 5 years ago

I have fixed the download in this pull request: https://github.com/tensorflow/tensor2tensor/pull/1442

Please give it a try and let us know if it works for you!