aws / amazon-sagemaker-examples

Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
https://sagemaker-examples.readthedocs.io
Apache License 2.0

cannot find fairseq image when creating container #1701

Open 008karan opened 3 years ago

008karan commented 3 years ago

I am using the fairseq_sagemaker_translate_en2fr notebook for training. I build the fairseq container with this command:

%%sh
chmod +x create_container.sh 

./create_container.sh pytorch-fairseq

I get the following output:

Getting from region us-east-1 and account 198247849549
Sending build context to Docker daemon  16.84MB
Step 1/16 : FROM nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04
The push refers to repository [198247849549.dkr.ecr.us-east-1.amazonaws.com/pytorch-fairseq]
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
Error response from daemon: Get https://198247849549.dkr.ecr.us-east-1.amazonaws.com/v2/: dial tcp: lookup 198247849549.dkr.ecr.us-east-1.amazonaws.com on [::1]:53: read udp [::1]:51282->[::1]:53: read: connection refused
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
Error response from daemon: Get https://520713654638.dkr.ecr.us-east-1.amazonaws.com/v2/: dial tcp: lookup 520713654638.dkr.ecr.us-east-1.amazonaws.com on [::1]:53: read udp [::1]:58091->[::1]:53: read: connection refused
Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io on [::1]:53: read udp [::1]:55097->[::1]:53: read: connection refused
Error response from daemon: No such image: pytorch-fairseq:latest
An image does not exist locally with the tag: 198247849549.dkr.ecr.us-east-1.amazonaws.com/pytorch-fairseq

---------------------------------------------------------------------------
CalledProcessError                        Traceback (most recent call last)
<ipython-input-3-7de4343081ca> in <module>
----> 1 get_ipython().run_cell_magic('sh', '', 'chmod +x create_container.sh \n\n./create_container.sh pytorch-fairseq\n')
~/anaconda2/envs/xyz/sagemaker/lib/python3.7/site-packages/IPython/core/interactiveshell.py in run_cell_magic(self, magic_name, line, cell)
   2380             with self.builtin_trap:
   2381                 args = (magic_arg_s, cell)
-> 2382                 result = fn(*args, **kwargs)
   2383             return result
   2384 
~/anaconda2/envs/xyz/sagemaker/lib/python3.7/site-packages/IPython/core/magics/script.py in named_script_magic(line, cell)
    140             else:
    141                 line = script
--> 142             return self.shebang(line, cell)
    143 
    144         # write a basic docstring:
<decorator-gen-103> in shebang(self, line, cell)
~/anaconda2/envs/xyz/sagemaker/lib/python3.7/site-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
    185     # but it's overkill for just that one bit of state.
    186     def magic_deco(arg):
--> 187         call = lambda f, *a, **k: f(*a, **k)
    188 
    189         if callable(arg):
~/anaconda2/envs/xyz/sagemaker/lib/python3.7/site-packages/IPython/core/magics/script.py in shebang(self, line, cell)
    243             sys.stderr.flush()
    244         if args.raise_error and p.returncode!=0:
--> 245             raise CalledProcessError(p.returncode, cell, output=out, stderr=err)
    246 
    247     def _run_script(self, p, cell, to_close):
CalledProcessError: Command 'b'chmod +x create_container.sh \n\n./create_container.sh pytorch-fairseq\n'' returned non-zero exit status 1.

It looks like it cannot find the fairseq image. Any help here?

ngluna commented 3 years ago

FYI: This notebook is in https://github.com/aws/amazon-sagemaker-examples/tree/master/advanced_functionality/fairseq_translation

ngluna commented 3 years ago

https://github.com/aws/amazon-sagemaker-examples/issues/1428
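
A note on the log in the original report: every registry request (both ECR hosts and registry-1.docker.io) fails with "dial tcp: lookup ... on [::1]:53 ... connection refused", and the "No such image" lines only appear after that. This suggests the build never pulled the base image because the machine could not resolve hostnames, rather than the fairseq image itself being missing. Below is a minimal sketch for checking this, not a confirmed fix. It assumes a Linux host where `getent` is available; the function names and the `build.log` file are hypothetical and used only for illustration.

```shell
#!/bin/sh
# Print the unique hostnames whose DNS lookup failed in a Docker log read from stdin.
# It matches lines of the form "dial tcp: lookup <host> on <resolver>", as in the report.
failed_lookup_hosts() {
    sed -n 's/.*dial tcp: lookup \([^ ]*\) on .*/\1/p' | sort -u
}

# Report whether this machine can currently resolve each host passed as an argument.
# Assumption: getent is available (glibc-based Linux); it is not present on macOS.
check_hosts() {
    for host in "$@"; do
        if getent hosts "$host" >/dev/null 2>&1; then
            echo "resolves: $host"
        else
            echo "does NOT resolve: $host"
        fi
    done
}

# Example usage (build.log is a hypothetical file holding the create_container.sh output):
#   check_hosts $(failed_lookup_hosts < build.log)
```

If the hosts do not resolve, the problem most likely lies in the machine's DNS configuration (the log shows the resolver at [::1]:53 refusing connections) and would need to be fixed before re-running create_container.sh.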