NVIDIA / ngc-container-replicator

NGC Container Replicator
BSD 3-Clause "New" or "Revised" License

Issue with API? #30

Closed blairjj closed 2 years ago

blairjj commented 3 years ago

Howdy,

We have a new process that uses this container to check for and pull new versions of NVIDIA containers nightly. It stopped working last Thursday night. I have tried everything, including pulling the container locally, and no matter what I still get the errors shown below:

[appman@hpctest-ngc scripts]$ docker image list
REPOSITORY           TAG      IMAGE ID       CREATED         SIZE
deepops/replicator   latest   ded4e6170335   2 weeks ago     504MB
docker               latest   51453dcdd9bd   5 weeks ago     215MB
ubuntu               latest   1318b700e415   7 weeks ago     72.8MB
centos               latest   831691599b88   15 months ago   215MB

[appman@hpctest-ngc scripts]$ docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock -v /tmp:/output deepops/replicator --image=pytorch --dry-run --strict-name-match --api-key=
2021-09-13 21:05:48,675 - ngc_replicator.ngc_replicator - 30 - INFO - Initializing Replicator
2021-09-13 21:05:49,865 - nvidia_deepops.docker.registry.ngcregistry - 126 - INFO - GET https://api.ngc.nvidia.com/v2/orgs - took 0.6250609308481216 sec
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
Login Succeeded
2021-09-13 21:05:51,624 - ngc_replicator.ngc_replicator - 66 - INFO - tarfiles will be saved to /output
2021-09-13 21:05:51,624 - ngc_replicator.ngc_replicator - 70 - INFO - Replicator initialization complete
2021-09-13 21:05:51,624 - ngc_replicator.ngc_replicator - 87 - INFO - Replicator Started
2021-09-13 21:06:22,168 - nvidia_deepops.docker.registry.ngcregistry - 126 - INFO - GET https://api.ngc.nvidia.com/v2/org/ygwdl2o5rmaj/repos?include-teams=true&include-public=true - took 30.543361625634134 sec
Traceback (most recent call last):
  File "/usr/local/bin/ngc_replicator", line 33, in <module>
    sys.exit(load_entry_point('ngc-replicator==0.4.0', 'console_scripts', 'ngc_replicator')())
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/ngc_replicator-0.4.0-py3.6.egg/ngc_replicator/ngc_replicator.py", line 371, in main
    replicator.sync()
  File "/usr/local/lib/python3.6/site-packages/ngc_replicator-0.4.0-py3.6.egg/ngc_replicator/ngc_replicator.py", line 90, in sync
    new_images = {image.name: image.tag for image in self.sync_images(project=project)}
  File "/usr/local/lib/python3.6/site-packages/ngc_replicator-0.4.0-py3.6.egg/ngc_replicator/ngc_replicator.py", line 90, in <dictcomp>
    new_images = {image.name: image.tag for image in self.sync_images(project=project)}
  File "/usr/local/lib/python3.6/site-packages/ngc_replicator-0.4.0-py3.6.egg/ngc_replicator/ngc_replicator.py", line 106, in sync_images
    for image in self.images_to_download(project=project):
  File "/usr/local/lib/python3.6/site-packages/ngc_replicator-0.4.0-py3.6.egg/ngc_replicator/ngc_replicator.py", line 127, in images_to_download
    remote_state = self.nvcr.get_state(project=project, filter_fn=filter_fn)
  File "/usr/local/lib/python3.6/site-packages/nvidia_deepops-0.4.2-py3.6.egg/nvidia_deepops/docker/registry/ngcregistry.py", line 255, in get_state
    names = self.get_image_names(project=project)
  File "/usr/local/lib/python3.6/site-packages/nvidia_deepops-0.4.2-py3.6.egg/nvidia_deepops/docker/registry/ngcregistry.py", line 204, in get_image_names
    for image in cache or self._get_repo_data(project=project)]
  File "/usr/local/lib/python3.6/site-packages/nvidia_deepops-0.4.2-py3.6.egg/nvidia_deepops/docker/registry/ngcregistry.py", line 190, in _get_repo_data
    .format(self.default_org))
  File "/usr/local/lib/python3.6/site-packages/nvidia_deepops-0.4.2-py3.6.egg/nvidia_deepops/docker/registry/ngcregistry.py", line 136, in _get
    req.raise_for_status()
  File "/usr/local/lib/python3.6/site-packages/requests/models.py", line 953, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 502 Server Error: for url: https://api.ngc.nvidia.com/v2/org/ygwdl2o5rmaj/repos?include-teams=true&include-public=true

CalSimmon commented 2 years ago

I'm running into this issue as well. Is there any sort of fix for it?

blairjj commented 2 years ago

I got around it by using the NGC CLI instead and writing my own scripts that query the NGC inventory nightly and pull the containers I want/need on a regular basis via crontab.
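For anyone looking for a starting point, a rough sketch of that kind of nightly script might look like the following. It is not the actual script, only an illustration: the function names are hypothetical, the list of remote tags is assumed to come from whatever inventory query you run with the NGC CLI (the sketch does not call the CLI itself), and it assumes images live under the `nvcr.io/nvidia` registry prefix and that `docker` is on the PATH.

```python
import subprocess
from typing import Iterable, List

# Assumption: images are pulled from NVIDIA's registry under this prefix.
REGISTRY = "nvcr.io/nvidia"


def tags_to_pull(remote_tags: Iterable[str], local_tags: Iterable[str]) -> List[str]:
    """Return remote tags that are not present locally, keeping remote order."""
    have = set(local_tags)
    seen = set()
    missing = []
    for tag in remote_tags:
        if tag not in have and tag not in seen:
            missing.append(tag)
            seen.add(tag)
    return missing


def list_local_tags(image: str) -> List[str]:
    """Ask the local Docker daemon which tags of `image` it already has."""
    result = subprocess.run(
        ["docker", "image", "ls", "--format", "{{.Tag}}", f"{REGISTRY}/{image}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return [line for line in result.stdout.splitlines() if line and line != "<none>"]


def pull(image: str, tags: Iterable[str], dry_run: bool = True) -> List[List[str]]:
    """Build one `docker pull` command per tag; run them unless dry_run is set."""
    commands = [["docker", "pull", f"{REGISTRY}/{image}:{tag}"] for tag in tags]
    for command in commands:
        if not dry_run:
            subprocess.run(command, check=True)
    return commands
```

A nightly job would then call something like `pull("pytorch", tags_to_pull(remote, list_local_tags("pytorch")), dry_run=False)` from a small wrapper script, scheduled with a crontab entry pointing at wherever that script lives.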

blairjj commented 2 years ago

After a year, it's pretty obvious this is not getting addressed. To all who come this way in the future: just create your own process using the NVIDIA NGC CLI. A container is only as good as its maintainer.

ryanolson commented 2 years ago

Thanks all for the feedback.

This project was originally created to get around some of the limitations of the NGC CLI. Both the NGC CLI and the tooling have greatly improved. @blairjj - I'm glad you found a nice workaround.

It's probably time we archived this project as there are now better ways to replicate the repo.