tensorflow / models

Models and examples built with TensorFlow

The 404 error is thrown when preparing imagenet data #8052

Open Eugene-Mark opened 4 years ago

Eugene-Mark commented 4 years ago

To reproduce the bug:

1. Follow Get Started to prepare the ImageNet data.

2. Run bazel-bin/inception/download_and_preprocess_imagenet "${DATA_DIR}"

3. An error like the one below is shown:

In order to download the imagenet data, you have to create an account with image-net.org. This will get you a username and an access key. You can set the IMAGENET_USERNAME and IMAGENET_ACCESS_KEY environment variables, or you can enter the credentials here.
Username: username
Access key: password
Saving downloaded files to /root/image-data/raw-data/
Downloading bounding box annotations.
--2020-01-16 17:29:43-- http://www.image-net.org/challenges/LSVRC/2012/nonpub/ILSVRC2012_bbox_train_v2.tar.gz
Resolving xxx.proxy.com (xxx.proxy.com)... 10.239.5.5
Connecting to xxx.proxy.com (xxx.proxy.com)|10.239.5.5|:910... connected.
Proxy request sent, awaiting response... 404 Not Found
2020-01-16 17:29:44 ERROR 404: Not Found.

--2020-01-16 17:29:44-- http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_bbox_train_v2.tar.gz
Resolving xxx.proxy.com (xxx.proxy.com)... 10.239.5.5
Connecting to xxx.proxy.com (xxx.proxy.com)|10.239.5.5|:910... connected.
Proxy request sent, awaiting response... 404 Not Found
2020-01-16 17:29:45 ERROR 404: Not Found.

As far as I understand, this happens because http://image-net.org/ has changed where its resources are located and what access level they require.

tensorflowbutler commented 4 years ago

Thank you for your post. We noticed you have not filled out the following field in the issue template. Could you update them if they are relevant in your case, or leave them as N/A? Thanks.

What is the top-level directory of the model you are using
Have I written custom code
OS Platform and Distribution
TensorFlow installed from
TensorFlow version
Bazel version
CUDA/cuDNN version
GPU model and memory
Exact command to reproduce

Eugene-Mark commented 4 years ago

Sure, please check the update.

haydengunraj commented 4 years ago

I had the same issue, and it was due to the resource location changing as you suspected. If you head over to http://www.image-net.org/challenges/LSVRC/2012/downloads, the download links are near the bottom of the page. You can copy the base URL from one of the links and replace the BASE_URL variable in inception/data/download_imagenet.sh. Hopefully the URLs in the repo will be updated to fix this.
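For anyone who would rather not edit the file by hand, the BASE_URL replacement described above could be scripted roughly like this. It is only a sketch: the helper name set_base_url is made up for illustration, and it assumes the script assigns the variable on its own unindented line in the form BASE_URL="...", which should be checked against your copy of inception/data/download_imagenet.sh.

```python
import re


def set_base_url(script_text: str, new_base_url: str) -> str:
    """Return script_text with its first BASE_URL assignment replaced.

    Hypothetical helper. Assumes a line of the form BASE_URL="..." that
    starts at the beginning of a line (no leading whitespace).
    """
    pattern = re.compile(r"^BASE_URL=.*$", flags=re.MULTILINE)
    replacement = f'BASE_URL="{new_base_url}"'
    # A callable replacement keeps any backslashes in the URL literal;
    # count=1 limits the edit to the first matching assignment.
    new_text, n_replaced = pattern.subn(lambda _m: replacement, script_text, count=1)
    if n_replaced == 0:
        raise ValueError("no BASE_URL assignment found in the script text")
    return new_text


if __name__ == "__main__":
    # Stand-in script text; the replacement URL is a placeholder, not a real link.
    sample = 'BASE_URL="http://www.image-net.org/challenges/LSVRC/2012/nonpub"\necho start\n'
    print(set_base_url(sample, "http://example.invalid/copied-base-url"))
```

To apply it for real, read the script into a string, pass in the base URL copied from the downloads page, and write the result back to the same file.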

AlexanderMelde commented 3 years ago

@haydengunraj thanks a lot for your response! I did as you proposed and copied the cryptic-looking string into my code. Unfortunately, two days later, that string (used as a replacement for nonpub) changed.

I wrote a little script for Jupyter notebooks that does the trick for me:

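import requests
# Fetch the downloads page that lists the current links.
t = requests.get("http://www.image-net.org/challenges/LSVRC/2012/downloads", allow_redirects=True).text
# Keep the text between the last double quote and the devkit file name: the current token.
t = t.split("/ILSVRC2012_devkit_t12.tar.gz")[0].split('"')[-1]
# Jupyter shell escape: swap the stale "nonpub" path segment for the token, in place.
!sed -i 's/nonpub/{t}/g' tf-models/inception/inception/data/download_imagenet.sh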

This parses the linked website for download tokens and then replaces the relevant parts of the URLs in my download script.
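Outside a notebook, the same two steps could be written as plain Python functions, which also makes them testable without touching the network. This is a rough sketch, not a drop-in tool: extract_token and patch_script are hypothetical names, and the sample HTML only guesses at how the link appears on the downloads page (a quoted href ending in the devkit file name).

```python
def extract_token(page_html: str, marker: str = "/ILSVRC2012_devkit_t12.tar.gz") -> str:
    """Return the text between the last double quote and the marker.

    Mirrors the split-based parsing in the notebook snippet above.
    """
    head, found, _tail = page_html.partition(marker)
    if not found:
        raise ValueError("devkit link not found in the page")
    token = head.rsplit('"', 1)[-1]
    if not token:
        raise ValueError("empty token before the devkit link")
    return token


def patch_script(script_text: str, token: str, old_segment: str = "nonpub") -> str:
    """Replace every occurrence of old_segment with token (like sed 's/nonpub/TOKEN/g')."""
    return script_text.replace(old_segment, token)


if __name__ == "__main__":
    # Made-up page fragment and token, for illustration only.
    page = '<a href="abc123/ILSVRC2012_devkit_t12.tar.gz">Development kit</a>'
    script = 'BASE_URL="http://www.image-net.org/challenges/LSVRC/2012/nonpub"\n'
    print(patch_script(script, extract_token(page)))
```

Raising an error when the link is missing means a changed page layout stops the run, instead of writing a wrong string into the download script.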