Open blue-orc opened 4 years ago
Also, the directory structure of the minigo bucket has changed: https://console.cloud.google.com/storage/browser/minigo-pub/ml_perf/?pli=1
The provided configuration expects the checkpoint at ml_perf/checkpoint/9, but /ml_perf/0.6/checkpoint now appears to be the correct location.
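For anyone who wants to check both locations before editing the configuration, here is a rough sketch. It assumes the bucket is `minigo-pub` (taken from the console URL above), that `gsutil` is installed, and that the bucket is publicly listable. The helper names are made up for illustration and are not part of the benchmark code.

```python
import shutil
import subprocess

# Assumption: bucket name inferred from the console URL in this thread.
BUCKET = "gs://minigo-pub"
OLD_PREFIX = "ml_perf/checkpoint/9"    # path the provided configuration expects
NEW_PREFIX = "ml_perf/0.6/checkpoint"  # path that appears to be correct now


def candidate_uris():
    """Return the old and new checkpoint locations as gs:// URIs."""
    return [f"{BUCKET}/{OLD_PREFIX}", f"{BUCKET}/{NEW_PREFIX}"]


def list_prefix(uri):
    """Return the lines printed by `gsutil ls <uri>`.

    Returns None if gsutil is not installed or the listing fails
    (for example because the prefix does not exist).
    """
    if shutil.which("gsutil") is None:
        return None
    result = subprocess.run(["gsutil", "ls", uri], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return result.stdout.splitlines()
```

Calling `list_prefix(uri)` for each entry of `candidate_uris()` shows which prefix actually contains files. Whether the new location still has a `9` subdirectory is something to confirm from the listing, not something this sketch knows.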
I second this issue. I tried to run the maskrcnn implementation, but I couldn't find the weights file anywhere.
I solved this issue by updating the download link in the download_weights.sh file:
use https://dl.fbaipublicfiles.com/detectron/ImageNetPretrained/MSRA/R-50.pkl instead of the link in the original script.
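If you would rather not edit the shell script, the same download can be sketched in Python. The URL is the one given above. The destination filename `R-50.pkl` and the function name are my assumptions, so adjust them to wherever the maskrcnn configuration expects the weights.

```python
import os
import urllib.request

# URL from the comment above; the default destination name is an assumption.
WEIGHTS_URL = "https://dl.fbaipublicfiles.com/detectron/ImageNetPretrained/MSRA/R-50.pkl"


def fetch_weights(url=WEIGHTS_URL, dest="R-50.pkl"):
    """Download `url` to `dest` and return `dest`.

    Skips the download if `dest` already exists and is non-empty.
    Writes to a temporary ".part" file first so an interrupted
    download never leaves a truncated file under the final name.
    """
    if os.path.exists(dest) and os.path.getsize(dest) > 0:
        return dest
    tmp = dest + ".part"
    urllib.request.urlretrieve(url, tmp)
    os.replace(tmp, dest)
    return dest
```

Calling `fetch_weights()` with no arguments downloads to the current directory. This only fetches the file; it does not verify that the weights match the ones used for the published results.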
I'm trying to run the benchmarks under the NVIDIA folder, but I'm running into a lot of issues acquiring and setting up the datasets properly.
The COCO dataset links here are all broken: https://github.com/mlperf/training_results_v0.6/blob/master/NVIDIA/benchmarks/maskrcnn/implementations/download_dataset.sh
I was able to download the dataset from http://cocodataset.org/ but I'm not sure where to get the weights file.
Also, the ImageNet dataset for the resnet benchmark is unavailable for direct download. I was able to acquire the dataset, but ran into issues when running the actual training test. The error happened at line 163 of this file: https://github.com/mlperf/training_results_v0.6/blob/master/NVIDIA/benchmarks/resnet/implementations/mxnet/train_imagenet.py#L163
I didn't copy the error, but it reported a problem with file mapping. My guess is that my dataset isn't laid out exactly as the script expects, since I had to piece it together myself.
Is there any updated way to acquire the exact datasets and ensure they are consistent with the published run results?
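Until there is an official answer, one way to at least compare two copies of a dataset is a checksum manifest. The sketch below is only a suggestion: as far as this thread shows, no reference checksums have been published for these benchmarks, so the `reference` manifest would have to come from someone who has a known-good copy. All function names here are made up for illustration.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file under `root` (relative path, '/' separators) to its SHA-256."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            manifest[rel] = sha256_of(full)
    return manifest


def diff_manifests(mine, reference):
    """Return (missing, extra, changed) relative paths, each sorted.

    missing: in `reference` but not in `mine`
    extra:   in `mine` but not in `reference`
    changed: in both, but with different digests
    """
    missing = sorted(set(reference) - set(mine))
    extra = sorted(set(mine) - set(reference))
    changed = sorted(k for k in set(mine) & set(reference) if mine[k] != reference[k])
    return missing, extra, changed
```

Running `build_manifest` on a pieced-together dataset and diffing it against a manifest from a working setup would show missing or differently named files, which is the kind of layout mismatch that could explain a file mapping error.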