mlcommons / inference_results_v0.5

This repository contains the results and code for the MLPerf™ Inference v0.5 benchmark.
https://mlcommons.org/en/inference-datacenter-05/
Apache License 2.0

Cannot build docker for v05/closed/NVIDIA #43

Open Nitinmlp opened 3 years ago

Nitinmlp commented 3 years ago

There are references to Python2 and python2.7, which cause multiple issues.

Once I resolved that, I ran into the famous issue of cublas_v2.h not being found, although the file exists in 4 different locations on my system. Even giving an absolute path does not work, and build_docker fails.

I modified INCPATH, CUDAPATH, etc., and it did not work. I even put the absolute path of the file in the header that includes it, and it still does not work. Any ideas what could be causing this?

And why this particular file? If I google it, I see many in the dev community complaining about this issue in different compilation contexts.

$ find /usr/local -name "cublas.h"
/usr/local/cuda-8.0/include/cublas.h
/usr/local/cuda-10.0/include/cublas.h
/usr/local/cuda-10.1/targets/x86_64-linux/include/cublas.h

$ find /usr/include -name "cublas.h"
/usr/include/cublas.h
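Note that the searches above look for cublas.h, while the failing include is cublas_v2.h. Also, if the compile step runs as part of the Docker build, it sees the image's filesystem rather than the host's, so the check is most informative when run in the same environment as the compiler. A minimal sketch for checking both header names in one pass, assuming Python 3 is available there; `find_header` is a hypothetical helper for illustration and is not part of the repository:

```python
import os


def find_header(name, roots):
    """Return the sorted paths under `roots` whose file name is exactly `name`."""
    hits = []
    for root in roots:
        # os.walk silently yields nothing for a root that does not exist.
        for dirpath, _dirnames, filenames in os.walk(root):
            if name in filenames:
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)


if __name__ == "__main__":
    # Same search roots as the find commands above.
    for header in ("cublas.h", "cublas_v2.h"):
        paths = find_header(header, ["/usr/local", "/usr/include"])
        print(header, "->", paths or "not found")
```

If cublas_v2.h shows up on the host but not in the environment where the compiler runs, changing include paths on the host would not be expected to help.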

In /plugin/decoderPlugin.h:

#ifndef GNMT_DECODER_PLUGIN_H
#define GNMT_DECODER_PLUGIN_H

#include

#include "NvInferPlugin.h"

Compilation fails; using an absolute path for cublas_v2.h still fails:

#include "/usr/include/cublas_v2.h"

#include

Failure Log:

In file included from plugin/decoderPlugin.cu:19:0:
plugin/decoderPlugin.h:22:36: fatal error: /usr/include/cublas_v2.h: No such file or directory
compilation terminated.
Makefile:137: recipe for target '../../../../build/bin/GNMT/chobj/plugin/decoderPlugin.o' failed
make[2]: [../../../../build/bin/GNMT/chobj/plugin/decoderPlugin.o] Error 1
make[2]: Waiting for unfinished jobs....
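When a build log is long, it can help to pull out just the headers the compiler could not open. A small sketch, assuming GCC-style "fatal error: ...: No such file or directory" messages like the one in the log above; `missing_headers` is a hypothetical helper name:

```python
import re

# Matches GCC-style messages such as:
#   fatal error: /usr/include/cublas_v2.h: No such file or directory
MISSING_HEADER = re.compile(
    r"fatal error: (?P<header>\S+): No such file or directory"
)


def missing_headers(log_text):
    """Return the header paths reported as missing, in order of appearance."""
    return [m.group("header") for m in MISSING_HEADER.finditer(log_text)]


if __name__ == "__main__":
    log = (
        "plugin/decoderPlugin.h:22:36: fatal error: "
        "/usr/include/cublas_v2.h: No such file or directory"
    )
    print(missing_headers(log))
```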

nvpohanh commented 3 years ago

@Nitinmlp Which docker image did you use? Did you use this Dockerfile? https://github.com/mlperf/inference_results_v0.5/blob/master/closed/NVIDIA/docker/Dockerfile

Nitinmlp commented 3 years ago

Yes, I am using this Dockerfile:

https://github.com/mlperf/inference_results_v0.5/blob/master/closed/NVIDIA/docker/Dockerfile

Regards Nitin



nvpohanh commented 3 years ago

@Nitinmlp Could you comment out this line: https://github.com/mlperf/inference_results_v0.5/blob/master/closed/NVIDIA/docker/Dockerfile#L223 and then try make build_docker again?