baidu-research / DeepBench

Benchmarking Deep Learning operations on different hardware
Apache License 2.0
1.07k stars 239 forks source link

Make DeepBench compile cleanly with provided libs on RHEL #81

Closed frival closed 6 years ago

frival commented 6 years ago

The DeepBench makefiles assume certain libs and header files (e.g. libmpi) are located in the same directory tree. For reasons I have yet to discern RHEL installs the libraries in one directory structure and the headers in another (in the -devel packages). By applying the below patch and adding "MPI_INCLUDE_PATH=/usr/include/openmpi-x86_64" DeepBench now builds cleanly on RHEL; I've coded the change such that it shouldn't require any changes for systems that have libraries and header files in the same directory.


diff -ur DeepBench.orig/code/baidu_allreduce/Makefile DeepBench/code/baidu_allreduce/Makefile --- DeepBench.orig/code/baidu_allreduce/Makefile 2017-12-14 14:53:03.255428367 -0500 +++ DeepBench/code/baidu_allreduce/Makefile 2017-12-12 12:15:19.396655494 -0500 @@ -6,6 +6,7 @@ CUDA_PATH?=/usr/local/cuda CUDA_LIB64=$(CUDA_PATH)/lib64 MPI_PATH?=/usr/local/openmpi +MPI_INCLUDE_PATH?=/usr/local/openmpi BAIDU_ALLREDUCE_PATH?=/local/baidu-allreduce BIN_DIR?=bin MKDIR=mkdir -p @@ -21,8 +22,8 @@

ring_all_reduce: $(MKDIR) $(BIN_DIR)

sharannarang commented 6 years ago

@mpatwary , can you take a look at this one?

mpatwary commented 6 years ago

Thanks for bringing this to our attention. For my clarification, does RHEL install the openmpi lib and headers in different directories by default or this is happening to a specific version?

frival commented 6 years ago

I just checked the RHEL 6, RHEL 7, and Fedora 27 openmpi and openmpi-devel packages and they do by default install libraries and headers in different directories. If someone wants to download, build, and install openmpi by hand I would imagine they would get libs and headers in the same directory tree, but that's not how it would happen by default in RHEL/Fedora. I'm assuming the same is the case for CentOS given that it's derived from RHEL packages.

mpatwary commented 6 years ago

Thanks @frival for the clarification. Could you please create a PR we could merge?

frival commented 6 years ago

I've tweaked the suggestion above in the PR so it will default to MPI_PATH so folks who aren't burdened with this split shouldn't have to change anything in how they build. Would it make sense to also update the build instructions in README.md to refer to this variable if this approach is deemed acceptable?

mpatwary commented 6 years ago

I think mentioning that in the README.md is good idea, please go ahead. Thanks.