Closed by pseudotensor 6 years ago
Hi @mdymczyk, when you converted the build to cmake/centos, the USEPARALLEL build parameter was lost, so glm has always been using 1 GPU and 1 OpenMP thread since then. I know @tomkraljevic had problems with this on power and/or centos, but dropping it loses major functionality.
I changed things so that x86 works on ubuntu now, but centos complains about pragmas and openmp, as seen in the x86 centos build below. How can this be fixed?
Scanning dependencies of target commonh2o4gpu
[ 2%] Building CXX object CMakeFiles/commonh2o4gpu.dir/src/common/elastic_net_ptr.cpp.o
[ 5%] Building CXX object CMakeFiles/commonh2o4gpu.dir/src/common/logger.cpp.o
[ 10%] Building CXX object CMakeFiles/commonh2o4gpu.dir/src/common/utils.cpp.o
[ 10%] Building CXX object CMakeFiles/commonh2o4gpu.dir/src/interface_c/h2o4gpu_c.cpp.o
Scanning dependencies of target ch2o4gpu_cpu_swig_compilation
Scanning dependencies of target ch2o4gpu_gpu_swig_compilation
[ 13%] Swig source
[ 16%] Swig source
/root/repo/src/common/elastic_net_ptr.cpp: In function ‘double h2o4gpu::ElasticNetptr_fit(char, int, int, int, int, int, int, int, char, size_t, size_t, size_t, int, int, double, double, int, int, int, double, double, T*, T*, double, double, int, int, double, int, int, T*, T*, T*, T*, T*, int, T**, T**, T**, T**, size_t*, size_t*, size_t*)’:
/root/repo/src/common/elastic_net_ptr.cpp:569:22: error: expected ‘#pragma omp’ clause before ‘proc_bind’
#pragma omp parallel proc_bind(master)
^
/root/repo/src/common/elastic_net_ptr.cpp: In function ‘double h2o4gpu::ElasticNetptr_predict(char, int, int, int, int, int, int, int, char, size_t, size_t, size_t, int, int, double, double, int, int, int, double, double, T*, T*, double, double, int, int, double, int, int, T*, T*, T*, T*, T*, int, T**, T**, T**, T**, size_t*, size_t*, size_t*)’:
/root/repo/src/common/elastic_net_ptr.cpp:1507:22: error: expected ‘#pragma omp’ clause before ‘proc_bind’
#pragma omp parallel proc_bind(master)
^
make[3]: *** [CMakeFiles/commonh2o4gpu.dir/src/common/elastic_net_ptr.cpp.o] Error 1
Not sure if related. We do use ccache.
Also, the current h2o4gpu produces completely wrong predictions and actuals for that mapd notebook. The target is income earned, which should be on the order of 20000+, but the actuals are on the order of 0.2-0.8 and the predictions are all over the place within 0-1.0.
So the gpu pointer memory path is probably not working. I will see whether an older h2o4gpu works, or write a better unit test that compares the results of the non-ptr and ptr methods.
Regarding:
/root/repo/src/common/elastic_net_ptr.cpp:1507:22: error: expected ‘#pragma omp’ clause before ‘proc_bind’
#pragma omp parallel proc_bind(master)
This is because proc_bind requires OpenMP >= 4.0 (or 4.0.1, I don't remember), which comes bundled with gcc >= 4.9, and I think the gcc we use in the docker image is 4.8.x. We do install libgomp explicitly, but I'm not sure which version it installs on power machines. Logs for XGBoost show that its CMakeLists finds OpenMP 3.1 by default:
18:27:30 -- Found OpenMP: TRUE (found version "3.1")
So we need to check whether that is because there are multiple versions on that system (the original one plus the one we installed), or because yum install libgomp installs an old version and we need to compile a newer one by hand.
Maybe setting a required version in our CMakeLists will be enough.
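As a possible stopgap on the source side, the pragma could be guarded by the _OPENMP macro, which compilers set to the release date of the OpenMP spec they implement (201107 for 3.1, 201307 for 4.0). The following is only a sketch: openmp_version_macro and count_region_threads are hypothetical helpers made up for illustration, not functions from elastic_net_ptr.cpp, and the real parallel regions would need the same guard applied around their existing bodies.

```cpp
// Returns the OpenMP date macro the compiler advertises, or 0 when
// OpenMP is disabled. 201107 means OpenMP 3.1; 201307 means OpenMP 4.0.
long openmp_version_macro() {
#ifdef _OPENMP
  return _OPENMP;
#else
  return 0;
#endif
}

// Hypothetical helper (not part of h2o4gpu): counts the threads that
// enter a parallel region. proc_bind is requested only when the compiler
// implements OpenMP >= 4.0; older compilers get a plain parallel region,
// and a build without OpenMP runs the block once, serially.
int count_region_threads() {
  int n = 0;
#if defined(_OPENMP) && _OPENMP >= 201307
#pragma omp parallel proc_bind(master) reduction(+ : n)
#elif defined(_OPENMP)
#pragma omp parallel reduction(+ : n)
#endif
  {
    n += 1;  // each thread contributes 1 to the reduction
  }
  return n;
}
```

Printing openmp_version_macro() from a small program built inside the centos docker image would also show which OpenMP level that gcc actually targets. Note that the guard only silences the compile error; on gcc 4.8.x the thread binding requested by proc_bind(master) would simply not be applied.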