KonduitAI / deeplearning4j

Eclipse Deeplearning4j, ND4J, DataVec and more - deep learning & linear algebra for Java/Scala with GPUs + Spark
http://deeplearning4j.konduit.ai
Apache License 2.0

Add support for CUDA 11.1 #552

Closed: saudet closed this 3 years ago

saudet commented 3 years ago

Before we can merge this, libnd4j needs to be updated. Currently with CUDA 11.1, we get compiler errors like this:

/home/saudet/projects/konduit/deeplearning4j/libnd4j/include/array/NDArray.hXX: In instantiation of ‘void sd::NDArray::assign(const T&, bool) [with T = double; <template-parameter-1-2> = void]’:
/home/saudet/projects/konduit/deeplearning4j/libnd4j/include/array/NDArray.hXX:1225:74:   required from here
/home/saudet/projects/konduit/deeplearning4j/libnd4j/include/array/NDArray.hXX:1221:37: error: cannot convert ‘sd::NDArray*’ to ‘const std::vector<const sd::NDArray*>&’
 1221 |     NDArray::prepareSpecialUse({this}, {&temp});
      |                                     ^~~~
      |                                     |
      |                                     sd::NDArray*
quickwritereader commented 3 years ago

Just checked it. An explicit call should be fine there; there were no other errors.

diff --git a/libnd4j/include/array/NDArray.hXX b/libnd4j/include/array/NDArray.hXX
index cfd9103430..e9c3bc1042 100644
--- a/libnd4j/include/array/NDArray.hXX
+++ b/libnd4j/include/array/NDArray.hXX
@@ -1218,9 +1218,9 @@ void NDArray::assign(const T& value, bool allowParallelism) {
     // just fire scalar
     auto temp = NDArrayFactory::create(dataType(), value, this->getContext());

-    NDArray::prepareSpecialUse({this}, {&temp});
+    NDArray::prepareSpecialUse(std::vector<const NDArray*>{this}, std::vector<const NDArray*>{&temp});
     NativeOpExecutioner::execScalar(getContext(), sd::scalar::CopyPws, buffer(), shapeInfo(), specialBuffer(), specialShapeInfo(), buffer(), shapeInfo(), specialBuffer(), specialShapeInfo(), temp.buffer(), temp.shapeInfo(), temp.specialBuffer(), temp.specialShapeInfo(), nullptr, allowParallelism);
-    NDArray::registerSpecialUse({this}, {&temp});
+    NDArray::registerSpecialUse(std::vector<const NDArray*>{this}, std::vector<const NDArray*>{&temp});
 }
 template ND4J_EXPORT void NDArray::assign(const double& value, bool allowParallelism);
 template ND4J_EXPORT void NDArray::assign(const float& value, bool allowParallelism);
saudet commented 3 years ago

@quickwritereader Please update my branch!

saudet commented 3 years ago

@quickwritereader If this works for you, please merge! Thanks