arrayfire / arrayfire-rust

Rust wrapper for ArrayFire
BSD 3-Clause "New" or "Revised" License

[BUG] Segfault error message when matmul function is used. #243

BA8F0D39 closed this issue 4 years ago

BA8F0D39 commented 4 years ago

Description

Segfault error message when matmul function is used.

Reproducible Code and/or Steps

    fn main() {
        let n: u64 = 1000;

        // Two n x n matrices of normally distributed f32 values.
        let a_dims = arrayfire::Dim4::new(&[n, n, 1, 1]);
        let a = arrayfire::randn::<f32>(a_dims);

        let b_dims = arrayfire::Dim4::new(&[n, n, 1, 1]);
        let b = arrayfire::randn::<f32>(b_dims);

        // The segfault is reported when this call is made.
        let c = arrayfire::matmul(&a, &b, arrayfire::MatProp::NONE, arrayfire::MatProp::NONE);
    }

System Information

    ArrayFire v3.7.0 (CUDA, 64-bit Linux, build fbea2ae)
    Platform: CUDA Runtime 10.1, Driver: 440.82
    [0] TITAN Xp, 12195 MB, CUDA Compute 6.1
    Arrayfire version: (3, 7, 0)
    Name: TITAN_Xp
    Platform: CUDA
    Toolkit: v10.1
    Compute: 6.1
    Revision: fbea2ae

9prady9 commented 4 years ago

@BA8F0D39 Does the output change if you run your program with the environment variable AF_PRINT_ERRORS set to 1?

You can set it using export AF_PRINT_ERRORS=1

Update: I was able to reproduce this on the current master of arrayfire. Please also try the 3.7.2 fix release.
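
For anyone who would rather not export the variable in their shell, here is a minimal sketch of setting AF_PRINT_ERRORS for a single run from Rust, using only the standard library. The helper name `with_af_print_errors` and the binary path `./target/debug/matmul_repro` are hypothetical, chosen for illustration; substitute the path of your own reproduction binary.

```rust
use std::process::Command;

/// Builds a command for `program` with AF_PRINT_ERRORS=1 set for the child
/// process only. The parent's environment is left untouched.
/// (Hypothetical helper, not part of the arrayfire crate.)
fn with_af_print_errors(program: &str) -> Command {
    let mut cmd = Command::new(program);
    cmd.env("AF_PRINT_ERRORS", "1");
    cmd
}

fn main() {
    // Hypothetical path to the compiled reproduction program.
    let cmd = with_af_print_errors("./target/debug/matmul_repro");

    // Show what would be launched. Calling `cmd.status()` on a mutable
    // binding would actually run the program with the variable set.
    println!("{:?}", cmd);
}
```

Setting the variable on the child process keeps it scoped to one run, which is roughly equivalent to prefixing the command with AF_PRINT_ERRORS=1 in a shell.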

BA8F0D39 commented 4 years ago

@9prady9 Why is it throwing a segmentation fault?

My machine crashed 10 hours after running the code.

Edit: Resolved in 3.7.2
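
A small sketch of guarding against the affected versions, given that 3.7.0 showed the segfault and 3.7.2 resolved it. It assumes the version is available as a `(major, minor, patch)` tuple like the `(3, 7, 0)` shown in the system information above (the crate's `arrayfire::get_version()` should return one, though that call is left out here so the example has no dependencies). The name `has_reported_fix` is hypothetical, and treating every version after 3.7.2 as fixed is an assumption, not something this thread confirms.

```rust
/// Version in which this thread reports the matmul segfault as resolved.
const FIXED_IN: (i32, i32, i32) = (3, 7, 2);

/// Returns true if `version` is at least 3.7.2.
/// Assumption: releases after 3.7.2 keep the fix.
/// (Hypothetical helper, not part of the arrayfire crate.)
fn has_reported_fix(version: (i32, i32, i32)) -> bool {
    // Tuples compare lexicographically: major, then minor, then patch.
    version >= FIXED_IN
}

fn main() {
    // Version reported in the system information of this issue.
    let reported = (3, 7, 0);
    if has_reported_fix(reported) {
        println!("{:?} includes the reported fix", reported);
    } else {
        println!("{:?} predates the reported fix; consider upgrading to 3.7.2", reported);
    }
}
```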

9prady9 commented 4 years ago

I am not sure the crash and the code run are even related if the two events are separated by such a long time. How did you conclude that the crash was caused by this particular run from 10 hours ago?

BA8F0D39 commented 4 years ago

It was the only program running on the machine. I left it running overnight when I went to sleep.