arrayfire / arrayfire-rust

Rust wrapper for ArrayFire
BSD 3-Clause "New" or "Revised" License

Incorrect output of `print` and `af_print` under osx/opencl #83

Closed horasal closed 7 years ago

horasal commented 8 years ago

On my Mac Pro (with an ATI Radeon HD 5770), both the print function and the af_print macro print incorrect values.

code:

#[macro_use(af_print)]
extern crate arrayfire as af;

use af::*;

fn main() {
    let available = get_available_backends();
    // Select CUDA - OPENCL - CPU
    if available.contains(&Backend::CUDA) {
       set_backend(Backend::CUDA);
       println!("There are {} CUDA compute devices", device_count());
    } else if available.contains(&Backend::OPENCL) {
       set_backend(Backend::OPENCL);
       println!("There are {} OPENCL compute devices", device_count());
    } else if available.contains(&Backend::CPU) {
       println!("Evaluating CPU Backend...");
       set_backend(Backend::CPU);
       println!("There are {} CPU compute devices", device_count());
    }
    info();
    let dims = Dim4::new(&[10,10,1,1]);
    let a = randu::<f32>(dims);
    let b = randu::<f32>(dims);
    // print
    print(&a);
    print(&b);
    // macro print
    af_print!("", a);
    af_print!("", b);

    // Copy the array into a host Vec first, then print it (only this path prints correct values).
    let mut buf = vec![0f32; 100];
    a.host(&mut buf);
    println!("{:?}", buf);

}

Output (expected random values in [0, 1], got nan):

$ cargo run
     Running `target/debug/af`
There are 1 OPENCL compute devices
ArrayFire v3.3.2 (OpenCL, 64-bit Mac OSX, build f65dd97)
[0] APPLE   : ATI Radeon HD 5770, 1024 MB
No Name Array
[10 10 1 1]
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 

No Name Array
[10 10 1 1]
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 

No Name Array
[10 10 1 1]
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 

No Name Array
[10 10 1 1]
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 
       nan        nan        nan        nan        nan        nan        nan        nan        nan        nan 

[0.41073772, 0.8223705, 0.95179963, 0.17936543, 0.41982353, 0.008073494, 0.37754178, 0.3026608, 0.64556795, 0.55907905, 0.6600012, 0.0764177, 0.09007972, 0.593275, 0.10984313, 0.104641184, 0.8826711, 0.16470726, 0.8059861, 0.5937568, 0.839482, 0.19334152, 0.7270495, 0.03218992, 0.0011899231, 0.87032217, 0.5259499, 0.14427514, 0.32533863, 0.5081013, 0.92498755, 0.30630583, 0.9312504, 0.86840326, 0.6591691, 0.43871564, 0.37842128, 0.40023077, 0.43901545, 0.47176215, 0.6529656, 0.54763544, 0.85767967, 0.836985, 0.061794102, 0.42239332, 0.529269, 0.021239838, 0.11028498, 0.44200444, 0.8355114, 0.48783612, 0.2055233, 0.17935656, 0.56057334, 0.67667997, 0.67417336, 0.452302, 0.12359583, 0.79240143, 0.103285834, 0.21190202, 0.5955207, 0.37446535, 0.91645145, 0.9425949, 0.48174524, 0.9097086, 0.6821017, 0.605556, 0.9275532, 0.8661627, 0.35776076, 0.6262612, 0.9746776, 0.20015794, 0.20222169, 0.5388589, 0.55094266, 0.81801903, 0.6742259, 0.97126704, 0.9877892, 0.06626553, 0.6042976, 0.59000194, 0.7481744, 0.23789643, 0.31214234, 0.89567167, 0.28086075, 0.30839512, 0.19651483, 0.14590272, 0.25506493, 0.5745967, 0.36144972, 0.4470784, 0.083551034, 0.7877617]
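
Since copying the array into a host Vec is the only path that produced sane values here, one way to compare backends without relying on the printing functions is to check the host buffer directly. The following is a minimal sketch in plain Rust with no ArrayFire calls; `BufferReport` and `check_unit_range` are hypothetical helper names introduced for illustration and are not part of the arrayfire crate.

```rust
/// Summary of a host-side buffer that is expected to hold samples in [0, 1].
/// (Hypothetical helper type, not part of the arrayfire crate.)
#[derive(Debug, PartialEq)]
pub struct BufferReport {
    pub nan: usize,
    pub out_of_range: usize,
    pub ok: usize,
}

/// Classify every element of `buf` as NaN, outside [0, 1], or valid.
pub fn check_unit_range(buf: &[f32]) -> BufferReport {
    let mut report = BufferReport { nan: 0, out_of_range: 0, ok: 0 };
    for &v in buf {
        if v.is_nan() {
            report.nan += 1;
        } else if !(0.0..=1.0).contains(&v) {
            // Covers infinities and garbage values of huge magnitude.
            report.out_of_range += 1;
        } else {
            report.ok += 1;
        }
    }
    report
}
```

After `a.host(&mut buf)`, printing `check_unit_range(&buf)` with `{:?}` would show at a glance whether the copied data contains any nan or out-of-range entries; for randu output every element would be expected to land in `ok`.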
horasal commented 8 years ago

Of course, the CPU backend works.

There are 1 CPU compute devices
ArrayFire v3.3.2 (CPU, 64-bit Mac OSX, build f65dd97)
[0] Unknown: Unknown, 65536 MB, Max threads(1) 
No Name Array
[10 10 1 1]
    0.5488     0.6459     0.7917     0.0871     0.9786     0.6399     0.2646     0.6176     0.3595     0.6706 
    0.5928     0.3844     0.8122     0.6482     0.4736     0.5820     0.1863     0.1497     0.6131     0.1709 
    0.7152     0.4376     0.5289     0.0202     0.7992     0.1434     0.7742     0.6121     0.4370     0.2104 
    0.8443     0.2975     0.4800     0.3682     0.8009     0.5374     0.7369     0.2223     0.9023     0.3582 
    0.6028     0.8918     0.5680     0.8326     0.4615     0.9447     0.4562     0.6169     0.6976     0.1289 
    0.8579     0.0567     0.3928     0.9572     0.5205     0.7586     0.2166     0.3865     0.0993     0.7507 
    0.5449     0.9637     0.9256     0.7782     0.7805     0.5218     0.5684     0.9437     0.0602     0.3154 
    0.8473     0.2727     0.8361     0.1404     0.6789     0.1059     0.1352     0.9026     0.9698     0.6078 
    0.4237     0.3834     0.0710     0.8700     0.1183     0.4147     0.0188     0.6818     0.6668     0.3637 
    0.6236     0.4777     0.3374     0.8701     0.7206     0.4736     0.3241     0.4499     0.6531     0.3250 

No Name Array
[10 10 1 1]
    0.5702     0.1613     0.1590     0.3687     0.9765     0.0392     0.3180     0.2654     0.3186     0.1832 
    0.0384     0.9953     0.3380     0.6625     0.8782     0.4417     0.8805     0.5090     0.0094     0.3978 
    0.4386     0.6531     0.1104     0.8210     0.4687     0.2828     0.4143     0.5232     0.6674     0.5865 
    0.6343     0.5819     0.6748     0.0136     0.5096     0.9796     0.9182     0.9167     0.8423     0.5528 
    0.9884     0.2533     0.6563     0.0971     0.9768     0.1202     0.0641     0.0939     0.1318     0.0201 
    0.9589     0.4144     0.3172     0.6228     0.0557     0.3594     0.2168     0.9212     0.6472     0.1649 
    0.1020     0.4663     0.1382     0.8379     0.6048     0.2961     0.6925     0.5759     0.7163     0.8289 
    0.6528     0.4747     0.7783     0.6737     0.4512     0.4809     0.5652     0.0831     0.8414     0.3698 
    0.2089     0.2444     0.1966     0.0961     0.7393     0.1187     0.5666     0.9293     0.2894     0.0047 
    0.6351     0.6235     0.9496     0.9719     0.0200     0.6887     0.8651     0.2777     0.2647     0.1464 

No Name Array
[10 10 1 1]
    0.5488     0.6459     0.7917     0.0871     0.9786     0.6399     0.2646     0.6176     0.3595     0.6706 
    0.5928     0.3844     0.8122     0.6482     0.4736     0.5820     0.1863     0.1497     0.6131     0.1709 
    0.7152     0.4376     0.5289     0.0202     0.7992     0.1434     0.7742     0.6121     0.4370     0.2104 
    0.8443     0.2975     0.4800     0.3682     0.8009     0.5374     0.7369     0.2223     0.9023     0.3582 
    0.6028     0.8918     0.5680     0.8326     0.4615     0.9447     0.4562     0.6169     0.6976     0.1289 
    0.8579     0.0567     0.3928     0.9572     0.5205     0.7586     0.2166     0.3865     0.0993     0.7507 
    0.5449     0.9637     0.9256     0.7782     0.7805     0.5218     0.5684     0.9437     0.0602     0.3154 
    0.8473     0.2727     0.8361     0.1404     0.6789     0.1059     0.1352     0.9026     0.9698     0.6078 
    0.4237     0.3834     0.0710     0.8700     0.1183     0.4147     0.0188     0.6818     0.6668     0.3637 
    0.6236     0.4777     0.3374     0.8701     0.7206     0.4736     0.3241     0.4499     0.6531     0.3250 

No Name Array
[10 10 1 1]
    0.5702     0.1613     0.1590     0.3687     0.9765     0.0392     0.3180     0.2654     0.3186     0.1832 
    0.0384     0.9953     0.3380     0.6625     0.8782     0.4417     0.8805     0.5090     0.0094     0.3978 
    0.4386     0.6531     0.1104     0.8210     0.4687     0.2828     0.4143     0.5232     0.6674     0.5865 
    0.6343     0.5819     0.6748     0.0136     0.5096     0.9796     0.9182     0.9167     0.8423     0.5528 
    0.9884     0.2533     0.6563     0.0971     0.9768     0.1202     0.0641     0.0939     0.1318     0.0201 
    0.9589     0.4144     0.3172     0.6228     0.0557     0.3594     0.2168     0.9212     0.6472     0.1649 
    0.1020     0.4663     0.1382     0.8379     0.6048     0.2961     0.6925     0.5759     0.7163     0.8289 
    0.6528     0.4747     0.7783     0.6737     0.4512     0.4809     0.5652     0.0831     0.8414     0.3698 
    0.2089     0.2444     0.1966     0.0961     0.7393     0.1187     0.5666     0.9293     0.2894     0.0047 
    0.6351     0.6235     0.9496     0.9719     0.0200     0.6887     0.8651     0.2777     0.2647     0.1464 

[0.5488135, 0.5928446, 0.71518934, 0.84426576, 0.60276335, 0.8579456, 0.5448832, 0.8472517, 0.4236548, 0.6235637, 0.6458941, 0.3843817, 0.4375872, 0.2975346, 0.891773, 0.056712978, 0.96366274, 0.2726563, 0.3834415, 0.47766513, 0.79172504, 0.8121687, 0.5288949, 0.47997716, 0.56804454, 0.3927848, 0.92559665, 0.83607876, 0.071036056, 0.33739617, 0.087129295, 0.6481719, 0.020218398, 0.36824155, 0.83261985, 0.95715517, 0.77815676, 0.14035077, 0.87001216, 0.87008727, 0.9786183, 0.47360805, 0.7991586, 0.8009108, 0.46147937, 0.5204775, 0.7805292, 0.67887956, 0.11827442, 0.7206327, 0.639921, 0.5820198, 0.14335328, 0.53737324, 0.9446689, 0.7586156, 0.5218483, 0.105907604, 0.41466194, 0.47360042, 0.2645556, 0.18633235, 0.7742337, 0.73691815, 0.45615032, 0.21655035, 0.56843394, 0.13521817, 0.018789798, 0.324141, 0.6176355, 0.14967486, 0.6120957, 0.22232139, 0.616934, 0.38648897, 0.94374806, 0.9025985, 0.6818203, 0.44994998, 0.3595079, 0.61306345, 0.43703195, 0.9023486, 0.6976312, 0.09928035, 0.060225468, 0.96980906, 0.6667667, 0.65314, 0.67063785, 0.17090958, 0.21038257, 0.35815218, 0.12892629, 0.75068617, 0.31542835, 0.60783064, 0.36371076, 0.32504722]
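
In the CPU run above, the first ten values of the host vector match the first column of the first printed array (0.5488, 0.5928, 0.7152, ...), which suggests the host buffer is laid out in column-major order. Under that assumption, the host copy can be formatted in the same row/column layout as the printed arrays so the two can be compared line by line. This is only a sketch in plain Rust; `format_column_major` is a hypothetical helper, and the width and precision are chosen to resemble the output above rather than taken from ArrayFire.

```rust
/// Format a column-major buffer of `rows * cols` values as one string per
/// row, with four decimals per value. (Hypothetical helper for comparing a
/// host copy against printed output.)
pub fn format_column_major(buf: &[f32], rows: usize, cols: usize) -> Vec<String> {
    assert_eq!(buf.len(), rows * cols, "buffer length must equal rows * cols");
    (0..rows)
        .map(|r| {
            (0..cols)
                // In column-major order, element (r, c) lives at c * rows + r.
                .map(|c| format!("{:10.4}", buf[c * rows + r]))
                .collect::<Vec<_>>()
                .join(" ")
        })
        .collect()
}
```

For the 10x10 arrays in this report, `format_column_major(&buf, 10, 10)` yields ten lines that can be printed with `println!` and compared against the output of print.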
9prady9 commented 8 years ago

@zhaihj We will look into it and update here as soon as we can.

horasal commented 8 years ago

I installed arrayfire through homebrew:

$ brew info arrayfire
homebrew/science/arrayfire: stable 3.3.2 (bottled)
General purpose GPU library
http://arrayfire.com
/usr/local/Cellar/arrayfire/3.3.2 (297 files, 90.5M)
  Poured from bottle on 2016-09-05 at 14:30:29
From: https://github.com/Homebrew/homebrew-science/blob/master/arrayfire.rb
==> Dependencies
Build: cmake ✔, boost ✔, boost-compute ✘, pkg-config ✔
Required: freeimage ✔, fftw ✔, clblas ✔, clfft ✔, fontconfig ✔, homebrew/versions/glfw3 ✔

I'm not familiar with OpenCL, but the following code reports that my machine supports OpenCL 1.2, and Apple also says that it supports OpenCL 1.2 (https://support.apple.com/en-us/HT202823).

// build and run this gives:
// Device Intel(R) Xeon(R) CPU E5645  @ 2.40GHz supports OpenCL 1.2 
// Device ATI Radeon HD 5770 supports OpenCL 1.2 
#include <stdio.h>
#include <stdlib.h>
#include <OpenCL/opencl.h>
int main(int argc, char* const argv[]) {
        cl_uint num_devices = 0, i;
        // The first call only queries how many devices exist.
        if (clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, 0, NULL, &num_devices) != CL_SUCCESS
            || num_devices == 0) {
            fprintf(stderr, "No OpenCL devices found\n");
            return 1;
        }
        // calloc takes (element count, element size).
        cl_device_id* devices = calloc(num_devices, sizeof(cl_device_id));
        clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, num_devices, devices, NULL);
        char buf[128];
        for (i = 0; i < num_devices; i++) {
            clGetDeviceInfo(devices[i], CL_DEVICE_NAME, sizeof(buf), buf, NULL);
            fprintf(stdout, "Device %s supports ", buf);
            clGetDeviceInfo(devices[i], CL_DEVICE_VERSION, sizeof(buf), buf, NULL);
            fprintf(stdout, "%s\n", buf);
        }

        free(devices);
        return 0;
}

uname -a:

$ uname -a
Darwin Server-Mac-Pro.local 13.4.0 Darwin Kernel Version 13.4.0: Mon Jan 11 18:17:34 PST 2016; root:xnu-2422.115.15~1/RELEASE_X86_64 x86_64
9prady9 commented 8 years ago

@zhaihj Can you please check whether you can reproduce the problem in the ArrayFire C++ examples? You can compile a simple code stub such as the following against the ArrayFire C++ library and run it.

#include <arrayfire.h>

using namespace af;

int main(void)
{
    af::info();
    af::array a = af::randu(dim4(10, 10, 1, 1));
    af_print(a);
    return 0;
}

Compile command:

g++ test.cpp -I$AF_PATH/include -L$AF_PATH/lib -laf

This will use the unified API library libaf.dylib, which should automatically pick the OpenCL backend since that is your available compute backend (make sure $AF_PATH/lib is added to your LD_LIBRARY_PATH and DYLD_LIBRARY_PATH).
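
To double-check the library path part from the Rust side, something like the following could be used. It is a minimal sketch using only the standard library; `path_list_contains` and `env_lists_dir` are hypothetical helper names, and the directory passed in would be whatever $AF_PATH/lib expands to on the machine in question.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::Path;

/// Return true if `dir` is one of the entries of a PATH-style list, such as
/// the value of DYLD_LIBRARY_PATH or LD_LIBRARY_PATH.
pub fn path_list_contains(list: &OsStr, dir: &Path) -> bool {
    env::split_paths(list).any(|entry| entry == dir)
}

/// Look up `var` in the current environment and check whether it lists `dir`.
/// An unset variable counts as "not listed".
pub fn env_lists_dir(var: &str, dir: &Path) -> bool {
    env::var_os(var)
        .map(|value| path_list_contains(&value, dir))
        .unwrap_or(false)
}
```

Calling `env_lists_dir("DYLD_LIBRARY_PATH", dir)` at the start of main, with `dir` pointing at the ArrayFire lib directory, reports whether the variable as seen by the running process actually contains that directory.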

This should help us narrow down if the problem is in the rust wrapper or in the arrayfire source.

Thanks, Pradeep.

horasal commented 8 years ago

It seems that the C++ version works well:

C++ version:

ArrayFire v3.3.2 (OpenCL, 64-bit Mac OSX, build f65dd97)
[0] APPLE   : ATI Radeon HD 5770, 1024 MB
a
[10 10 1 1]
    0.7269     0.7104     0.5201     0.3569     0.1437     0.4563     0.3341     0.0899     0.5363     0.1349 
    0.2244     0.0599     0.7576     0.7062     0.9478     0.0178     0.6629     0.9607     0.4004     0.2012 
    0.8294     0.1118     0.4226     0.7078     0.3336     0.1020     0.8054     0.3436     0.7820     0.8084 
    0.4602     0.2877     0.3083     0.3718     0.1538     0.6503     0.5727     0.3438     0.2109     0.0192 
    0.2556     0.2397     0.8256     0.7536     0.3460     0.9052     0.7406     0.8153     0.2797     0.8466 
    0.5561     0.2254     0.2669     0.8718     0.9965     0.1374     0.5217     0.8824     0.5817     0.6515 
    0.4073     0.0772     0.5077     0.4199     0.5602     0.6423     0.5756     0.1823     0.4172     0.7074 
    0.8573     0.7759     0.1810     0.4461     0.1111     0.9277     0.5143     0.1571     0.5894     0.0666 
    0.9413     0.4037     0.3338     0.7561     0.0509     0.3232     0.0764     0.1764     0.3680     0.0314 
    0.1316     0.2974     0.5785     0.7769     0.7075     0.4222     0.4730     0.0388     0.0532     0.7963 

I tried the Rust version again with cargo run --release; this time it gives strange numbers instead of nans:

Rust version (release):

$ cargo run --release
     Running `target/release/aftest`
There are 1 OPENCL compute devices
ArrayFire v3.3.2 (OpenCL, 64-bit Mac OSX, build f65dd97)
[0] APPLE   : ATI Radeon HD 5770, 1024 MB
No Name Array
[10 10 1 1]
    0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000 
    0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000 
    0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000 
    0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000 
    0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0001     0.0000     0.0000 
    0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0078 -495175986199361592024367104.0000 -162801750156472630968736903471023783936.0000 
    0.0000 2361183241434822606848.0000 -34045175803965864210251117469106176.0000        nan     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000 
    0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000 
    0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000 
    0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000 

No Name Array
[10 10 1 1]
    0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000 
    0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000 
    0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000 
    0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000 
    0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0001     0.0000     0.0000 
    0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0078 -495175986199361592024367104.0000 -162801750156472630968736903471023783936.0000 
    0.0000 2361183241434822606848.0000 -34045175803965864210251117469106176.0000        nan     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000 
    0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000 
    0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000 
    0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000     0.0000 
9prady9 commented 8 years ago

Thank you, that at least narrows it down to the Rust wrapper. I will investigate whether there are any potential pitfalls in the Rust wrapper when copying data to the host-side vector.

9prady9 commented 8 years ago

As it happens, the af_print macro just wraps the arrayfire::print function, which in turn just calls the ArrayFire native function af_print_array (an FFI call). So there isn't any Rust-specific code in this portion other than syntactic sugar for the print function.

Maybe you can try the following:

println!("ArrayFire native handle is {:?}", a.get());

and see if the Array native handles are valid.

Unfortunately, we don't have this card in-house, so we are not able to reproduce the problem on our end. I can only suggest code snippets for you to debug the problem on your end.

9prady9 commented 8 years ago

@zhaihj Did you get a chance to try out the single-line snippet above?

You may also change the data generator function to see if randu is the culprit. You can try the following instead of randu:

let a = arrayfire::constant::<f32>(1 as f32, dims);
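
With a constant-filled array, every element copied back to the host should equal the fill value, so any deviation is easy to spot. The check below is a minimal sketch in plain Rust; `mismatches` is a hypothetical helper name and is not part of the arrayfire crate.

```rust
/// Return the index and value of every element that differs from `expected`.
/// NaN never compares equal to anything, so NaN entries are always reported.
pub fn mismatches(buf: &[f32], expected: f32) -> Vec<(usize, f32)> {
    buf.iter()
        .copied()
        .enumerate()
        .filter(|&(_, v)| v != expected)
        .collect()
}
```

After copying the constant array into a host buffer with `a.host(&mut buf)`, an empty result from `mismatches(&buf, 1.0)` would indicate that the data on the device is intact; a non-empty result lists which positions came back wrong.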
9prady9 commented 8 years ago

@zhaihj Are you still facing this issue?

horasal commented 8 years ago

Recently I've mainly been working on another PC. I will try the new version of ArrayFire on the Mac Pro if I get time.

9prady9 commented 7 years ago

@zhaihj If this is not an issue anymore, kindly close the issue.