japaric-archived / linalg.rs

[INACTIVE]
Apache License 2.0
29 stars 1 forks source link

I can't correctly implement the (s)axpy operation #10

Closed tolgap closed 10 years ago

tolgap commented 10 years ago

I'm trying to expand the operations on matrices. I'm in the process of adding addition to Mat<T>. I use BlasAxpy::axpy to do this (even though it's only a level 1 BLAS operation. The LLVM backend will surely optimize a matrix-matrix addition as it's only a vector-vector addition) for the sake of consistency.

But I run into issues with wrong answers.

impl<T> Add<Mat<T>, Mat<T>> for Mat<T> where T: Clone + BlasAxpy + Zero + One {
    fn add(&self, rhs: &Mat<T>) -> Mat<T> {
        assert!(self.nrows() == rhs.nrows() && self.ncols() == rhs.ncols());

        let mut data = self.data.clone();

        let axpy = BlasAxpy::axpy(None::<T>);
        let n = &to_blasint(data.len());
        let alpha = &num::one();
        let x = rhs.data.as_ptr();
        let xinc = &to_blasint(0u); // also tried using num::zero::<T>()
        let y = data.as_mut_ptr();
        let yinc = &to_blasint(0u); // also tried using num::zero::<T>()
        unsafe { axpy(n, alpha, x, xinc, y, yinc) }

        Mat::new(data, self.nrows())
    }
}

When I try to debug the issues, using this simple script, I get strange answers:

fn main() {
    let a = mat![
        1f32, 2.0, 7.0;
        3.0, 4.0, 8.0;
    ];

    let b = mat![
        3f32, 4.0, 9.0;
        5.0, 6.0, 10.0;
    ];

    println!("{}", a + b)

}

Output is:

[19, 2, 7]
[3, 4, 8]

It only changes the first element (to some random? number). What am I doing wrong?

japaric commented 10 years ago

xinc and yinc must be set to one.

It only changes the first element (to some random? number). What am I doing wrong?

It's doing what you asked it to do, with xinc = yinc = 0 and n = 6, it's calling *y += alpha * *x six times.

tolgap commented 10 years ago

@japaric that fixed it, thank you. I thought xinc and yinc were supposed to increment all x and y elements by some value, hence I set it to 0.