raskr / rust-autograd

Tensors and differentiable operations (like TensorFlow) in Rust
MIT License

Alternatives for `tf.where()` #45

Closed laocaoshilaocao closed 2 years ago

laocaoshilaocao commented 3 years ago

Hi, I am working on a simple k-means clustering algorithm using autograd. I want to achieve the same result as this TF code:

assignments = tf.constant([1, 0, 0])
c = 1
tf.where(tf.equal(assignments, c))

It simply returns the indices where the condition is true. Do you have any idea of a simple alternative in autograd, or should I write my own custom op again?
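For reference, what `tf.where(tf.equal(assignments, c))` computes is just "indices where the predicate holds". A std-only Rust sketch (names are illustrative, not autograd API):

```rust
// Hypothetical std-only sketch: collect the indices where a predicate holds,
// which is what tf.where(tf.equal(assignments, c)) returns.
fn where_eq(assignments: &[i64], c: i64) -> Vec<usize> {
    assignments
        .iter()
        .enumerate()
        .filter(|(_, &a)| a == c)
        .map(|(i, _)| i)
        .collect()
}

fn main() {
    let assignments = [1, 0, 0];
    println!("{:?}", where_eq(&assignments, 1)); // [0]
    println!("{:?}", where_eq(&assignments, 0)); // [1, 2]
}
```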

raskr commented 3 years ago

simple k-means clustering algorithm using autograd

Cool, would you add that to the examples dir if you can spare the time?

Do you have any idea of a simple alternative in autograd, or should I write my own custom op again?

Yes, you will probably have to implement a custom op...

laocaoshilaocao commented 3 years ago

Cool, would you add that to the examples dir if you can spare the time?

Yes, I can do that after I finish. The code is still quite "ugly" :)

Yes, you will probably have to implement a custom op...

I am trying to do that using existing graph methods, which I believe is more practical.

laocaoshilaocao commented 3 years ago

One related question: how can I update a variable using autograd? The situation is like:

ag::with(|g| {
    let centroids_value: Arc<RwLock<ag::NdArray<f64>>> = into_shared(rng.glorot_uniform(&[xxx]));
    let mut centroids = g.variable(centroids_value.clone());

    // iteration epoch
    for epoch in 0..2 {
        ag::with(|g| {
            let points: ag::Tensor<f64> = g.constant(array![[xxx]]);

            // update all centroids
            for iter in 0..xx {
                // (calculate the new centroid by mean calculation)
                centroids = g.concat(&[mean_iter, centroids], 0);
            }

            g.eval(&[centroids], &[]);
        });
    }
});

The error is g is a reference that is only valid in the closure body / g escapes the closure body, raised when I reassign centroids inside the inner closure. I think the corresponding method in TF is assign, but I can't find a similar operation in autograd. My TF version looks like this:

points = tf.constant(xxx)
centroids = tf.Variable(yyy)

means = []

for c in range(clusters_n):
    # (mean calculation)

new_centroids = tf.concat(means, 0)
update_centroids = tf.assign(centroids, new_centroids)

with tf.Session() as sess:
    sess.run(init)
    for step in range(iteration_n):
        [_, centroid_values, points_values, assignment_values] = sess.run(
            [update_centroids, centroids, points, assignments])
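For context, the per-cluster "mean calculation" the TF snippet elides can be sketched in plain Rust (std only; all names here are illustrative, not rust-autograd API):

```rust
// Std-only sketch of the per-cluster mean update the TF snippet performs.
// `points` are 2-D samples; `assignments[i]` is the cluster index of points[i].
fn update_centroids(points: &[[f64; 2]], assignments: &[usize], clusters_n: usize) -> Vec<[f64; 2]> {
    let mut sums = vec![[0.0f64; 2]; clusters_n];
    let mut counts = vec![0usize; clusters_n];
    for (p, &c) in points.iter().zip(assignments) {
        sums[c][0] += p[0];
        sums[c][1] += p[1];
        counts[c] += 1;
    }
    sums.iter()
        .zip(&counts)
        .map(|(s, &n)| {
            if n == 0 {
                [0.0, 0.0] // guard empty clusters instead of dividing by zero
            } else {
                [s[0] / n as f64, s[1] / n as f64]
            }
        })
        .collect()
}

fn main() {
    let points = [[0.0, 0.0], [2.0, 2.0], [10.0, 10.0]];
    let assignments = [0, 0, 1];
    println!("{:?}", update_centroids(&points, &assignments, 2));
    // cluster 0 mean = [1.0, 1.0], cluster 1 mean = [10.0, 10.0]
}
```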
raskr commented 3 years ago

If the epoch number is small, you can just swap the with and the for (but a large number of epochs would overflow the graph size). The preferred way is defining an Op using ndarray's assign. It would be great if you sent a PR about this!

raskr commented 3 years ago

SGD's impl may be useful as a reference for Assign.

laocaoshilaocao commented 3 years ago

swap the with and for

What do you mean by swapping the with and the for? I don't think ag::for is a valid expression.

SGD's impl may be useful for Assign.

I am going to have a look at that.

laocaoshilaocao commented 3 years ago

Hi, after finishing the development, I found that I always hit this error: thread 'main' panicked at 'Bad op impl of network::utils::Assign: cannot perform mutable borrowing for input(0)'

My assign code is as follows:

struct Assign;

impl ag::op::Op<f64> for Assign {
    fn compute(
        &self,
        ctx: &mut ag::op::ComputeContext<f64>,
    ) {
        ctx.input_mut(0).assign(&ctx.input(1));
        ctx.append_empty_output();
    }

    fn grad(&self, ctx: &mut ag::op::GradientContext<f64>) {
        ctx.append_input_grad(None);
    }
}

pub fn assign<'graph>(x: &Tensor<'graph>,  y: &Tensor<'graph>, g: &'graph ag::Graph<f64>)
-> Tensor<'graph> {
    ag::Tensor::builder()
           .set_inputs(&[Input::new_mut(x), Input::new(y)])
           .build(g, Assign)
}

Do you have any idea what the problem is?

My situation is like:

let a = (tensor after Adam optimization)
(run k-means and assign the result to a)
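As the earlier snippet's into_shared suggests, variable arrays here live behind Arc<RwLock<...>>, so an in-place assign needs a write lock on the destination. A std-only sketch of that pattern (names are illustrative, not rust-autograd API):

```rust
use std::sync::{Arc, RwLock};

// Std-only sketch of the shared-variable pattern: the destination lives behind
// Arc<RwLock<...>>, and "assign" takes a write lock to overwrite it in place.
fn assign(dst: &Arc<RwLock<Vec<f64>>>, src: &[f64]) {
    // blocks until no other reader/writer holds the lock
    let mut guard = dst.write().unwrap();
    guard.clear();
    guard.extend_from_slice(src);
}

fn main() {
    let var = Arc::new(RwLock::new(vec![1.0, 2.0, 3.0]));
    assign(&var, &[0.0, 0.0, 0.0]);
    println!("{:?}", *var.read().unwrap()); // [0.0, 0.0, 0.0]
}
```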
laocaoshilaocao commented 3 years ago

I also suffer a lot from NaN problems. I hit a situation like this:

let mean = means / assignments_num;

where assignments_num is:

[[0.0]] shape=[1, 1], strides=[1, 1]

and means is:

[[0.0, 0.0]] shape=[1, 2]

The resulting mean is always [[NaN, NaN]].

Do you have any idea about this situation?

raskr commented 3 years ago

Sorry for the late reply.

cannot perform mutable borrowing for input(0)

Are you declaring x as a variable using g.variable?

I also suffer from the NaN problem a lot.

That's because ndarray doesn't check for division by zero. Is it difficult to avoid the zero division?
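This matches IEEE-754 semantics: 0.0 / 0.0 is NaN, and an elementwise division just propagates it. A minimal std-only illustration with a guard on the denominator (the guard value 0.0 is an arbitrary choice for this sketch):

```rust
// 0.0 / 0.0 is NaN in IEEE-754 floats, and elementwise division inherits
// that; guarding the denominator before dividing avoids the NaN.
fn safe_div(num: f64, den: f64) -> f64 {
    if den == 0.0 { 0.0 } else { num / den }
}

fn main() {
    let means_entry = 0.0_f64;
    let assignments_num = 0.0_f64;
    assert!((means_entry / assignments_num).is_nan()); // the reported NaN
    println!("{}", safe_div(means_entry, assignments_num)); // 0
}
```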

laocaoshilaocao commented 3 years ago

Are you declaring x as a variable using g.variable?

Yes, I did.

That's because ndarray doesn't check for division by zero. Is it difficult to avoid the zero division?

It is a little hard, since the NaN comes from an exp calculation; I am trying to map NaN to inf using g.minimum.

raskr commented 3 years ago

I am trying to map NaN to inf using g.minimum

Cool. It might be worth considering handling zero division in the binary ops.

Yes, I did.

Oh really? This example passes:

struct Assign;

impl ag::op::Op<f64> for Assign {
    fn compute(
        &self,
        ctx: &mut ag::op::ComputeContext<f64>,
    ) {
        ctx.input_mut(0).assign(&ctx.input(1));
        ctx.append_empty_output();
    }

    fn grad(&self, ctx: &mut ag::op::GradientContext<f64>) {
        ctx.append_input_grad(None);
    }
}

type Tensor<'a> = ag::Tensor<'a, f64>;

pub fn assign<'graph>(x: &Tensor<'graph>,  y: &Tensor<'graph>, g: &'graph ag::Graph<f64>)
                      -> Tensor<'graph> {
    ag::Tensor::builder()
        .set_inputs(&[ag::tensor::Input::new_mut(x), ag::tensor::Input::new(y)])
        .build(g, Assign)
}

use ag::tensor::Variable; // brings g.variable into scope

#[test]
fn test() {
    ag::with(|g| {
        let a = g.variable(ndarray::arr1(&[1., 2., 3.]));
        let b = g.zeros(&[3]);
        let c = assign(&a, &b, g);
        c.eval(&[]);

        println!("{:?}", a.get_variable_array());
    });
}

Result:

Some(RwLock { data: [0.0, 0.0, 0.0] shape=[3], strides=[1], layout=C | F (0x3), dynamic ndim=1 })
raskr commented 2 years ago

You can now implement where using the new Tensor::map method.