tensorflow / tfjs

A WebGL accelerated JavaScript library for training and deploying ML models.
https://js.tensorflow.org
Apache License 2.0
18.25k stars 1.92k forks

Finish placeholder features: einsum gradients, conv1d causal padding #8277

Open admidelu opened 1 month ago

admidelu commented 1 month ago

Describe the feature and the current behavior/state.

  1. Training layers that use tf.einsum operations fails: "Error: Cannot compute gradient: gradient function not found for Einsum"

  2. conv1d with causal padding fails: "The support for CAUSAL padding mode in conv1dWithBias is not implemented yet."

These features have been requested since 2018; if it counts for anything, they would still be highly appreciated.

gaikwadrahul8 commented 1 month ago

Hi, @admidelu

I apologize for the delayed response, and thank you for bringing this issue to our attention. If possible, could you please share your GitHub repo or a code snippet, along with the complete steps to replicate the behavior, so we can investigate this issue further on our end?

Thank you for your cooperation and patience.

admidelu commented 1 month ago

Hey, will these work?

Einsum in custom layer

// Define a custom layer using tf.einsum
class EinsumLayer extends tf.layers.Layer {
  constructor() {
    super({});
  }

  build(inputShape) {
    // Create the weight once as a trainable layer variable, rather than
    // sampling a fresh random tensor on every call (which would make
    // predictions non-deterministic and the weight untrainable).
    this.kernel = this.addWeight(
        'kernel', [inputShape[inputShape.length - 1], 64], 'float32',
        tf.initializers.glorotNormal({}));
  }

  call(inputs) {
    // Batched matrix multiplication via einsum:
    // Input shape:  [batchSize, m, k]
    // Weight shape: [k, n]
    // Output shape: [batchSize, m, n]
    return tf.einsum('bij,jk->bik', inputs[0], this.kernel.read());
  }

  computeOutputShape(inputShape) {
    return [inputShape[0], inputShape[1], 64];
  }

  static get className() {
    return "EinsumLayer";
  }
}

const inputShape = [10, 32];  // 10 timesteps, 32 features

const model = tf.sequential();
model.add(tf.layers.dense({inputShape: inputShape, units: 32, activation: 'relu'}));
model.add(new EinsumLayer());

model.compile({
  optimizer: 'adam',
  loss: 'meanSquaredError'
});

const input = tf.randomNormal([1, ...inputShape]);

// inference works
const output = model.predict(input);
output.print();

// training fails
const target = tf.randomNormal([1, 10, 64]);

model.fit(input, target, {
  epochs: 1,
  batchSize: 1
});

https://codepen.io/admidelu/pen/GRaZxdX?editors=0012&layout=top

conv1d with causal padding

const inputShape = [10, 1];  // 10 timesteps, 1 feature

const model = tf.sequential();
model.add(tf.layers.conv1d({
  inputShape: inputShape,
  filters: 8,
  kernelSize: 3,
  padding: 'causal',  // causal padding
}));

model.compile({
  optimizer: 'adam',
  loss: 'meanSquaredError'
});

const input = tf.randomNormal([1, ...inputShape]);

const output = model.predict(input);

input.print();
output.print();

https://codepen.io/admidelu/pen/xxNVWLg?editors=0012&layout=top
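In case it helps anyone hitting the same wall: causal padding is equivalent to left-padding the time axis with (kernelSize - 1) * dilationRate zeros and then running a 'valid' convolution, so tf.pad followed by conv1d with padding: 'valid' may work as an interim workaround (a sketch; I haven't wired it into a layer). The equivalence for a single channel, in plain JavaScript:

```javascript
// A 'valid' 1-D convolution: output has length(xs) - length(kernel) + 1.
function conv1dValid(xs, kernel) {
  const out = [];
  for (let t = 0; t + kernel.length <= xs.length; t++) {
    let sum = 0;
    for (let k = 0; k < kernel.length; k++) sum += xs[t + k] * kernel[k];
    out.push(sum);
  }
  return out;
}

// Causal conv1d = left-pad with (kernelSize - 1) zeros, then 'valid' conv.
// Output at step t then depends only on inputs at steps <= t, and the
// output length matches the input length.
function conv1dCausal(xs, kernel) {
  const padded = new Array(kernel.length - 1).fill(0).concat(xs);
  return conv1dValid(padded, kernel);
}

const xs = [1, 2, 3, 4];
const kernel = [1, 1, 1];  // sliding sum over the last 3 steps
console.log(conv1dCausal(xs, kernel));
// [1, 3, 6, 9]
```

With dilation, the left-pad amount becomes (kernelSize - 1) * dilationRate; the single-channel case above is just the simplest illustration.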

gaikwadrahul8 commented 1 month ago

Hi, @admidelu

Thank you for providing the CodePen examples for replication. I attempted to replicate the behavior with both examples. The first fails with Error: Cannot compute gradient: gradient function not found for Einsum, confirming that training layers that use tf.einsum operations is not currently supported. Additionally, the causal padding mode in tf.layers.conv1d is not yet implemented, although the 'same' and 'valid' padding modes work as expected.

I will discuss this issue in our upcoming internal meeting and provide an update soon. Thank you for bringing this to our attention; I really appreciate your valuable time and effort.

Thank you for your cooperation and patience.