Hello!
First of all, thank you for making it possible to handle convex optimization problems via backpropagation in TensorFlow and PyTorch. It is a nice way of combining machine learning and deep learning.
For my first tests with cvxpylayers I took the TensorFlow example and modified it a bit: I constructed a neural network that sits before the cvxpylayer in the forward pass (and therefore after the cvxpylayer in the backward pass). Unfortunately, I got the following error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: cannot compute MatMul as input #1(zero-based) was expected to be a float tensor but is a double tensor [Op:MatMul]
So, somehow the backward pass through the cvxpylayer produces a double tensor (tf.float64), which does not match the MatMul operation of the network in front of it, whose tensors are float tensors (tf.float32).
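If I interpret the error correctly, it is not specific to the optimization itself; it can be reproduced with plain TensorFlow ops by mixing the two dtypes in a MatMul (just a minimal sketch, the tensor names are made up for illustration):
import tensorflow as tf
# a float32 matrix, like the weights of the network in front of the cvxpylayer
w32 = tf.random.normal((3, 3), dtype=tf.float32)
# a float64 vector, like the gradient that comes back from the cvxpylayer
v64 = tf.random.normal((3, 1), dtype=tf.float64)
# raises the same InvalidArgumentError: cannot compute MatMul as input #1(zero-based)
# was expected to be a float tensor but is a double tensor [Op:MatMul]
tf.matmul(w32, v64)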
I simplified the problem in the following code:
import cvxpy as cp
import tensorflow as tf
from cvxpylayers.tensorflow import CvxpyLayer
n, m = 2, 3
x = cp.Variable(n)
A = cp.Parameter((m, n))
b = cp.Parameter(m)
constraints = [x >= 0]
objective = cp.Minimize(0.5 * cp.pnorm(A @ x - b, p=1))
problem = cp.Problem(objective, constraints)
assert problem.is_dpp()
cvxpylayer = CvxpyLayer(problem, parameters=[A, b], variables=[x])
A_tf = tf.Variable(tf.random.normal((m, n)))
b_tf = tf.Variable(tf.random.normal((m,)))
with tf.GradientTape() as tape:
    # b_iden is the matrix product of the [3, 3] identity matrix and the b_tf vector [3,],
    # so it should be numerically identical to b_tf
    b_iden = tf.squeeze(tf.matmul(tf.eye(3, 3), tf.expand_dims(b_tf, axis=1)))
    # solve the problem, setting the values of A, b to A_tf, b_iden
    solution, = cvxpylayer(A_tf, tf.cast(b_iden, dtype=tf.float32))
    summed_solution = tf.math.reduce_sum(solution)
# compute the gradient of the summed solution with respect to A, b
gradA, gradb = tape.gradient(summed_solution, [A_tf, b_tf])
print(gradA)
print(gradb)
In that code I want to calculate the gradients for A_tf and b_tf. In contrast to your example, b_tf is not fed directly to the cvxpylayer, but is first "modified" by a matrix product with an identity matrix of matching dimensions. So b_iden should be equal to b_tf, but for the backpropagation algorithm this makes a difference, since the gradients now have to flow back through the MatMul operation, which fails with the error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: cannot compute MatMul as input #1(zero-based) was expected to be a float tensor but is a double tensor [Op:MatMul]
Is there a workaround for this? Placing a neural network after the cvxpylayer is not a problem, as you demonstrated with the ReLU example; in that case no MatMul operation has to be fed a tf.float64 tensor.
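The only workaround I have come up with so far is to wrap the cvxpylayer between two tf.cast calls, so that the surrounding network stays in tf.float32 while the layer itself sees tf.float64. This is just a sketch, the wrapper name is made up, and it assumes the layer also accepts tf.float64 inputs (I have not verified that this is the intended usage); it reuses cvxpylayer, A_tf and b_tf from the code above:
def cvxpylayer_float32(A_in, b_in):
    # cast the float32 inputs up to float64 before they enter the cvxpylayer
    A64 = tf.cast(A_in, tf.float64)
    b64 = tf.cast(b_in, tf.float64)
    solution64, = cvxpylayer(A64, b64)
    # cast the solution back to float32; tf.cast casts the incoming gradient back to
    # the input dtype, so the gradient reaching the MatMul should be float32 again
    return tf.cast(solution64, tf.float32)
with tf.GradientTape() as tape:
    b_iden = tf.squeeze(tf.matmul(tf.eye(3, 3), tf.expand_dims(b_tf, axis=1)))
    solution = cvxpylayer_float32(A_tf, b_iden)
    summed_solution = tf.math.reduce_sum(solution)
gradA, gradb = tape.gradient(summed_solution, [A_tf, b_tf])
Would something like this be acceptable, or is there a cleaner way to control the dtype of the gradients that the cvxpylayer returns?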
Thank you for your response!
Marcel