I'm a bit stuck, trying to implement a custom loss function in C++ (with a custom gradient function as well).
Basically, I want to calculate ||z_i - mu_j||^2. z_i has shape (N_in) and is the input variable (i is the index into the minibatch; z_3 is the third entry in the minibatch). mu_j is the j-th column of a matrix of shape (N_in, M) and is not indexed by the minibatch. What I want is for the CNTK backend to broadcast the Minus operator between z and mu, where z is a minibatch (with the corresponding dynamically sized axis).
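Concretely, for each minibatch entry i and each centroid index j, that is d_ij = sum_k (z_ik - mu_kj)^2, i.e. one squared distance per (entry, centroid) pair.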
In Python, the following works:

import numpy as np

num_batches = argument.shape[0]       # argument holds the minibatch of z vectors
num_centroids = mu.shape[0]
z = argument.reshape(num_batches, 1, output_dim)
z_minus_mu = z - mu                   # broadcasts mu across the minibatch axis
z_minus_mu_sq = z_minus_mu ** 2
sq_distances_j = np.sum(z_minus_mu_sq, axis=2).reshape(num_batches, num_centroids, 1)
q_j_nom_inverse = 1 + sq_distances_j
[...]
I'm trying to make the same work in C++, but am a bit lost in the somewhat sparsely documented C++ API.

Greetings, Indriði
Is it possible to change np.sum to cntk.reduce_sum? Then it would be more straightforward to convert to C++ (and faster on the GPU, since it avoids CPU/GPU synchronization).
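To illustrate that suggestion, here is a minimal sketch, assuming mu is a learnable parameter of shape (num_centroids, output_dim) and z is the per-sample input with a dynamic batch axis (the sizes below are made up for the example):

import cntk as C

output_dim = 5                                   # hypothetical sizes, just for the sketch
num_centroids = 3

z = C.input_variable(output_dim)                 # per-sample input; the batch axis stays dynamic
mu = C.parameter((num_centroids, output_dim))    # learnable centroid matrix

z_b = C.reshape(z, (1, output_dim))              # add a length-1 centroid axis
z_minus_mu = z_b - mu                            # Minus broadcasts over the centroid axis
sq_distances_j = C.reduce_sum(C.square(z_minus_mu), axis=1)   # shape (num_centroids, 1)
q_j_nom_inverse = 1 + sq_distances_j

Because every step is now a graph node rather than a numpy call, the same chain should map fairly directly onto the corresponding C++ ops (Reshape, Minus, ElementTimes, ReduceSum), and the whole computation stays on the GPU.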