Open axsaucedo opened 4 years ago
It seems there is a stdlib folder with some core functions, which has allowed me to get the SPIR-V to compile. I still see strange behaviour in the model, as the outputs don't align with the C++ implementation, and given that the computation is deterministic, the outputs should be the same.
It does look quite positive overall, but it would be good to get further clarity on whether the function extensions are in a usable state and whether there are any best practices, as well as any ideas on why the behaviour does not match the shader provided above.
from kp import Tensor, Manager, Sequence, log_level
from pyshader import python2shader, f32, ivec3, Array, vec2
from pyshader.stdlib import exp, log
@python2shader
def compute_shader(
index = ("input", "GlobalInvocationId", ivec3),
x_i = ("buffer", 0, Array(f32)),
x_j = ("buffer", 1, Array(f32)),
y = ("buffer", 2, Array(f32)),
w_in = ("buffer", 3, Array(f32)),
w_out_i = ("buffer", 4, Array(f32)),
w_out_j = ("buffer", 5, Array(f32)),
b_in = ("buffer", 6, Array(f32)),
b_out = ("buffer", 7, Array(f32)),
l_out = ("buffer", 8, Array(f32)),
M = ("buffer", 9, Array(f32))):
i = index.x
m = M[0]
w_curr = vec2(w_in[0], w_in[1])
b_curr = b_in[0]
x_curr = vec2(x_i[i], x_j[i])
y_curr = y[i]
z_dot = (w_curr.x * x_curr.x) + (w_curr.y + x_curr.y)
z = z_dot + b_curr
y_hat = 1.0 / (1.0 + exp(-z))
d_z = y_hat - y_curr
d_w = (1.0 / m) * x_curr * d_z
d_b = (1.0 / m) * d_z
loss = -(y_curr * log(y_hat) + ((1.0 + y_curr) * log(1.0 - y_hat)))
w_out_i[i] = d_w.x
w_out_j[i] = d_w.y
b_out[i] = d_b
l_out[i] = loss
# First we create input and output tensors for the shader
tensor_x_i = Tensor([0.0, 1.0, 1.0, 1.0, 1.0])
tensor_x_j = Tensor([0.0, 0.0, 0.0, 1.0, 1.0])
tensor_y = Tensor([0.0, 0.0, 0.0, 1.0, 1.0])
tensor_w_in = Tensor([0.001, 0.001])
tensor_w_out_i = Tensor([0.0, 0.0, 0.0, 0.0, 0.0])
tensor_w_out_j = Tensor([0.0, 0.0, 0.0, 0.0, 0.0])
tensor_b_in = Tensor([0.0])
tensor_b_out = Tensor([0.0, 0.0, 0.0, 0.0, 0.0])
tensor_l_out = Tensor([0.0, 0.0, 0.0, 0.0, 0.0])
tensor_m = Tensor([ 5.0 ])
# We store them in an array for easier interaction
params = [tensor_x_i, tensor_x_j, tensor_y, tensor_w_in, tensor_w_out_i,
tensor_w_out_j, tensor_b_in, tensor_b_out, tensor_l_out, tensor_m]
mgr = Manager()
# Initialise GPU buffers and memory
mgr.eval_tensor_create_def(params)
ITERATIONS = 100
learning_rate = 0.1
# Perform machine learning training and inference across all input X and Y
for i_iter in range(ITERATIONS):
    mgr.eval_tensor_sync_device_def([tensor_w_in, tensor_b_in])
    mgr.eval_algo_data_def(params, compute_shader.to_spirv())
    mgr.eval_tensor_sync_local_def([tensor_w_out_i, tensor_w_out_j, tensor_b_out, tensor_l_out])
    # Calculate the parameters based on the respective derivatives calculated
    w_in_i_val = tensor_w_in.data()[0]
    w_in_j_val = tensor_w_in.data()[1]
    b_in_val = tensor_b_in.data()[0]
    for j_iter in range(tensor_b_out.size()):
        w_in_i_val -= learning_rate * tensor_w_out_i.data()[j_iter]
        w_in_j_val -= learning_rate * tensor_w_out_j.data()[j_iter]
        b_in_val -= learning_rate * tensor_b_out.data()[j_iter]
    # Update the parameters to process inference again
    tensor_w_in.set_data([w_in_i_val, w_in_j_val])
    tensor_b_in.set_data([b_in_val])
# Print learned parameters of logistic regression algorithm
print(tensor_w_in.data())
print(tensor_b_in.data())
print(tensor_l_out.data())
In this implementation all functions were removed and their bodies inlined, and the specialisation constant was changed into a simple input parameter.
For completeness, I can confirm that if I run the same GLSL shader outlined above through the Python script, I get the same results as with the C++.
To run the script above, the only change required to the GLSL shader is to add an extra binding to account for the variable M, which is otherwise provided as a specialization constant, i.e.:
//...
layout(set = 0, binding = 9) buffer bMb { float Mb[]; };
float m = Mb[0];
//...
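On the Python side this corresponds to the tensor_m parameter already shown in the script above, which ends up at binding 9 (assuming, as in the C++ example, that tensors are bound in the order they appear in params):

tensor_m = Tensor([ 5.0 ])  # number of samples, bound as set 0, binding 9
params = [tensor_x_i, tensor_x_j, tensor_y, tensor_w_in, tensor_w_out_i,
          tensor_w_out_j, tensor_b_in, tensor_b_out, tensor_l_out, tensor_m]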
The SPIR-V contents of this GLSL shader result in the same output as the C++, so it seems there is either something wrong with the way I implemented the PyShader version, or perhaps the instructions emitted for exp, log or other components are not correct.
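A small CPU reference of the same math could help narrow this down (a rough NumPy sketch; the function and the comparison are illustrative and not part of the script above). Note that the standard binary cross-entropy uses (1.0 - y) in the second term, which may be worth double-checking against the (1.0 + y_curr) in the shader above:

import numpy as np

def cpu_reference(x_i, x_j, y, w, b, m):
    # Mirrors the shader math on the CPU, using a proper dot product
    x = np.stack([x_i, x_j], axis=1)
    z = x @ np.asarray(w) + b
    y_hat = 1.0 / (1.0 + np.exp(-z))
    d_z = y_hat - y
    d_w = (1.0 / m) * x * d_z[:, None]
    d_b = (1.0 / m) * d_z
    # Standard binary cross-entropy
    loss = -(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))
    return d_w, d_b, loss

# Example check against the GPU outputs after one dispatch, e.g.:
# d_w, d_b, loss = cpu_reference(np.array(tensor_x_i.data()), np.array(tensor_x_j.data()),
#                                np.array(tensor_y.data()), tensor_w_in.data(),
#                                tensor_b_in.data()[0], tensor_m.data()[0])
# print(np.allclose(d_b, tensor_b_out.data()), np.allclose(loss, tensor_l_out.data()))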
Ok, I have found the issue: there is a function provided to calculate the dot product, exposed through the @ operator. The manual implementation of the dot product in z_dot = (w_curr.x * x_curr.x) + (w_curr.y + x_curr.y) was incorrect (the second term adds instead of multiplying), and re-implementing it as z_dot = w_curr @ x_curr fixed the issue.
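In other words, either of the following gives the intended result inside the shader (the first being the corrected manual form, the second the stdlib dot product):

    # Corrected manual dot product (the second term multiplies rather than adds)
    z_dot = (w_curr.x * x_curr.x) + (w_curr.y * x_curr.y)
    # Equivalent, using the built-in dot product via the @ operator
    z_dot = w_curr @ x_curr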
Now both the C++ and the PyShader-based Python implementations produce the same output for the machine learning logistic regression example.
I've changed the focus of this issue towards documenting the math / standard library functions, as from this it seems they work as required.
Nice to see your progress like this ;)
Good point about the documentation. I think we finally need some proper docs on rtd.
Edit: Given this has been fixed and works correctly with the stdlib functions, the issue can now focus mainly on documenting these functions, as they are currently not documented.
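As a starting point, this is the sort of minimal usage example such docs could include (a rough sketch based only on what was exercised in this thread, i.e. exp, log and the @ dot product; any other stdlib functions would need to be checked against the pyshader source):

from pyshader import python2shader, f32, ivec3, Array, vec2
from pyshader.stdlib import exp, log

@python2shader
def stdlib_example(
    index = ("input", "GlobalInvocationId", ivec3),
    a = ("buffer", 0, Array(f32)),
    b = ("buffer", 1, Array(f32))):
    i = index.x
    v = vec2(a[i], a[i])
    # exp / log come from pyshader.stdlib, @ maps to the dot product
    b[i] = log(exp(a[i])) + (v @ v)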
I am currently trying to implement the machine learning logistic regression algorithm below (as explained in this blog post). Most of the shader can be restructured to work around some of the limitations (such as removing the functions), but the current blocker seems to be that the log(...) function is not supported. It would be quite useful if built-in shader functions could be used in the pyshader function. The GLSL shader being implemented is below:
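As it turns out in the comments above, exp and log are available from the pyshader stdlib module, which is what the Python script earlier in this thread imports:

from pyshader.stdlib import exp, log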