Closed ArchieGu closed 6 years ago
Is this the code I provided or did you make changes? Please be clear in the question as there are other people reading these threads.
EDIT: I think you are replacing the parameters I provided with your own parameters extracted from Pytorch. There is probably an issue with your weights, but without more details I can't debug it. I'm not sure what you mean by "changed the input value range [-1, 1]" because the CIFAR10 images I provide should already be in the range [-1, 1].
@rzhao01 Hi, sorry for bothering you again.
I can't pinpoint the issue but I would guess the problem is with changing the image range. How did you implement that change?
I would recommend just taking the data array on this line, and applying (x+1)/2 scaling to each element.
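For reference, a minimal NumPy sketch of the two conversions discussed in this thread, assuming a float image batch (shapes and values are illustrative):

```python
import numpy as np

# Hypothetical batch of CIFAR10-shaped images already in [-1, 1]
x = np.random.uniform(-1.0, 1.0, size=(4, 3, 32, 32)).astype(np.float32)

# Rescale from [-1, 1] to [0, 1]
x01 = (x + 1.0) / 2.0

# Inverse: rescale from [0, 1] back to [-1, 1]
x_back = 2.0 * x01 - 1.0

assert x01.min() >= 0.0 and x01.max() <= 1.0
assert np.allclose(x, x_back, atol=1e-6)
```

Either direction works; the point is only that the training range and the on-board test range must agree.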
The method I use is the one from the PyTorch tutorial.
Oh you made changes in your PyTorch script. I was confused and thought you changed the input range in my provided code.
I can't really help you debug the PyTorch training script, I'm not familiar with PyTorch and it's not my code to begin with. I can help you with my C++ code if you tell me exactly what changes you have made.
OK, you've been a great help, thank you so much~
@rzhao01
Thanks for your answer.
thank you.
2*x - 1.

In ArchieGu's case, he said he trained on [0,1] images and his test inputs are in the same range, so this should not be an issue. I'm not sure what is wrong; I think there is another bug somewhere.
@Sun-xiaohui I think what @rzhao01 means is: if your input is in the range [-1,1], nothing needs to be changed. If your input is in the range [0,1], either rescale it into [-1,1] and continue with the original code, or use 2*x - 1 to modify the FPGA code so it reads input in the range [0,1]. After this you can test the code with your parameters.
@ArchieGu Yes, whether you choose to add the processing during training or in the code running on the board, the purpose of the conversion formula is to make the data range during training and the data range during testing on the board the same.
@rzhao01 Hello, I trained for 500 epochs using the Theano version of the code provided by Matth, and saved the corresponding parameters to the file at the code's save path.
np.savez(save_path, *lasagne.layers.get_all_param_values(model))
However, after replacing the parameter file in params on the board with my trained parameter file, the error rate of the test result always fluctuates between 50% and 60%. I tried many training results and could not reproduce the low error rate seen during training. Is there any detail that needs attention?
Did you make the necessary changes to the parameters? For the accelerator we make two changes: (1) remove biases, and (2) transform the batch norm parameters.
@rzhao01 Thanks for your reply. Yes, I reconfirmed the code: for the bias I added the parameter b=None. The save path contains 45 files in total; each group of 5 files corresponds to w, beta, gamma, mean, inv_std. The batch norm parameter files are processed as follows:
beta = np.load("./theano/arr_1.npy")
gamma = np.load("./theano/arr_2.npy")
mean = np.load("./theano/arr_3.npy")
inv_std = np.load("./theano/arr_4.npy")
k = gamma / inv_std
h = beta - mean * gamma / inv_std
I take all the files to get w, k, h, and save them in order as the corresponding 27 files. Is there anything wrong with these two steps?
Thanks.
If you are using Lasagne, then inv_std is the reciprocal of the standard deviation, so you should be multiplying by it instead of dividing.
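A self-contained sketch of the corrected transformation with synthetic parameter values (the real values would come from the saved .npy files as in the snippet above):

```python
import numpy as np

# Synthetic batch norm parameters for one layer (illustrative values)
gamma = np.array([1.5, 0.8], dtype=np.float32)
beta = np.array([0.1, -0.2], dtype=np.float32)
mean = np.array([0.3, 0.5], dtype=np.float32)
inv_std = np.array([2.0, 4.0], dtype=np.float32)  # already 1/std in Lasagne

# Multiply by inv_std (it is the reciprocal of std); do not divide by it
k = gamma * inv_std
h = beta - mean * gamma * inv_std

# Sanity check: k*x + h must equal the standard batch norm formula
# gamma * (x - mean) * inv_std + beta
x = np.array([0.7, -0.1], dtype=np.float32)
assert np.allclose(k * x + h, gamma * (x - mean) * inv_std + beta)
```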
You can test whether your k/h calculation works in Python. Simply write a new batch norm layer with k and h parameters, and have the layer return input * k + h. Then you can test your modified parameters.npy before importing it into C++.
@rzhao01 Thanks for your reply. Yes, it should be multiplied by the inv_std variable. Why can't the k and h extracted from the batch norm be used directly, without also running a test? The parameter variables I obtain have the same dimensions as the data in the params you provide. Is the data processing still different somehow?
The batch norm code corresponding to k and h is as follows.
class MyBatchNorm(lasagne.layers.BatchNormLayer):
    def __init__(self, incoming, k, h, axes='auto', **kwargs):
        super(lasagne.layers.BatchNormLayer, self).__init__(incoming, **kwargs)
        if axes == 'auto':
            axes = (0,) + tuple(range(2, len(self.input_shape)))
        elif isinstance(axes, int):
            axes = (axes,)
        self.axes = axes
        self.k = k
        self.h = h

    def get_output_for(self, input, deterministic=False, **kwargs):
        # Build a dimshuffle pattern that broadcasts the 1-D k and h over
        # every axis in self.axes (e.g. ['x', 0, 'x', 'x'] for NCHW input)
        param_axes = iter(range(input.ndim - len(self.axes)))
        pattern = ['x' if input_axis in self.axes
                   else next(param_axes)
                   for input_axis in range(input.ndim)]
        k = self.k.dimshuffle(pattern)
        h = self.h.dimshuffle(pattern)
        # Apply the folded batch norm transform
        return input * k + h
The input here is a four-dimensional array [num, channels, rows, columns], but k and h are one-dimensional, just like the gamma and beta parameters. In Lasagne's official source code, gamma and beta can be substituted into the calculation, but substituting k and h directly causes a dimension-mismatch problem, which confuses me. What is the role of this batch norm verification? Thanks.
I don't quite understand your question - you'll have to explain what "data participation" means and what dimensions are mismatched.
Batch norm parameters are supposed to be 1-dimensional: each output feature map has its own k and h, so k and h should be arrays whose length equals the number of output channels.
I suggested verifying the batch norm because you mentioned testing on the board - I wanted to make sure you tested the parameters in the Python script before importing them to FPGA.
@rzhao01 Thank you, and I'm sorry the question was confusingly worded. By "data participation" I meant the k and h parameters. The dimension mismatch is that when calculating input * k + h directly, input is a 4-dimensional array while k and h are 1-dimensional, so input * k fails. But after these two transformations
k = self.k.dimshuffle(pattern)
h = self.h.dimshuffle(pattern)
the dimension mismatch problem is gone.
Thank you for your continued support; we would like to express our sincere thanks to you.
Figure 1: our own training results. Figure 2: your original data.
Thanks.
No problem. So you made sure k and h worked in Python? Are you still having an issue on the FPGA board?
Yes, the converted data works both in Python and on the FPGA board. For the C++ code you provided, after make there are two versions of the executable, cpu and fpga. But if I want to generate a program that runs directly on the PS side of the FPGA chip (on the ARM core), without hardware acceleration, what do I need to do? Thanks.
Good to hear!
You just need a software (C++) implementation of the BNN: instead of marking top with the HLS pragmas in Accel.h, just write normal C++. In fact, the master branch has some envvar switches to run the dense layers in software: here
OK, you've been a great help, thank you so much~
Closing due to lack of activity. All questions seem to have been answered.
As you can see, the code only predicts label 7, and we aren't quite sure where the problem is.
We use the parameters extracted from PyTorch, and we also changed the input value range to [-1,1] as the Theano version does.