Multiclasstraining on PC

konegen commented 2 years ago

Hi, I want to train a NN for the MNIST dataset on the PC. How can I implement training of a NN with multiple output neurons? When I edit the example (link) and change the output layer to 10 neurons, inference of the model stops with an error (Process returned -1073741819 (0xC0000005) execution time : 0.485 s). Are there any changes I need to be especially aware of?

PierreGembaczka commented 2 years ago

Hi @konegen,

is it possible that you have not adjusted the output tensor and the target tensor? Just increasing the number of neurons in the output layer is not enough.

Here is an example with two outputs: https://github.com/Fraunhofer-IMS/AIfES_for_Arduino/blob/main/examples/0_Universal/0_XOR/2_XOR_training_2_outputs/2_XOR_training_2_outputs.ino

Check out our new tutorial, it explains the tensors in detail. A tutorial on training will also be coming soon. https://create.arduino.cc/projecthub/aifes_team/aifes-inference-tutorial-f44d96?ref=user&ref_id=1924948&offset=0

I guess you want to use a softmax in the output, right? There is also an example here: https://github.com/Fraunhofer-IMS/AIfES_for_Arduino/blob/main/examples/1_Nano_BLE_Sense/0_Color_detection/creation_and_training.ino

For softmax please use the crossentropy as loss.

Here is an example of how it could look. If you have more than two inputs, you have to adjust the input tensor as well.

    float target_data[4][10] = {
        {1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f},
        {0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f},
        {1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f},
        {1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f}
    };
    uint16_t target_shape[] = {4, 10};
    aitensor_t target_tensor;
    target_tensor.dtype = aif32;
    target_tensor.dim = 2;
    target_tensor.shape = target_shape;
    target_tensor.data = target_data;

    float output_data[4][10];
    uint16_t output_shape[] = {4, 10};
    aitensor_t output_tensor;
    output_tensor.dtype = aif32;
    output_tensor.dim = 2;
    output_tensor.shape = output_shape;
    output_tensor.data = output_data;

And here is a code snippet showing how the softmax should look. It's from the PC tutorial:

    // Output dense layer
    ailayer_dense_t dense_layer_2;
    dense_layer_2.neurons = OUTPUTS;

    ailayer_softmax_t output_layer_activation_softmax;

    ailoss_crossentropy_t crossentropy_loss;

    // ----- Define the structure of the model ---------
    aimodel_t model;
    ailayer_t *x;

    // Passing the layers to the AIfES model
    model.input_layer = ailayer_input_f32_default(&input_layer);
    x = ailayer_dense_f32_default(&dense_layer_1, model.input_layer);
    x = ailayer_sigmoid_f32_default(&sigmoid_layer_1, x);
    x = ailayer_dense_f32_default(&dense_layer_2, x);
    x = ailayer_softmax_f32_default(&output_layer_activation_softmax, x);
    model.output_layer = x;

    // Add the loss to the AIfES model
    model.loss = ailoss_crossentropy_f32_default(&crossentropy_loss, model.output_layer);

    aialgo_compile_model(&model); // Compile the AIfES model

Many greetings
Pierre

konegen commented 2 years ago

Hi @AIfES-Pierre, thank you for your quick response. Your suggestion helped me find out that I had not defined the output tensor correctly. I had sized the output tensor only for the number of output neurons and not for the number of training samples.
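The sizing issue described here can be sketched without AIfES itself. A 2D tensor of shape {DATASETS, OUTPUTS} needs a buffer with DATASETS * OUTPUTS floats, one row per training sample and one column per output neuron. A buffer sized for the output neurons alone would be too small, which is at least consistent with the 0xC0000005 (access violation) error reported at the start. The helpers `tensor_elements` and `tensor_index` are hypothetical names for illustration, not part of the AIfES API, and the value of DATASETS is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DATASETS 4   /* number of training samples (assumed value) */
#define OUTPUTS  10  /* number of output neurons */

/* Hypothetical helper: number of elements a tensor with this shape needs. */
static size_t tensor_elements(const uint16_t *shape, uint8_t dim)
{
    size_t count = 1;
    for (uint8_t i = 0; i < dim; i++) {
        count *= shape[i];
    }
    return count;
}

/* Hypothetical helper: flat row-major index of element (sample, neuron). */
static size_t tensor_index(size_t sample, size_t neuron)
{
    assert(sample < DATASETS && neuron < OUTPUTS);
    return sample * OUTPUTS + neuron;
}

/* One row per training sample, one column per output neuron. */
static float output_data[DATASETS][OUTPUTS];
static const uint16_t output_shape[] = {DATASETS, OUTPUTS};
```

Checking that `tensor_elements(output_shape, 2)` matches the number of floats in `output_data` before training is a cheap way to catch this kind of mismatch early.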

PierreGembaczka commented 2 years ago

Hi @konegen ,

please let me know if it worked.

It would also be great if you would share the results with us. 😀

konegen commented 2 years ago

Hi @AIfES-Pierre, I am trying to train an MNIST model (10000 training samples) with AIfES on the PC. The training now runs through. However, the loss is over 10 million. What could be the reason that the loss is so high?

PierreGembaczka commented 2 years ago

Hi @konegen,

there must be a mistake somewhere. Can you please post your AIfES network configuration?

Do you use softmax as activation function?

konegen commented 2 years ago

I just used the code of the XOR example and adjusted it. The structure of the model looks as follows (if that's what you meant):

    uint16_t input_layer_shape[] = {1, 784};          // MNIST images of shape 28x28, flattened
    ailayer_input_t input_layer;                   
    input_layer.input_dim = 2;                   
    input_layer.input_shape = input_layer_shape;  

    // Dense layer (hidden layer)
    ailayer_dense_t dense_layer_1;               
    dense_layer_1.neurons = 10;                    
    ailayer_sigmoid_t sigmoid_layer_1;             

    // Output dense layer
    ailayer_dense_t dense_layer_2;                  
    dense_layer_2.neurons = 10;                      
    ailayer_softmax_t softmax_layer_2;            

    ailoss_crossentropy_t crossentropy_loss;    

    // --------------------------- Define the structure of the model ----------------------------

    aimodel_t model;
    ailayer_t *x;

    // Passing the layers to the AIfES model
    model.input_layer = ailayer_input_f32_default(&input_layer);
    x = ailayer_dense_f32_default(&dense_layer_1, model.input_layer);
    x = ailayer_sigmoid_f32_default(&sigmoid_layer_1, x);
    x = ailayer_dense_f32_default(&dense_layer_2, x);
    x = ailayer_softmax_f32_default(&softmax_layer_2, x);
    model.output_layer = x;

    // Add the loss to the AIfES model
    model.loss = ailoss_crossentropy_f32_default(&crossentropy_loss, model.output_layer);

    aialgo_compile_model(&model); // Compile the AIfES model

Thanks, Daniel

PierreGembaczka commented 2 years ago

Hi @konegen ,

your code looks good, that should fit. And the loss is also correct. 😀

AIfES does not apply any reduction / normalization to the loss. It is so high because you have so much training data.

In AIfES, neither the MSE nor the crossentropy loss is processed further. We will probably change this in the next update, because most people know the loss with reduction from e.g. Keras.

Here is a link to the Keras documentation where the reduction is explained. It is automatically activated, but can also be deactivated: https://www.tensorflow.org/api_docs/python/tf/keras/losses/Reduction

In AIfES you can easily add it. In the for loop before the loss is printed, you add the following line:

    // DATASETS is your number of training sets and OUTPUTS is the number of output neurons
    loss = loss / (OUTPUTS * DATASETS);
    printf("%f\n", loss);
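To illustrate why an unreduced loss grows with the amount of training data, here is a minimal sketch that uses a summed squared error as a stand-in. It is not AIfES code: the function names `summed_squared_error` and `reduce_loss` are hypothetical. The same argument applies to a summed crossentropy.

```c
#include <assert.h>
#include <stddef.h>

/* Summed squared error over all samples and outputs, with no reduction.
   Illustrative stand-in for an unreduced loss; not taken from AIfES. */
static float summed_squared_error(const float *predicted, const float *target,
                                  size_t datasets, size_t outputs)
{
    float loss = 0.0f;
    for (size_t i = 0; i < datasets * outputs; i++) {
        float diff = predicted[i] - target[i];
        loss += diff * diff;
    }
    return loss;
}

/* Reduction as suggested in the thread: divide by OUTPUTS * DATASETS. */
static float reduce_loss(float loss, size_t datasets, size_t outputs)
{
    assert(datasets > 0 && outputs > 0);
    return loss / (float)(outputs * datasets);
}
```

With the same per-element error, doubling the number of samples doubles the summed loss, while the reduced value stays the same. That makes the reduced loss comparable across dataset sizes.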

So how do you read in the MNIST database? I tested it today and found this: https://github.com/takafumihoriuchi/MNIST_for_C

Works great, but you have to convert the image data to float because it is stored as double. And the labels still need to be adjusted. If I manage it, I will upload an example in the next few days.
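The two conversions mentioned here could look roughly like the sketch below. It assumes the images are available as a flat array of double and the labels as plain int values from 0 to 9. It also assumes that "adjusting the labels" means one-hot encoding them for the crossentropy loss, which the thread does not spell out. The function names are hypothetical and do not come from MNIST_for_C or AIfES.

```c
#include <assert.h>
#include <stddef.h>

#define OUTPUTS 10  /* one class per digit 0-9 */

/* Copy image data stored as double into a float buffer. */
static void convert_images_to_float(const double *src, float *dst, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        dst[i] = (float)src[i];
    }
}

/* Turn integer labels (0..9) into one-hot rows of length OUTPUTS.
   dst must hold datasets * OUTPUTS floats. */
static void labels_to_one_hot(const int *labels, float *dst, size_t datasets)
{
    for (size_t i = 0; i < datasets; i++) {
        assert(labels[i] >= 0 && labels[i] < OUTPUTS);
        for (size_t j = 0; j < OUTPUTS; j++) {
            dst[i * OUTPUTS + j] = ((size_t)labels[i] == j) ? 1.0f : 0.0f;
        }
    }
}
```

The one-hot buffer then has the same {DATASETS, OUTPUTS} layout as the target tensor shown earlier in the thread.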

Pierre

konegen commented 2 years ago

Hi @AIfES-Pierre,

Thanks for the hint. I read in the MNIST dataset in a Python script using the tensorflow.keras.datasets.mnist module. Then I flatten the data into a 1D array and write it to an .h file, which I then include in the C program.

Thanks, Daniel

PierreGembaczka commented 2 years ago

Hi @konegen,

another short explanation of why we do not process the loss further. A division would have to be carried out each time, which is very computationally intensive on microcontrollers. Each framework has its own technique.

There are often discussions about this topic. Here for example about the MSE: https://stats.stackexchange.com/questions/313070/mse-formula-in-neural-network-applications

konegen commented 2 years ago

Hi @AIfES-Pierre,

I have created three example scenarios for the MNIST dataset for AIfES on the PC:

1. MNIST dataset is passed and the net is trained and tested in AIfES from scratch.
2. The net is trained in Keras, weights are loaded and the net is executed.
3. The net is trained in Keras, weights are loaded and retrained and tested again in AIfES.

I created a new branch in my forked repository, uploaded these three examples and made a pull request. If you want you can look at the examples and include them.

Many greetings Daniel

PierreGembaczka commented 2 years ago

Hi @konegen ,

thanks for the great example. 😀

We have decided internally that this repository will remain exclusive to Arduino, because it will be installed via the Arduino library manager. Compatibility checks are done for this and code for PC can be problematic here.

But we will create an official AIfES example repository. 😀 Examples for different platforms can be collected there. We would be happy if you uploaded your example there.

Fraunhofer-IMS / AIfES_for_Arduino

Multiclasstraining on PC #6