GiorgosXou / NeuralNetworks

A resource-conscious neural network implementation for MCUs
MIT License

Multiple bias per layer #11

Closed expeon07 closed 3 months ago

expeon07 commented 2 years ago

Hi, I'm trying to use your library for TinyML on the Arduino Uno. I have a pre-trained autoencoder model with multiple biases per layer. Do you have an example of how to use such a model with the library? I only see examples with one bias per layer.

Thank you!

GiorgosXou commented 2 years ago

At the moment, the library doesn't support this feature, but it seems fairly easy to implement, so I will probably add it.

(To be honest with you, I didn't know that neural networks with multiple biases per layer existed.)

I'm mainly self-taught on this subject, and it usually takes me quite a while to fully grasp and carefully implement some ideas, so if you could provide me with some links or insights about the subject, that would be great!

In conclusion (just out of curiosity), my question is: are multiple biases per layer really needed in your model? And if so, why?

expeon07 commented 2 years ago

Hi, thanks for the quick response. I think it's common to have a bias vector per layer, i.e. one (float) value per neuron in the layer. I was checking out the implementation below as well; I extracted my weights and biases in the same way, and it indeed output one bias per neuron on each layer (see the sketch after the link).

https://github.com/hollance/TinyML-HelloWorld-ArduinoUno/blob/master/train_hello_world_model.ipynb
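
For reference, a minimal sketch (not taken from the notebook above, just an illustration with made-up layer sizes) of what that extraction looks like in Keras — `get_weights()` returns one bias value per neuron for each Dense layer:

```python
import tensorflow as tf

# Hypothetical small model, just to illustrate the shapes involved
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation='relu'),
    tf.keras.layers.Dense(2, activation='sigmoid'),
])

for i, layer in enumerate(model.layers):
    params = layer.get_weights()
    if not params:
        continue  # skip layers without trainable weights (e.g. InputLayer)
    weights, biases = params  # weights: (inputs, units), biases: (units,)
    print(f"Layer {i}: weights {weights.shape}, biases {biases.shape}")
    # e.g. "Layer 0: weights (4, 3), biases (3,)" -> one bias per neuron
```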

GiorgosXou commented 2 years ago

(sorry for this late reply, my phone screen just broke and I had to deal with that issue first)

I see... I'll do my best to get a new version out as soon as possible.

pjurczen commented 4 months ago

I've just trained a neural network using TensorFlow in Python, and separate bias terms for each node are pretty much the standard behaviour there. I'd love to see this feature added! Without it, it isn't really possible to use pre-trained networks from popular libraries like PyTorch or TensorFlow without writing custom code to force a single bias during training.

Here's an example of how many parameters TensorFlow calculates for a 1 x 5 x 3 (3 layers) network:


```
Layer (type)    Output Shape    Param #
=======================================
L1 (Dense)      (None, 1)       2
L2 (Dense)      (None, 5)       10
L3 (Dense)      (None, 3)       18
=======================================
Total params: 30
Trainable params: 30
Non-trainable params: 0
```
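
Those counts follow directly from the one-bias-per-neuron convention: each Dense layer contributes inputs × units weights plus units biases. A quick sketch of the arithmetic for the same 1 → 1 → 5 → 3 shapes:

```python
# Per-layer parameter count for Dense layers: inputs * units + units (one bias per neuron)
layer_sizes = [1, 1, 5, 3]  # input dimension followed by the three Dense layers above

total = 0
for inputs, units in zip(layer_sizes, layer_sizes[1:]):
    params = inputs * units + units
    total += params
    print(f"{inputs} -> {units}: {params} params")
print("Total:", total)  # 2 + 10 + 18 = 30
```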

GiorgosXou commented 3 months ago

@pjurczen Sorry for my late reply. I am planning to implement both multi-bias and non-bias layers. Until I do so, you might be interested in playing with something like this instead:

Tensorflow example:

```python
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import LearningRateScheduler
import tensorflow as tf
import numpy as np

def lr_schedule(epoch, lr):
    if epoch < 2000:
        return 0.01
    elif epoch < 4000:
        return 0.001
    elif epoch < 7000:
        return 0.0001
    else:
        return 0.00001

tf.keras.backend.set_floatx('float32')

# Define the XOR gate inputs and outputs
input_size = 3
inputs = np.array([[0, 0, 0],
                   [0, 0, 1],
                   [0, 1, 0],
                   [0, 1, 1],
                   [1, 0, 0],
                   [1, 0, 1],
                   [1, 1, 0],
                   [1, 1, 1]], dtype=np.float32)
outputs = np.array([[0], [1], [1], [0], [1], [0], [0], [1]], dtype=np.float32)

# Create a simple fully-connected neural network
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(input_size,)),                      # Input layer (no bias)
    tf.keras.layers.Dense(5, activation='sigmoid', use_bias=False),  # Hidden layer with 5 units and sigmoid activation
    tf.keras.layers.Dense(1, activation='sigmoid', use_bias=False)   # Output layer with 1 unit and sigmoid activation (binary classification)
])

# Compile the model
optimizer = Adam(learning_rate=0.01)
model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])

# Train the model
lr_callback = LearningRateScheduler(lr_schedule)
model.fit(inputs, outputs, epochs=9000, callbacks=[lr_callback], verbose=0)  # proof of concept :P

# Evaluate the model on the training data
loss, accuracy = model.evaluate(inputs, outputs)
print(f"Model accuracy: {accuracy * 100:.2f}%")

# Predict XOR gate outputs
predictions = model.predict(inputs)
print("Predictions:")
for i in range(len(inputs)):
    print(f"Input: {inputs[i]}, Predicted Output: {predictions[i][0]:.7f}")

# Print weights
print()
all_layers_units = [input_size] + [layer.units for layer in model.layers]  # input_size, because `keras.Sequential` merges the Input layer into the first Dense layer
for l in range(0, len(all_layers_units) - 1):
    input_units  = all_layers_units[l]
    output_units = all_layers_units[l + 1]
    weights = model.layers[l].get_weights()[0]
    print(f"// LAYER: {l} -> {l+1}:")
    for j in range(0, output_units):
        for i in range(0, input_units):
            print(f"{weights[i][j]:.7f}", end=', ')
        print()
    print()
```

GiorgosXou commented 3 months ago

btw, a fun realization I had... TensorFlow by default returns the weights of each layer as a 2D matrix indexed i*j (input × output) instead of j*i:

I chose the reverse layout for optimization reasons.
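
So weights exported from Keras would presumably need to be transposed into that j*i layout first; a minimal numpy sketch with a made-up kernel:

```python
import numpy as np

# Hypothetical kernel as Keras would return it: shape (inputs, units), i.e. weights[i][j]
kernel = np.arange(6, dtype=np.float32).reshape(2, 3)  # 2 inputs, 3 neurons

# Transpose to the j*i layout (one row per neuron), then flatten row by row
kernel_ji = kernel.T  # shape (3, 2)
print(', '.join(f"{w:.7f}" for w in kernel_ji.flatten()))
```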

GiorgosXou commented 3 months ago

"Life is suffering"

pjurczen commented 3 months ago

> @pjurczen Sorry for my late reply. I am planning to implement both multi-bias and non-bias layers. Until I do so, you might be interested in playing with something like this instead: (Tensorflow example above)

Hey, thanks for the reply! I ended up implementing the feed-forward pass myself for the small network that I was using.
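
For anyone landing here with the same problem, a feed-forward pass with one bias per neuron is only a few lines of numpy. A rough sketch (not pjurczen's actual code), assuming sigmoid activations and Keras-style weight/bias shapes:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, layers):
    """layers is a list of (weights, biases) per Dense layer,
    with weights shaped (inputs, units) and biases shaped (units,),
    exactly as Keras' get_weights() returns them."""
    a = x
    for weights, biases in layers:
        a = sigmoid(a @ weights + biases)  # one bias added per neuron
    return a

# Tiny made-up example: 2 -> 3 -> 1 network
layers = [
    (np.ones((2, 3)), np.zeros(3)),
    (np.ones((3, 1)), np.zeros(1)),
]
print(forward(np.array([0.5, -0.5]), layers))
```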