GuitarML / GuitarLSTM

Deep learning models for guitar amp/pedal emulation using LSTM with Keras.
https://guitarml.com/
GNU General Public License v3.0

Could you please post some sound samples? #1

Closed · Tylersuard closed this issue 3 years ago

Tylersuard commented 3 years ago

Hello! I'm interested in this project, but I want to know how it sounds before I commit to downloading it. Do you have any samples or a youtube video you could post in the readme?

mishushakov commented 3 years ago

Hey Tyler, sounds f*cking amazing

(attached plot: out_model_name_high_signal_comparison_e2s_0.0132)

in the archive you'll find the samples

note that this one took less than a minute (but lots of memory) to train; the WaveNet model took hours and the loss was 10x as high

Archive.zip

GuitarML commented 3 years ago

@mishushakov I know! It's pretty incredible; I couldn't believe it the first time I got it running. I think this is close to what Neural DSP is using in their Quad Cortex for neural capture.

Thanks for sharing the samples, I'll probably make a YouTube video soon as well.

mishushakov commented 3 years ago

i've been wondering whether they actually train on-device; that would be pretty incredible

i know Apple does, using a dedicated chip called the Apple Neural Engine: https://venturebeat.com/2019/06/03/apple-debuts-core-ml-3-with-on-device-machine-learning/

38github commented 3 years ago

I have been training during the night and I find it amazing. I am still tinkering with the parameters though.

Would it be possible to write a log after training, so that we can see what (at least) each split got in terms of loss: x.xxxx - error_to_signal: x.xxxx - val_loss: x.xxxx - val_error_to_signal: x.xxxx? That would make it easier to compare runs.
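
Maybe Keras's built-in CSVLogger callback could do it. A rough sketch (the model and data below are placeholders, not this repo's actual training code):

```python
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.callbacks import CSVLogger

# Placeholder model/data just to demonstrate the callback;
# not GuitarLSTM's actual architecture or data pipeline.
model = Sequential([LSTM(36, input_shape=(150, 1)), Dense(1)])
model.compile(optimizer="adam", loss="mse")
x = np.random.rand(512, 150, 1).astype("float32")
y = np.random.rand(512, 1).astype("float32")

# append=True keeps adding rows, so every split's epochs end up in one
# CSV with columns like epoch, loss, val_loss (plus any custom metrics)
logger = CSVLogger("training_log.csv", append=True)
model.fit(x, y, epochs=10, validation_split=0.2, callbacks=[logger])
```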

Really looking forward to all the achievements with this plugin and application!

mishushakov commented 3 years ago

@38github try to pipe the output into a file https://askubuntu.com/a/731237

i'd love to see what you'll do with it too!

38github commented 3 years ago

> @38github try to pipe the output into a file https://askubuntu.com/a/731237
>
> i'd love to see what you'll do with it too!

Thank you! I will see what I can do with it, or some other solution, as I am currently using it on a Windows computer. I will set up the environment / download Miniconda3 on my Linux computer too and use it there.

38github commented 3 years ago

I noticed one interesting thing when switching between --training_mode=1 and --training_mode=2: one split that mode 2 worked quite well on, mode 1 failed miserably on:

--training_mode=2

Training on split data 3/10
Epoch 1/10
109/109 [==============================] - 83s 762ms/step - loss: 0.1909 - error_to_signal: 0.1908 - val_loss: 0.0826 - val_error_to_signal: 0.0825
Epoch 2/10
109/109 [==============================] - 86s 794ms/step - loss: 0.0675 - error_to_signal: 0.0675 - val_loss: 0.0572 - val_error_to_signal: 0.0572
Epoch 3/10
109/109 [==============================] - 85s 780ms/step - loss: 0.0520 - error_to_signal: 0.0520 - val_loss: 0.0481 - val_error_to_signal: 0.0481
Epoch 4/10
109/109 [==============================] - 82s 750ms/step - loss: 0.0451 - error_to_signal: 0.0451 - val_loss: 0.0424 - val_error_to_signal: 0.0424
Epoch 5/10
109/109 [==============================] - 82s 755ms/step - loss: 0.0407 - error_to_signal: 0.0407 - val_loss: 0.0393 - val_error_to_signal: 0.0393
Epoch 6/10
109/109 [==============================] - 85s 783ms/step - loss: 0.0379 - error_to_signal: 0.0379 - val_loss: 0.0366 - val_error_to_signal: 0.0366
Epoch 7/10
109/109 [==============================] - 82s 751ms/step - loss: 0.0356 - error_to_signal: 0.0356 - val_loss: 0.0347 - val_error_to_signal: 0.0348
Epoch 8/10
109/109 [==============================] - 82s 751ms/step - loss: 0.0339 - error_to_signal: 0.0338 - val_loss: 0.0329 - val_error_to_signal: 0.0329
Epoch 9/10
109/109 [==============================] - 82s 755ms/step - loss: 0.0324 - error_to_signal: 0.0324 - val_loss: 0.0315 - val_error_to_signal: 0.0315
Epoch 10/10
109/109 [==============================] - 83s 758ms/step - loss: 0.0311 - error_to_signal: 0.0311 - val_loss: 0.0304 - val_error_to_signal: 0.0305

--training_mode=1 (even with more splits and epochs):

Training on split data 3/10
Epoch 1/15
109/109 [==============================] - 53s 491ms/step - loss: 10.6607 - error_to_signal: 10.6535 - val_loss: 1.0000 - val_error_to_signal: 1.0000
Epoch 2/15
109/109 [==============================] - 53s 489ms/step - loss: 1.0004 - error_to_signal: 1.0004 - val_loss: 1.0000 - val_error_to_signal: 1.0000
Epoch 3/15
109/109 [==============================] - 52s 481ms/step - loss: 1.0005 - error_to_signal: 1.0005 - val_loss: 1.0003 - val_error_to_signal: 1.0003
Epoch 4/15
109/109 [==============================] - 52s 477ms/step - loss: 1.0004 - error_to_signal: 1.0004 - val_loss: 1.0000 - val_error_to_signal: 1.0000
Epoch 5/15
109/109 [==============================] - 52s 477ms/step - loss: 1.0006 - error_to_signal: 1.0006 - val_loss: 1.0000 - val_error_to_signal: 1.0000
Epoch 6/15
109/109 [==============================] - 52s 474ms/step - loss: 1.0009 - error_to_signal: 1.0009 - val_loss: 1.0002 - val_error_to_signal: 1.0002
Epoch 7/15
109/109 [==============================] - 52s 475ms/step - loss: 1.0010 - error_to_signal: 1.0010 - val_loss: 1.0000 - val_error_to_signal: 1.0000
Epoch 8/15
109/109 [==============================] - 57s 525ms/step - loss: 1.0011 - error_to_signal: 1.0011 - val_loss: 1.0000 - val_error_to_signal: 1.0000
Epoch 9/15
109/109 [==============================] - 53s 484ms/step - loss: 1.0013 - error_to_signal: 1.0013 - val_loss: 1.0000 - val_error_to_signal: 1.0000
Epoch 10/15
109/109 [==============================] - 52s 480ms/step - loss: 1.0008 - error_to_signal: 1.0008 - val_loss: 1.0008 - val_error_to_signal: 1.0007
Epoch 11/15
109/109 [==============================] - 54s 491ms/step - loss: 1.0007 - error_to_signal: 1.0007 - val_loss: 1.0001 - val_error_to_signal: 1.0001
Epoch 12/15
109/109 [==============================] - 53s 484ms/step - loss: 1.0006 - error_to_signal: 1.0006 - val_loss: 1.0002 - val_error_to_signal: 1.0002
Epoch 13/15
109/109 [==============================] - 53s 484ms/step - loss: 1.0010 - error_to_signal: 1.0010 - val_loss: 1.0025 - val_error_to_signal: 1.0026
Epoch 14/15
109/109 [==============================] - 53s 489ms/step - loss: 1.0016 - error_to_signal: 1.0016 - val_loss: 1.0034 - val_error_to_signal: 1.0034
Epoch 15/15
109/109 [==============================] - 52s 481ms/step - loss: 1.0011 - error_to_signal: 1.0011 - val_loss: 1.0036 - val_error_to_signal: 1.0037

I guess mode 2 is the way to go, even though it takes ~60% more time.
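
For anyone comparing these numbers: error_to_signal is an error-to-signal ratio, so a loss pinned at ~1.0000, as in the mode 1 run above, means the error carries as much energy as the target signal itself, i.e. the network learned essentially nothing. A minimal NumPy sketch of the metric family (this repo's implementation may differ, e.g. by adding a pre-emphasis filter):

```python
import numpy as np

# Error-to-signal ratio (ESR): error energy divided by signal energy.
# Sketch of the metric family, not necessarily this repo's exact code.
def error_to_signal(y_true, y_pred, eps=1e-10):
    return np.sum((y_true - y_pred) ** 2) / (np.sum(y_true ** 2) + eps)

# A model that always outputs silence scores ~1.0, which is why a loss
# stuck near 1.0000 means the network has learned nothing useful.
y = np.sin(np.linspace(0, 100, 48000))
print(error_to_signal(y, np.zeros_like(y)))  # -> ~1.0
```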

38github commented 3 years ago

It seems that using too little computation and precision generates a lot of dominant, harsh high frequencies. When increasing the precision (but not enough, apparently), the harsh high frequencies decrease but the lowest frequencies go missing. I am now looking at what settings are needed to achieve no loss of the lower frequencies.
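
A quick way to check this objectively would be to compare band-averaged spectra of the target and the model output, something like this sketch (file names are placeholders; mono WAV files with equal sample rates are assumed):

```python
import numpy as np
from scipy.io import wavfile

# Compare the spectrum of the model's output against the target to see
# where highs are exaggerated or lows are missing. File names are
# placeholders; mono WAVs with matching sample rates are assumed.
rate, target = wavfile.read("target.wav")
_, predicted = wavfile.read("predicted.wav")

n = min(len(target), len(predicted))
freqs = np.fft.rfftfreq(n, d=1.0 / rate)
t_mag = np.abs(np.fft.rfft(target[:n].astype(np.float64)))
p_mag = np.abs(np.fft.rfft(predicted[:n].astype(np.float64)))

# dB difference per band: positive = model too loud, negative = missing
diff_db = 20 * np.log10((p_mag + 1e-12) / (t_mag + 1e-12))
for lo, hi in [(20, 200), (200, 2000), (2000, 20000)]:
    band = (freqs >= lo) & (freqs < hi)
    print(f"{lo:>5}-{hi} Hz: {diff_db[band].mean():+.1f} dB")
```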

Too bad my CUDA version is too old, so I have to use the CPU.

mishushakov commented 3 years ago

@38github the loss is way too high; try my colab: https://colab.research.google.com/drive/1KheBdJ_UhHBC3BSsFbD00PUIOB14ifee?usp=sharing (it uses a Tesla V100 GPU)

38github commented 3 years ago

> @38github the loss is way too high; try my colab: https://colab.research.google.com/drive/1KheBdJ_UhHBC3BSsFbD00PUIOB14ifee?usp=sharing (it uses a Tesla V100 GPU)

Thank you for mentioning Colab. I have now managed to set it up and am running a max_epochs=1000 split_data=8 --training_mode=1 --input_size=150 session on a two-minute file of different material running through a Really Nice Leveling Amp 7230. I really look forward to hearing the result.

mishushakov commented 3 years ago

ok, cool

with this type of neural network you don't actually need that much to get results (unless you want better than excellent quality)

here's listener feedback on various models (made from two different amplifiers):

(figures applsci-10-00766-g013 and applsci-10-00766-g014: listening-test scores from the paper linked below)

the number represents the number of hidden units: mode 0 would be 36, mode 1 would be 64, mode 2 would be 96

also note that

> Increasing the hidden size generally results in the model being more accurate, however it increases the number of learnable parameters in the network, as well as the processing power required to run it

source: https://www.mdpi.com/2076-3417/10/3/766/htm
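
As a back-of-the-envelope check of that trade-off (my own arithmetic, not from the paper or this repo): for a single LSTM layer the learnable parameter count grows roughly quadratically with the hidden size.

```python
# Learnable parameters in a single LSTM layer: 4 gates, each with an
# input kernel, a recurrent kernel and a bias, for a 1-channel input.
def lstm_params(hidden, inputs=1):
    return 4 * (hidden * inputs + hidden * hidden + hidden)

for h in (36, 64, 96):
    print(f"{h} hidden units -> {lstm_params(h):,} parameters")
# 36 -> 5,472   64 -> 16,896   96 -> 37,632
```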

38github commented 3 years ago

Thank you for those diagrams! I am after a really good result, where the compression is emulated and the low frequencies do not "fart out".

38github commented 3 years ago

The result was not good at all: really distorted, which is strange. Also, the loss was fluctuating like crazy. The fluctuation seems to have stopped now that I have increased the input_size from 150 to 250. We'll see what I get out of that.

By the way: is the last step, where the program pops up an image of the result, stopping Google Colab from writing the last image files? On my computer I have to close the pop-up image to make the program write those files (I think), but I have no clue how to solve it in Colab.
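
If it's matplotlib's plt.show() that blocks until the window is closed, maybe switching to a non-interactive backend and saving straight to a file would work. Just a guess, since I haven't checked how the plotting is actually done:

```python
# Guess at a workaround, assuming the plots come from matplotlib:
# select a non-interactive backend so nothing pops up or blocks,
# then save the figure straight to disk.
import matplotlib
matplotlib.use("Agg")  # must run before pyplot is imported
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0.0, 1.0], [0.0, 1.0])
fig.savefig("signal_comparison.png")  # written immediately, no window
```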

38github commented 3 years ago

Using mode 2 has improved the sound: the extreme distortion has been reduced to distortion mostly where the strongest compression artifacts were supposed to be reconstructed. Changing learning_rate from 0.0005 to 0.0001 has improved the sound further; I can now start to hear some of the compression, and the distortion is less, but it is still a problem. I will continue to try out parameters. rnla_50epochs_mode2_8splits_inputsize300_lrate0p0001_v02.zip

I will increase the epochs. It is also interesting that when the loss goes lower than 0.009, the values turn into 9.xxxxe-04 or something (scientific notation, I assume).