The sizes of the network are:

- `Concatenate`: 94,679 trainable parameters
- `Add1`: 89,431 trainable parameters
- `Add2`: 84,567 trainable parameters
One could make a case for saving those ~5k parameters with the `Add1` variation, as long as performance does not suffer. I think it is not worth it, and it's best to keep `Concatenate`.
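For reference, a minimal sketch of where the size difference comes from (the layer sizes here are made up, not the real ones from this network): `Concatenate` doubles the feature dimension seen by the first `Dense` layer after the merge, while `Add` keeps it unchanged.

```python
from tensorflow.keras import Input, Model, layers

def build(merge: str) -> Model:
    # Stand-ins for the outputs of the Bass and Chroma convolutional blocks.
    bass = Input(shape=(64,), name="bass_features")
    chroma = Input(shape=(64,), name="chroma_features")
    if merge == "concatenate":
        merged = layers.Concatenate()([bass, chroma])  # 128 features
    else:
        merged = layers.Add()([bass, chroma])  # still 64 features
    hidden = layers.Dense(64, activation="relu")(merged)
    return Model([bass, chroma], layers.Dense(10)(hidden))

for merge in ("concatenate", "add"):
    print(merge, build(merge).count_params())
# The Concatenate version carries 64 * 64 = 4,096 extra kernel weights in its
# first Dense layer; the ~5k gap above is the same effect at the real sizes.
```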
I spoke too soon: `Add2` seems to have given similar or slightly better results than `Concatenate` after the full 100 epochs. Not in all tasks, but in several of them.
Here is the `mlflow` comparison. Left is `Add2`, right is `Concatenate`.
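The same comparison can also be pulled out of the tracking store programmatically; a hedged sketch, assuming the two runs are named `Add2` and `Concatenate` and that validation metrics are logged with a `val_` prefix (both are assumptions about this project's logging setup):

```python
import mlflow

# Fetch the runs of the active experiment as a pandas DataFrame.
runs = mlflow.search_runs()

# Keep only the two runs being compared (run names are an assumption).
runs = runs[runs["tags.mlflow.runName"].isin(["Add2", "Concatenate"])]

# Show the logged validation metrics side by side (metric prefix is an assumption).
metric_cols = [c for c in runs.columns if c.startswith("metrics.val_")]
print(runs.set_index("tags.mlflow.runName")[metric_cols].T)
```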
Sticking to `Concatenate` for now. Also removed `Dropout`. The network is in its vanilla state now.
As per the paper, I concatenate the `Bass` and `Chroma` convolutional blocks. I never tried adding them instead of concatenating them. The main difference is that adding results in a smaller network (fewer parameters), and the performance could be very similar.
I ran two experiments:

1. `Add` operation and the same `Dense` layer configuration
2. `Add` operation and a smaller `Dense` layer configuration

Experiment 1 seemed either slightly worse than, or no different from, the `Concatenate` version. Experiment 2 seems definitely worse than the `Concatenate` version.
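A minimal sketch of how the two variants differ (again with made-up sizes rather than the network's real `Dense` widths): the merge is fixed to `Add`, and only the width of the `Dense` head changes between the experiments.

```python
from tensorflow.keras import Input, Model, layers

def build_add_variant(dense_units: int) -> Model:
    # Stand-ins for the outputs of the Bass and Chroma convolutional blocks.
    bass = Input(shape=(64,), name="bass_features")
    chroma = Input(shape=(64,), name="chroma_features")
    merged = layers.Add()([bass, chroma])
    hidden = layers.Dense(dense_units, activation="relu")(merged)
    return Model([bass, chroma], layers.Dense(10)(hidden))

add1 = build_add_variant(dense_units=64)  # experiment 1: same Dense configuration
add2 = build_add_variant(dense_units=32)  # experiment 2: smaller Dense configuration
```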