Samantha-Zhan opened this issue 2 years ago
As far as I know, you have a candidate architecture implemented. Where can I find the code? @Samantha-Zhan
I am encountering an error when running our deep learning model (image below; essentially, the dimensions do not match at the concat layer). I debugged it and found the cause: the second and third dimensions of in1 are swapped after passing through several layers (128 is our mini-batch size). I then tried applying permute to the in1 dlarray inside my custom reshape layer, right before the concat layer (the custom concat is only there for testing purposes). However, MATLAB does not allow permuting a formatted dlarray across differently labelled dimensions (error below), so I am really stuck and not sure what to do. (I also tried converting it to a logical array and back to a dlarray, but that caused an error with dlgradient, so I concluded it is quite risky to interfere with differentiation and it might break the model.)
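If the goal is just to swap two data dimensions without fighting the format checks, one pattern we could try (a sketch only, not verified against our model) is to strip the labels, permute, and re-label. stripdims, permute, and the dlarray constructor are all dlarray-aware, though we would still need to confirm that dlgradient is happy with it; the format string and permutation order below are assumptions based on the description above, not the actual model code.

```matlab
% Minimal sketch of a dimension-fixing step inside a custom layer's predict
% method. Format string and permutation order are assumptions, not the real code.
function Z = predict(~, in1)
    fmt = dims(in1);            % remember the original labels, e.g. 'SCB'
    X   = stripdims(in1);       % drop the labels; the dlarray stays on the autodiff trace
    X   = permute(X, [1 3 2]);  % swap the second and third dimensions
    Z   = dlarray(X, fmt);      % re-attach the original labels
end
```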
Code link: https://github.com/YilikaLoufoua/noise-suppression/tree/custom-training-loop/model
Problem: the output of a custom layer changes immediately after the layer returns.
Detail: In the last part of our model, we wrote a custom layer normalLayer that performs a series of manipulations on the input array. By setting a breakpoint at the end of the custom layer, we know the output is in fact 257 x 313 x 32. However, when we check the dimensions again inside checkdimsLayer (a custom layer that does nothing to the sequence but just gives us a place to put a breakpoint), we see that MATLAB has changed them to 32 x 1 x 313, with the batch dimension reduced from 257 to 1.
Suspicion: Our thought is that it has something to do with our dimension labelling inside the norm layer. Since we use reshape here to reduce down to 3D, and reshape removes the dimension labels, we have to add the labels back manually. It might be the way MATLAB processes the dlarray after each layer that somehow drops all elements in the batch dimension.
What we tried: We have tried various permutations of the letters B, C, T, S, but none produces the correct result, and we cannot really tell what is happening inside MATLAB's inner workings, so we are quite stuck at this step.
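One possible explanation worth noting: when a dlarray is given a data format, MATLAB permutes its dimensions into the canonical 'S', 'C', 'B', 'T', 'U' order, so the size seen by the next layer can legitimately differ from the size set inside normalLayer. A hypothetical sketch of the end of the layer, with the format string and the sizes taken from the breakpoint observation (assumptions, not the real repository code):

```matlab
% Hypothetical tail of normalLayer's predict method. The 'SCB' format string
% is a placeholder (the right labelling is exactly what we are unsure about);
% the sizes come from the breakpoint observation above.
function Z = predict(~, X)
    X = stripdims(X);
    X = reshape(X, 257, 313, 32);  % the 257 x 313 x 32 seen at the breakpoint
    Z = dlarray(X, 'SCB');         % MATLAB may reorder these dims downstream
end
```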
normLayer -> CustomLayerAutodiffStrategy.m (forwardExampleInputs function) -> CustomLayerStrategy.m (convertOutputsToPlaceholderArrays function) -> CustomLayerStrategy.m (iGetInputBatchSizeAndSequenceLength function)
[UPDATE] The dimension issue of normLayer is (sort of?) resolved using MATLAB's unfold/fold layers here. [CURRENT ISSUE TO SOLVE] The input does not meet the regression layer's input requirements. [WHAT WE TRIED] We temporarily changed the fully connected layer's output size to 1, so that after a simple reshape in the final layer it gives 257(S) × 311(C) × 1(B) × 1(T) instead of 257(S) × 311(C) × 1(B) × 2(T). We also checked the input dimension to the network, and it is 257(S) × 311(C) × 1(B) × 1(T), which matches exactly. However, only the first error goes away; the second error remains (we are not sure what exactly the "expected input size" for the regression layer is).
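For reference, a minimal sketch of the fold/unfold pattern mentioned above (sequenceFoldingLayer / sequenceUnfoldingLayer); the layer sizes are placeholders, not our actual architecture:

```matlab
% Sketch of MATLAB's sequence fold/unfold pattern; sizes are placeholders.
layers = [
    sequenceInputLayer([257 1 1], 'Name', 'in')
    sequenceFoldingLayer('Name', 'fold')                          % folds T into the batch dim
    convolution2dLayer(3, 16, 'Padding', 'same', 'Name', 'conv')
    sequenceUnfoldingLayer('Name', 'unfold')                      % restores the T dimension
    ];
lgraph = layerGraph(layers);
lgraph = connectLayers(lgraph, 'fold/miniBatchSize', 'unfold/miniBatchSize');
```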
[UPDATE] The regression layer mismatch issue is resolved by removing the T dimension labelling here.
[MY FIX OF THE ISSUE] By stepping line by line through the source code, I found the function that throws the error (see below). Analyzing it, I noticed that it does not throw an error if the input and output sizes match exactly OR if there is no T dimension. So all I did was change the T labelling in the finalLayer to S. But then, as you said, I also had to change the fc_2 output size from 2 to 1 for it to run without errors.
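In code terms, the change amounts to something like the following inside the finalLayer (the sizes and format string are sketched from the dimensions quoted above and are placeholders, not the actual repository code):

```matlab
% Hypothetical sketch of finalLayer's predict after the fix: no 'T' label,
% so the regression layer's size check takes its "no time dimension" branch.
% Sizes and format are placeholders based on the dimensions quoted above.
function Z = predict(~, X)
    X = stripdims(X);
    X = reshape(X, 257, 311, 1);   % matches the 257 x 311 x 1 network input
    Z = dlarray(X, 'SCB');         % the trailing dim was previously labelled 'T'
end
```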
[RESULT AND NEW ISSUE] After spending hours resolving some other backpropagation errors, the model is finally able to train! However, training is very slow (about 900 minutes for 152 iterations), and the training result is not very good, though the loss is still decreasing (the unit of the bottom loss plot is 10^4).
[POSSIBLE SOLUTION] Giancarlo suggested this morning that we compare our training results with those from running the researchers' PyTorch code. This should clarify whether there is something wrong with our model or whether this is just expected behaviour.
Design the deep learning network using Deep Learning Toolbox. There are many network architectures to choose from; the two most common are convolutional neural networks (CNNs) ([5], [6]) and recurrent neural networks (RNNs) ([2], [3], [4]). Your solution may require applying signal processing at the input or output of your network (for example, for signal pre-processing, feature extraction, or time-frequency transformation). In that case, you can use Audio Toolbox and Signal Processing Toolbox™ functionality.
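As a deliberately minimal starting point, here is a sketch of a small CNN-style regression network built from Deep Learning Toolbox layers; all sizes are placeholders, and a real noise-suppression model would feed it spectrogram features (e.g. computed with stft) rather than raw audio.

```matlab
% Minimal CNN sketch in Deep Learning Toolbox; sizes are placeholders only.
layers = [
    imageInputLayer([257 311 1], 'Name', 'spectrogram_in')
    convolution2dLayer(3, 16, 'Padding', 'same')
    batchNormalizationLayer
    reluLayer
    convolution2dLayer(3, 1, 'Padding', 'same')
    regressionLayer
    ];
analyzeNetwork(layers);   % quick sanity check of the layer-to-layer sizes
```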