assyoucan closed this 5 years ago
I am not sure I understand your question. But if you check the paper, they used a 1D CNN to process the 2D spectrum: the first dimension becomes 1, the second dimension becomes the time-domain frames, and the third dimension becomes the first dimension of the 2D spectrum, if my memory is correct.
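To make the shapes concrete, here is a rough NumPy sketch of the reshaping described above. It is not taken from the repository: the helper name `to_conv1d_input` and the sizes (24 coefficients, 128 frames) are assumptions chosen only for illustration, and the actual code may arrange things differently.

```python
import numpy as np


def to_conv1d_input(spectrum):
    """Turn a (num_features, num_frames) array into (1, num_frames, num_features).

    Hypothetical helper for illustration only, not the repository's code.
    """
    if spectrum.ndim != 2:
        raise ValueError("expected a 2D array of shape (num_features, num_frames)")
    # Transpose so that time frames come first, then add a leading axis of size 1.
    return spectrum.T[np.newaxis, :, :]


# Assumed sizes: 24 coefficients per frame, 128 time frames.
mcep = np.zeros((24, 128), dtype=np.float32)
net_input = to_conv1d_input(mcep)
print(net_input.shape)

# Dropping the leading axis and transposing back recovers the 2D layout.
restored = net_input[0].T
print(restored.shape)
```

Under these assumptions, the extra dimension is just the leading axis of size 1, and indexing it away gives back the original 2D array.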
No follow-up response, so I am closing the thread.
Why does the dimension increase after mcep is output from the network? What is the added dimension?