Closed by sbenthall 5 years ago
I believe that staying in the amplitude space will result in the input matrix values being complex numbers.
TensorFlow supports complex numbers as a data type:
https://www.tensorflow.org/api_docs/python/tf/dtypes/complex
For visualizing in the complex plane: https://stackoverflow.com/questions/17044052/mathplotlib-imshow-complex-2d-array
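To illustrate why the values come out complex, here is a rough sketch using only the Python standard library. It runs a naive DFT on a single frame, which stands in for one column of an STFT; it is not the project's actual transform code. The function name `dft` and the test frame are made up for illustration. Magnitude and phase are the two real-valued views one would typically plot separately.

```python
import cmath
import math

def dft(frame):
    """Naive discrete Fourier transform of one frame (O(n^2), illustration only)."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]

# Hypothetical test frame: a sine that completes 2 cycles, with a phase offset of 0.5 rad.
n = 16
frame = [math.sin(2 * math.pi * 2 * t / n + 0.5) for t in range(n)]
spectrum = dft(frame)

# Each coefficient is a complex number; split it into two real-valued views.
magnitudes = [abs(c) for c in spectrum]
phases = [cmath.phase(c) for c in spectrum]
```

The input is purely real, but the phase offset shows up in the argument of the coefficient at bin 2, so dropping the imaginary part (or keeping only the magnitude) throws that information away.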
There is some loss of signal in the roundtrip of a song from mp3 -> wav -> numpy -> wav (-> mp3).
One possibility is that it is being lost in the conversion from amplitude to decibel space. Another is that it is being lost in the short-time Fourier transform step.
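One way to start separating these two hypotheses is to check the decibel round trip on its own. The sketch below is standard-library Python only and rests on assumptions: the helpers `amplitude_to_db` and `db_to_amplitude` are hand-written stand-ins using the usual 20·log10 convention, not the functions the project actually calls, and the coefficients are invented. It compares rebuilding the complex values with and without the original phase.

```python
import cmath
import math

def amplitude_to_db(mag, floor=1e-10):
    """Hypothetical helper: magnitude -> decibels, with a floor to avoid log(0)."""
    return 20.0 * math.log10(max(mag, floor))

def db_to_amplitude(db):
    """Hypothetical helper: inverse of amplitude_to_db (above the floor)."""
    return 10.0 ** (db / 20.0)

# Invented complex coefficients standing in for STFT output.
spectrum = [3 + 4j, -1 + 0.5j, 0.25 - 2j]

# Round trip of the magnitudes through decibel space.
mags_back = [db_to_amplitude(amplitude_to_db(abs(c))) for c in spectrum]
magnitude_error = max(abs(m - abs(c)) for m, c in zip(mags_back, spectrum))

# Rebuild complex values: once discarding phase, once re-attaching it.
without_phase = [complex(m, 0.0) for m in mags_back]
with_phase = [cmath.rect(m, cmath.phase(c)) for m, c in zip(mags_back, spectrum)]

phase_dropped_error = max(abs(a - c) for a, c in zip(without_phase, spectrum))
phase_kept_error = max(abs(a - c) for a, c in zip(with_phase, spectrum))
```

In this toy setup the magnitude round trip is exact up to floating-point error, and the large error only appears when phase is discarded. If the real pipeline behaves the same way, the loss would come from taking the magnitude before the decibel conversion rather than from the decibel mapping itself, though any clipping or flooring applied by the actual conversion function would need to be checked separately.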
Experiment with