DATA PRE-PROCESSING - Githubissues

kosta-pmf / dnn-audio-watermarking

DNN-based audio watermarking

GNU General Public License v3.0

40 stars 10 forks

DATA PRE-PROCESSING #8

Closed: elchaima1234 closed this issue 10 months ago

elchaima1234 commented 10 months ago

I want to try your model with my own dataset, but I am having difficulty with data preprocessing. What steps should I follow before feeding the data into the model? During training, PESQ is consistently reported as 4.2, but when I reconstruct the audio file and listen to it, the quality is very poor, so the score does not reflect what I actually hear. Note: I use librosa to load and resize the audio and to convert it to an STFT. How can I fix this problem?

kosta-pmf commented 10 months ago

We have encountered this problem as well. The cause could be the STFT representation itself: it allows the network to insert the watermark in a short interval of the audio, which creates a croaking artifact. PESQ remains high overall, but part of the signal is seriously affected. You could try to mitigate this by reducing the length of the audio interval used for watermarking, or by removing some of the attacks. We are addressing this problem in our next model.
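
One way to check whether the distortion really is confined to a short interval is to compare the original and watermarked waveforms frame by frame, since a single global score can hide a localized artifact. The snippet below is a minimal sketch, not code from this repository: `frame_snr_db`, `worst_frames`, the frame length of 1024 samples, and the synthetic 16 kHz test tone are all assumptions chosen for illustration, and per-frame SNR is used here as a simple stand-in diagnostic, not as a replacement for PESQ.

```python
import math


def frame_snr_db(reference, processed, frame_len=1024, eps=1e-12):
    """Return one SNR value in dB per non-overlapping frame.

    `reference` and `processed` are sequences of float samples that are
    assumed to be time-aligned (hypothetical helper, for illustration only).
    """
    n = min(len(reference), len(processed))
    snrs = []
    for start in range(0, n - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        out = processed[start:start + frame_len]
        signal = sum(r * r for r in ref)
        noise = sum((r - o) ** 2 for r, o in zip(ref, out))
        # eps keeps the ratio finite for silent or untouched frames.
        snrs.append(10.0 * math.log10((signal + eps) / (noise + eps)))
    return snrs


def worst_frames(snrs, k=3):
    """Indices of the k frames with the lowest SNR, worst first."""
    return sorted(range(len(snrs)), key=lambda i: snrs[i])[:k]


# Synthetic demo (assumed data): one second of a 440 Hz tone at 16 kHz,
# with a low-frequency disturbance added to frame 5 only.
sr = 16000
clean = [math.sin(2 * math.pi * 440 * t / sr) for t in range(sr)]
marked = list(clean)
for t in range(5 * 1024, 6 * 1024):
    marked[t] += 0.5 * math.sin(2 * math.pi * 90 * t / sr)

snrs = frame_snr_db(clean, marked)
print(worst_frames(snrs, k=1))  # prints [5]
```

With real data, the two sample lists would be the original clip and the reconstructed watermarked clip, loaded at the same sample rate. If a few frames show a much lower SNR than the rest, that would be consistent with the localized artifact described above.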