yandexdataschool / speech_course

YSDA course in Speech Processing.
MIT License
180 stars 56 forks

Problems in week_10_vqe_noise_reduction/homework.ipynb #28

Open zshrav opened 2 months ago

zshrav commented 2 months ago

Known problems:

SNR definition:

"Given a ground truth signal ... and its estimate ..., we define noise as ... . Slightly abusing notation we get:" In the math expression that follows this sentence, the numerator and the denominator should be swapped.
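A minimal sketch of the corrected ratio, assuming the conventional SNR in decibels, with noise taken as the difference between the ground truth and its estimate. The function name `snr_db` and the `eps` guard are hypothetical and are not taken from the notebook.

```python
import numpy as np


def snr_db(target: np.ndarray, estimate: np.ndarray, eps: float = 1e-12) -> float:
    """SNR in dB: signal energy in the numerator, noise energy in the denominator."""
    noise = target - estimate  # assumption: noise is the estimation error
    signal_energy = np.sum(np.square(target))
    noise_energy = np.sum(np.square(noise))
    # eps keeps the ratio finite for a perfect estimate or a silent target
    return float(10.0 * np.log10((signal_energy + eps) / (noise_energy + eps)))


# A 440 Hz tone and an estimate whose amplitude is off by 10%
target = np.sin(2.0 * np.pi * 440.0 * np.arange(16000) / 16000.0)
estimate = 1.1 * target
print(snr_db(target, estimate))  # about 20 dB: the error energy is 1/100 of the signal energy
```

With the numerator and the denominator in this order, a better estimate gives a larger value.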

from vqe.data.mixing import RandomMixtureSampler

Just remove this line. It is an artifact of testing, which I forgot to remove.

class RandomMixtureSampler, method __call__:

        # input_signal and mic_signal should be multiplied by the same factor to match each other
        mult_signal = normalize_to_rms(
            signal_target, self.normalization_rms_db
        )

This snippet is wrong. It is supposed to calculate the multiplication factor here (that's why the variable is called mult_signal), not the normalized signal.
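One way to compute such a factor, as a sketch only: it assumes that the RMS level in dB is 20 * log10 of the linear RMS and that the two signals are NumPy arrays. The helpers `rms` and `gain_for_rms_db`, the example signals, and the -25 dB target are hypothetical and are not part of the homework code.

```python
import numpy as np


def rms(signal: np.ndarray) -> float:
    """Root mean square of a signal (linear scale)."""
    return float(np.sqrt(np.mean(np.square(signal))))


def gain_for_rms_db(signal: np.ndarray, target_rms_db: float, eps: float = 1e-12) -> float:
    """Scalar factor that brings `signal` to the target RMS level given in dB."""
    target_rms = 10.0 ** (target_rms_db / 20.0)
    return target_rms / max(rms(signal), eps)  # eps avoids division by zero on silence


rng = np.random.default_rng(0)
signal_target = rng.standard_normal(16000)
mic_signal = 0.5 * signal_target + 0.1 * rng.standard_normal(16000)

# Compute the factor once from the target signal, then apply it to both signals
mult_signal = gain_for_rms_db(signal_target, -25.0)
scaled_target = signal_target * mult_signal
scaled_mic = mic_signal * mult_signal
```

Because both signals are scaled by the same number, their ratio is unchanged and they still match each other after normalization.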

zshrav commented 2 months ago

Some more problems:

"num out channels: 64 the final output layers" in the model description:

It's a misprint. We mean that each 2D convolutional layer should have 64 output channels, except for the final output layers, which by design have 1 output channel because they estimate the real or imaginary component of the complex spectrum. There are 2 final layers in total, as there are 2 decoders.
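The channel layout described here could look roughly like the following PyTorch sketch. Only the channel counts (64 for the inner layers, 1 for each final layer) and the two decoders come from the comment above; the helper `make_decoder`, the kernel size, the activation, the number of layers, and the input shape are assumptions made for illustration.

```python
import torch
from torch import nn


def make_decoder(in_channels: int, hidden_channels: int = 64, num_hidden_layers: int = 2) -> nn.Sequential:
    """Hypothetical decoder: inner conv layers with 64 output channels, final layer with 1."""
    layers = []
    channels = in_channels
    for _ in range(num_hidden_layers):
        layers.append(nn.Conv2d(channels, hidden_channels, kernel_size=3, padding=1))
        layers.append(nn.ReLU())
        channels = hidden_channels
    # Final output layer: a single channel holding one component of the spectrum
    layers.append(nn.Conv2d(channels, 1, kernel_size=3, padding=1))
    return nn.Sequential(*layers)


real_decoder = make_decoder(in_channels=64)  # estimates the real component
imag_decoder = make_decoder(in_channels=64)  # estimates the imaginary component

features = torch.randn(2, 64, 16, 20)  # (batch, channels, frequency, time), made-up sizes
real = real_decoder(features)
imag = imag_decoder(features)
spectrum = torch.complex(real.squeeze(1), imag.squeeze(1))
```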

class RandomMixtureSampler, method __call__:

        noise_rms_db = self.sample_noise_rms_db()
        mult_noise = normalize_to_rms(noise, noise_rms_db)
        noise *=  # your code

Instead it should be:

        noise_rms_db = self.sample_noise_rms_db()
        noise = normalize_to_rms(noise, noise_rms_db)

zshrav commented 2 months ago

run_epoch:

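A call to model.train(training) should be inserted, so that the model is in training mode during training epochs and in evaluation mode otherwise.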
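A sketch of where the call might go, assuming a PyTorch model. The signature of `run_epoch` below, the toy model, and the random batches are hypothetical; the notebook's own function may look different.

```python
import torch
from torch import nn


def run_epoch(model, batches, loss_fn, optimizer=None, training=False):
    """Hypothetical epoch loop: one pass over `batches`, returns the mean loss."""
    model.train(training)  # train mode if training is True, eval mode otherwise
    total_loss = 0.0
    num_batches = 0
    with torch.set_grad_enabled(training):  # no gradients are tracked during evaluation
        for inputs, targets in batches:
            loss = loss_fn(model(inputs), targets)
            if training:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            total_loss += loss.item()
            num_batches += 1
    return total_loss / max(num_batches, 1)


model = nn.Sequential(nn.Linear(4, 8), nn.Dropout(0.5), nn.Linear(8, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
batches = [(torch.randn(8, 4), torch.randn(8, 1)) for _ in range(3)]

train_loss = run_epoch(model, batches, nn.MSELoss(), optimizer, training=True)
mode_after_train = model.training
val_loss = run_epoch(model, batches, nn.MSELoss(), training=False)
mode_after_eval = model.training
```

Without the call, layers such as dropout and batch normalization keep whatever mode the model was last left in, so validation could silently run in training mode.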