first attempt of mono downmix in the magnitude domain

keunwoochoi / torchaudio-contrib

A test bed for updates and new features | pytorch/audio

169 stars 22 forks source link

first attempt of mono downmix in the magnitude domain #45

Open faroit opened 5 years ago

faroit commented 5 years ago

Downmixing in the magnitude domain is the recommended way for multichannel audio, since its energy preserving. Let me know if the API makes sense here.

keunwoochoi commented 5 years ago

The specific implementation seems alright in general. But probably a little more high-level question - are we gonna have waveform downmix, too?

Ugh, out of sudden, I felt like what if we have data types of WaveformTensor and SpectrumTensor with which we can do more like functional stuff e.g.,

x = WaveformTensor(batch_audio)
x.spectrogram().downmix().amplitude_to_decibel()

Ok none of this comment is not relevant to this commit :P

keunwoochoi commented 5 years ago

Hey, in #49, I tried to have this beta_something.py. Maybe without a test, this approach could make sense here, too? (not necessarily though)

keunwoochoi commented 5 years ago

Oops? Have no idea why all the commits I made to a different branch became a part of here. But thanks! Here're some comments.

I understood we agreed that modifying Spectrogram API (mono) is a separate issue -- and indeed it is, so it should be completely removed here.
On the name, I commented here and am still of the opinion (hence SpectrumDownmix and WaveformDownmix). And there, because 'waveform' is a noun, I'd prefer Spectrum to Spectral.
For waveform downmix, given that the current torchaudio is going to be changed a lot + this type conversion doesn't seem alright for me, probably it's better to implement our own functional here? Which will be simply..
```
return torch.mean(waveforms, ch_dim, True)
```
. Also, for a batch tensor, let's use a plural form!

faroit commented 5 years ago

Oops? Have no idea why all the commits I made to a different branch became a part of here. But thanks! Here're some comments.

should be fine now. There was a missbehaved rebase going on after I pulled in the recent master.

Regarding you requests, I will update those.

faroit commented 5 years ago

DownmixSpectrum would also work on power_specgrams. Should I rename the input for this to just specgrams then?

keunwoochoi commented 5 years ago

I don't think so. By having power= args we're expecting the input to be mag_specgrams.

keunwoochoi commented 5 years ago

(icymi, we didn't really put it on a vote, but DownmixSpectrum --> SpectrumDownmix? especially if you agree, which is kinda voting :)

faroit commented 5 years ago

(icymi, we didn't really put it on a vote, but DownmixSpectrum --> SpectrumDownmix? especially if you agree, which is kinda voting :)

ha, yep, sorry... let vote first ;-)

faroit commented 5 years ago

are the tests added in the master stable? In that case I can provide some for the downmix functions as well

keunwoochoi commented 5 years ago

Yes I think so.

keunwoochoi commented 5 years ago

https://github.com/keunwoochoi/torchaudio-contrib/issues/54#issuecomment-499239872 We got the names :) DownmixWaveform and DownmixSpectrum.

keunwoochoi commented 5 years ago

@faroit Hey, I'm a bit lost, but seems like we resolved all the naming issues as well as the implementations? Is it ready to be merged now?

faroit commented 5 years ago

Well, I want to add unit tests but I am confused now if we could stick with pytest or not?

keunwoochoi commented 5 years ago

I see, +1 for unit tests. I think we have to come back to unittest, at least for some 'core' features that we plan to make a PR there, and this one would be definitely core.