keunwoochoi commented 5 years ago

Hi all, This is an important announcement that I'm glad to make. As we hoped from the very beginning of this repo aka -contrib, there will be a big, breaking change of torch/audio, and many things that we're working on here would be probably a big part of it. It was our very first motivation when we started this whole thing with me and @faroit :) Just to give a context, torch/audio is going to be improved in many ways and some of them are directly related to -contrib.

Phase 1

Phase 1 is the last update before freezing the current torch/audio. In this phase, we're going to add STFT, ComplexNorm, Spectrogram and StretchSpecTime from -contrib to torch/audio. The freezing would better to happen asap, I expect we'll top-prioritize to finalize these. The finalization would include i) to make them robust and stable, probably by adding more tests (which will happen here), and if necessary, followed by ii) to adapt the implementations such as docstrings and tests to the current torch/audio (which will happen in torch/audio).

Phase 2

Phase 2 is about more permanent and breaking changes. In this phase, we will

i) discuss our implementation and its philosophy/ names, docstrings, tests, and everything else with torch/audio maintainers and
ii) make PRs to torch/audio
- this could be separately done by different people -- which is what everyone is doing on open source :) partly for fair credit and responsibility assignments.
The scope of Phase 2 would be pretty large, such as..
- all the preprocessing layers and functionals,
- augmentation,
- audio loading,
- chunking,
- and anything else that would make sense to have.

After Phase 2

This has to be discussed, but one thing for sure, after Phase 2, we will remove those that were merged into torch/audio from -contrib. After then we can use them by importing a correct and updated version of torch/audio.

Action items

Let me dictate for a bit :^P I'll either leave comments about what should be done. Probably also who'd be the best person I'd like to ask to do -- who would be basically the same person who implemented it in the first place. Apparently, nothing is obligatory and anyone can join for anything. I mean.. it's open source software 😄

Some general notes for us

Ok guys, don't be too sad about some of these.

As long as pytorch is based on unittest, so is us.
(Discussion to be done about modules/buffers/parameters. Related: #19)
Pytorch is compatible with Python 2.7/3.5/3.6/3.7. I assume we should be the same.

Disclaimer: This is my summary after discussion with @cpuhrsch @jamarshon from torch/audio as well as our contributors on slack/github.

keunwoochoi commented 5 years ago

On the work for Phase 1 - #62 covers STFT, ComplexNorm, Spectrogram. They follow torch/audio docstring conventions (+ my criteria, which is to be extremely nice to beginners!). Variable names are mostly up-to-date w.r.t. #36 except those very few that are currently unclear.

@ksanjeevan, could you update StretchSpecTime following layers above? Regarding #54, we though need to finalize the name first.

ksanjeevan commented 5 years ago

@ksanjeevan, could you update StretchSpecTime following layers above? Regarding #54, we though need to finalize the name first.

Will do, hopefully tomorrow (latest Friday since I'm returning from re:mars)!

It was our very first motivation when we started this whole thing with me and @faroit :) Just to give a context, torch/audio is going to be improved in many ways and some of them are directly related to -contrib

Yay, glad to have been there.

keunwoochoi commented 5 years ago

Log - STFT, ComplexNorm, Spectrogram and StretchSpecTime are ready to be reviewed by oorch/audio maintainers.

cpuhrsch commented 5 years ago

@keunwoochoi - I'm sorry if I'm being obtuse here. Just for reference, which pull request are you mentioning here?

keunwoochoi commented 5 years ago

@cpuhrsch Hey, there was a bunch of them and now they're all at the master branch here. But as we've talked on slack channel meanwhile, @ksanjeevan will make a PR for the functionals and their tests to the official repo soon (wrt https://github.com/keunwoochoi/torchaudio-contrib/pull/66). So, you can safely ignore the updates here in this thread.

keunwoochoi commented 5 years ago

Phase 1 completed. Special thanks to @ksanjeevan! https://github.com/pytorch/audio/pull/131

keunwoochoi commented 5 years ago

Also, quite a bit of our work is being considered/included :)

cpuhrsch commented 5 years ago

So now that we've released 0.3.0, standardized many aspects of the library and merged Phase 1, should we start looking into Phase 2? :)

keunwoochoi commented 5 years ago

@cpuhrsch Yes, definitely. On a business trip now but will get back to plan further. Btw, congrats for the 0.3.0 release! I was glad to see it :)

keunwoochoi commented 5 years ago

Phase 2

In the comment above, I listed we could do things like..

all the preprocessing layers and functionals,
augmentation,
audio loading,
chunking,
and anything else that would make sense to have.

To do so, we probably first need to update our -contrib repo by..

importing the recent torch/audio
removing all the overlapping codes.

I am not 100% sure if all of these have to be done in -contrib repo. Many of our suggestions are now part of the codebase of official repo. At least some of the future works seem totally fine to be directly done there. What would be the best practice? At the same time, some might be easier to be drafted here. Hm, so..

Augmentation

We can be quick and dirty on designing and implementing a bunch of augmentation layers here. By seeing the real code draft (real draft 🙄), I assume we can see the big picture about how we should design the API. Guess @ksanjeevan would be interested in this. I'm down, too.

Audio loading

@faroit Hey, any opinion? :)

Chunking

Maybe it's related to the loading itself. @faroit @f0k

And..

What else are we missing? What would be the low-hanging fruits?

keunwoochoi commented 5 years ago

a neat implementation of pre-emphasis and de-emphasis (https://github.com/AppleHolic/pytorch_sound/blob/master/pytorch_sound/models/sound.py)

keunwoochoi / torchaudio-contrib

Merging plan to torch/audio - Phase 1 and 2 #61

Phase 1

Phase 2

After Phase 2

Action items

Some general notes for us

Phase 2

Augmentation

Audio loading

Chunking

And..