RussellSB commented 3 years ago

Overview

Added evaluate.py - purpose to run on a test set without weight updates, and save out.pickle of conversions when done
Added plotting functionality to both evaluate.py and train.py
Updated Encoder model definition, removed final_block (shared_block) param
Updated preprocessing to be with defaultdict(list). Just for less verbosity, and consistency with what I did in evaluate.py for feats dictionary

Plotting Notes

In train.py it plots the first batch every epoch or few epochs (specified by plot_interval arg)
In evaluate.py plot_interval arg controls how often it plots a batch (with respect to every few batches, since no epochs in evaluate)
The code looks kind of repeated for plotting in utils.py. But figured its more readable this way as opposed to having a single method with lots of conditions. Also, need to separate eval from train plot functions, since train is also concerned with plotting the target used in updating weight - while eval isn't.

Train Plot

Eval Plot

RussellSB commented 3 years ago

Train Discussion

Is it worth mirroring how evaluate.py saves a feats dict of converted output but also in train.py (on its last epoch)? I'm unsure of this. My reservations lie in that if the user wants to get the converted output of one epoch of training batches, they can use evaluate.py on the train set. Only reservation is out.pickle would be saved in out_eval not out_train.

RussellSB commented 3 years ago

Preprocessing Discussion

In evaluate.py for the feats dict, I named the keys with respect to their conversion A2B, B2A. Seems to be a convention in style transfer code to use 'A2B' notation. Should we consider maybe changing the convention of how spkr directories are named in the database? Instead, if we were to name them spkr_A and spkr_B it could be more consistent. And we can iterate over key strings, instead of key int values in preprocessing.py.

Not only would this make things more consistent with "A2B notation" but say if the user were to name directories "spkr_4" or "spkr_7" (corresponding to ids from flickr_audio), it would string sort and still take spkr_4 as A and spkr_7 as B. We can then have n_spkrs controlled by the number of speaker directories in the dataset. Though for now, we'll note that it only works with two for the time being.

So for preprocessing maybe:

Read number spkr dirs in dataset
assert num_dirs == 2, 'Currently only two speakers accepted'

Would also need to reflect such changes in data_proc.py

ebadawy commented 3 years ago

Preprocessing Discussion

In evaluate.py for the feats dict, I named the keys with respect to their conversion A2B, B2A. Seems to be a convention in style transfer code to use 'A2B' notation. Should we consider maybe changing the convention of how spkr directories are named in the database? Instead, if we were to name them spkr_A and spkr_B it could be more consistent. And we can iterate over key strings, instead of key int values in preprocessing.py.

Not only would this make things more consistent with "A2B notation" but say if the user were to name directories "spkr_4" or "spkr_7" (corresponding to ids from flickr_audio), it would string sort and still take spkr_4 as A and spkr_7 as B. We can then have n_spkrs controlled by the number of speaker directories in the dataset. Though for now, we'll note that it only works with two for the time being.

This seems to be a good idea; however, what do you think about having more than two speakers? we will need to use something else other than A B naming convention in that case. I will leave changing the naming to you if you want to do it or not.

So for preprocessing maybe:

Read number spkr dirs in dataset

assert num_dirs == 2, 'Currently only two speakers accepted'

I like this assertion, we need to have it

Would also need to reflect such changes in data_proc.py

ebadawy commented 3 years ago

Train Discussion

Is it worth mirroring how evaluate.py saves a feats dict of converted output but also in train.py (on its last epoch)? I'm unsure of this. My reservations lie in that if the user wants to get the converted output of one epoch of training batches, they can use evaluate.py on the train set. Only reservation is out.pickle would be saved in out_eval not out_train.

having a separated evaluate.py seems to be a bit redundant since it uses a lot of the training code. what it be better to include that in train.py and have additional argument to how frequent the plots are being generated?

RussellSB commented 3 years ago

Train Discussion

Is it worth mirroring how evaluate.py saves a feats dict of converted output but also in train.py (on its last epoch)? I'm unsure of this. My reservations lie in that if the user wants to get the converted output of one epoch of training batches, they can use evaluate.py on the train set. Only reservation is out.pickle would be saved in out_eval not out_train.

having a separated evaluate.py seems to be a bit redundant since it uses a lot of the training code. what it be better to include that in train.py and have additional argument to how frequent the plots are being generated?

I agree that evaluate.py does introduce code repetition but I still think it is useful for concise results from unseen sets of data. Inference.py on the other hand, would be concerned with the full functionality specific to a full audio file.

Evaluate serves as a lightweight version of train.py that isn't concerned with updating weights. And separating it from train.py, rather than integrating its logic with conditionals in there I think makes things more reader-friendly for newcomers to the codebase.

If we were to integrate this to train.py it isn't just plotting that will be reconsidered but we'd need to have an option that runs a pre-trained model on eval mode (so it doesn't update gradients and is more efficient), and also not save the models and limit the epochs. Which again, isn't too hard to do - but I think will mess up clarity in train.py, which other researchers would especially appreciate for making use of/extending from it.

Also I've incorporated a plotting functionality for train.py already. It will output plots to 'src/out_train' whereas evaluate.py outputs plots to 'src/out_eval'

RussellSB commented 3 years ago

Preprocessing Discussion

In evaluate.py for the feats dict, I named the keys with respect to their conversion A2B, B2A. Seems to be a convention in style transfer code to use 'A2B' notation. Should we consider maybe changing the convention of how spkr directories are named in the database? Instead, if we were to name them spkr_A and spkr_B it could be more consistent. And we can iterate over key strings, instead of key int values in preprocessing.py. Not only would this make things more consistent with "A2B notation" but say if the user were to name directories "spkr_4" or "spkr_7" (corresponding to ids from flickr_audio), it would string sort and still take spkr_4 as A and spkr_7 as B. We can then have n_spkrs controlled by the number of speaker directories in the dataset. Though for now, we'll note that it only works with two for the time being.

This seems to be a good idea; however, what do you think about having more than two speakers? we will need to use something else other than A B naming convention in that case. I will leave changing the naming to you if you want to do it or not.

So for preprocessing maybe:

Read number spkr dirs in dataset

assert num_dirs == 2, 'Currently only two speakers accepted'

I like this assertion, we need to have it

Would also need to reflect such changes in data_proc.py

Ok cool re naming conversion and naming them wrt wav dir names Ill keep this in mind, but Ill prioritise it for after test/eval/train split and inference.py. Also added the assert in preprocessing.py now

ebadawy / voice_conversion

Evaluate and Transfer Plotting #3

Overview

Plotting Notes

Train Plot

Eval Plot

Train Discussion

Preprocessing Discussion

Preprocessing Discussion

Train Discussion

Train Discussion

Preprocessing Discussion