Closed RussellSB closed 3 years ago
Is it worth mirroring how evaluate.py saves a feats dict of converted output but also in train.py (on its last epoch)? I'm unsure of this. My reservations lie in that if the user wants to get the converted output of one epoch of training batches, they can use evaluate.py on the train set. Only reservation is out.pickle would be saved in out_eval not out_train.
In evaluate.py for the feats dict, I named the keys with respect to their conversion A2B, B2A. Seems to be a convention in style transfer code to use 'A2B' notation. Should we consider maybe changing the convention of how spkr directories are named in the database? Instead, if we were to name them spkr_A and spkr_B it could be more consistent. And we can iterate over key strings, instead of key int values in preprocessing.py.
Not only would this make things more consistent with "A2B notation" but say if the user were to name directories "spkr_4" or "spkr_7" (corresponding to ids from flickr_audio), it would string sort and still take spkr_4 as A and spkr_7 as B. We can then have n_spkrs controlled by the number of speaker directories in the dataset. Though for now, we'll note that it only works with two for the time being.
So for preprocessing maybe:
Would also need to reflect such changes in data_proc.py
Preprocessing Discussion
In evaluate.py for the feats dict, I named the keys with respect to their conversion A2B, B2A. Seems to be a convention in style transfer code to use 'A2B' notation. Should we consider maybe changing the convention of how spkr directories are named in the database? Instead, if we were to name them spkr_A and spkr_B it could be more consistent. And we can iterate over key strings, instead of key int values in preprocessing.py.
Not only would this make things more consistent with "A2B notation" but say if the user were to name directories "spkr_4" or "spkr_7" (corresponding to ids from flickr_audio), it would string sort and still take spkr_4 as A and spkr_7 as B. We can then have n_spkrs controlled by the number of speaker directories in the dataset. Though for now, we'll note that it only works with two for the time being.
This seems to be a good idea; however, what do you think about having more than two speakers? we will need to use something else other than A B naming convention in that case. I will leave changing the naming to you if you want to do it or not.
So for preprocessing maybe:
- Read number spkr dirs in dataset
- assert num_dirs == 2, 'Currently only two speakers accepted'
I like this assertion, we need to have it
Would also need to reflect such changes in data_proc.py
Train Discussion
Is it worth mirroring how evaluate.py saves a feats dict of converted output but also in train.py (on its last epoch)? I'm unsure of this. My reservations lie in that if the user wants to get the converted output of one epoch of training batches, they can use evaluate.py on the train set. Only reservation is out.pickle would be saved in out_eval not out_train.
having a separated evaluate.py
seems to be a bit redundant since it uses a lot of the training code. what it be better to include that in train.py
and have additional argument to how frequent the plots are being generated?
Train Discussion
Is it worth mirroring how evaluate.py saves a feats dict of converted output but also in train.py (on its last epoch)? I'm unsure of this. My reservations lie in that if the user wants to get the converted output of one epoch of training batches, they can use evaluate.py on the train set. Only reservation is out.pickle would be saved in out_eval not out_train.
having a separated
evaluate.py
seems to be a bit redundant since it uses a lot of the training code. what it be better to include that intrain.py
and have additional argument to how frequent the plots are being generated?
I agree that evaluate.py does introduce code repetition but I still think it is useful for concise results from unseen sets of data. Inference.py on the other hand, would be concerned with the full functionality specific to a full audio file.
Evaluate serves as a lightweight version of train.py that isn't concerned with updating weights. And separating it from train.py, rather than integrating its logic with conditionals in there I think makes things more reader-friendly for newcomers to the codebase.
If we were to integrate this to train.py it isn't just plotting that will be reconsidered but we'd need to have an option that runs a pre-trained model on eval mode (so it doesn't update gradients and is more efficient), and also not save the models and limit the epochs. Which again, isn't too hard to do - but I think will mess up clarity in train.py, which other researchers would especially appreciate for making use of/extending from it.
Also I've incorporated a plotting functionality for train.py already. It will output plots to 'src/out_train' whereas evaluate.py outputs plots to 'src/out_eval'
Preprocessing Discussion
In evaluate.py for the feats dict, I named the keys with respect to their conversion A2B, B2A. Seems to be a convention in style transfer code to use 'A2B' notation. Should we consider maybe changing the convention of how spkr directories are named in the database? Instead, if we were to name them spkr_A and spkr_B it could be more consistent. And we can iterate over key strings, instead of key int values in preprocessing.py. Not only would this make things more consistent with "A2B notation" but say if the user were to name directories "spkr_4" or "spkr_7" (corresponding to ids from flickr_audio), it would string sort and still take spkr_4 as A and spkr_7 as B. We can then have n_spkrs controlled by the number of speaker directories in the dataset. Though for now, we'll note that it only works with two for the time being.
This seems to be a good idea; however, what do you think about having more than two speakers? we will need to use something else other than A B naming convention in that case. I will leave changing the naming to you if you want to do it or not.
So for preprocessing maybe:
- Read number spkr dirs in dataset
- assert num_dirs == 2, 'Currently only two speakers accepted'
I like this assertion, we need to have it
Would also need to reflect such changes in data_proc.py
Ok cool re naming conversion and naming them wrt wav dir names Ill keep this in mind, but Ill prioritise it for after test/eval/train split and inference.py. Also added the assert in preprocessing.py now
Overview
Plotting Notes
Train Plot
Eval Plot