natalietoulova opened this issue 3 years ago:

Hi, I am new to programming and I am a little bit confused about which CSV file is used in AccentProbe, and also which hyperparameters are used for training. Did I miss it somewhere?
Hey @natalietoulova!
I don't have an example .csv file that was used for accent probes, but if you look at the `AccentProbe/data_loader.py` file, the csv file needs to contain `file_name`, `accent_label`, and `duration`. Here, if I remember correctly, my folder organization is such that I have stored representations (after different layers of the network) for each audio file, and these representation files are stored in a folder indicating the type of representation (e.g. `probe_data/lstm_0/[file name].npy`). Here, `data_path = probe_data`, `rep_type = lstm_0`, and the file name comes from the csv file. The other thing you will need is a meta file, which I have used in several experiments, that aligns phones in the speech to time durations in the audio files. This corresponds to the `end_times` used in the `data_loader.py` file. If you are only running this experiment, the full alignment is not needed; you can simply run voice activity detection to mark the time the speech starts and the time it ends. The code needs `end_times` to trim out the silence, since silence may unnecessarily increase the amount of data loaded into the accent classifiers. All that being said, you can always write your own custom `data_loader.py` that works for your set-up.
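If it helps, here is a rough sketch of what a custom loader along those lines could look like. This is not the actual `data_loader.py` from the repo, just an illustration of the csv columns and folder layout described above; the function name, the `.npy` extension handling, and the `end_times` format (a dict of start/end frame indices, e.g. from a VAD) are assumptions you would adapt to your own set-up.

```python
# Hypothetical sketch only -- not the repo's data_loader.py.
import os
import numpy as np
import pandas as pd

def load_accent_probe_data(csv_path, data_path, rep_type, end_times=None):
    """Load per-utterance representations listed in the csv.

    csv_path  : csv with columns file_name, accent_label, duration
    data_path : root folder holding one sub-folder per representation type
    rep_type  : sub-folder name, e.g. "lstm_0"
    end_times : optional dict mapping file_name -> (speech_start, speech_end)
                frame indices (e.g. from a VAD), used to trim silence
    """
    meta = pd.read_csv(csv_path)
    features, labels = [], []
    for _, row in meta.iterrows():
        # Assumes file_name is stored without the .npy extension.
        rep_file = os.path.join(data_path, rep_type, row["file_name"] + ".npy")
        rep = np.load(rep_file)              # shape: (time_steps, feature_dim)
        if end_times is not None and row["file_name"] in end_times:
            start, end = end_times[row["file_name"]]
            rep = rep[start:end]             # drop silent frames
        features.append(rep)
        labels.append(row["accent_label"])
    return features, labels

# Example call with the folder layout described above:
# feats, labs = load_accent_probe_data("accent_probe.csv", "probe_data",
#                                      "lstm_0", end_times=my_vad_times)
```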
Regarding the hyperparameters, I will have to refer you to the paper for details. We did not find accent classification trends to be very hyperparameter-sensitive, so we picked `learning_rate = 1e-03` and `batch_size = 16` or `32`, depending on available GPU space.
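To make that concrete, the training set-up boils down to something like the snippet below. The optimizer choice (Adam), the linear probe head, and the dimensions are assumptions for illustration, not necessarily what the repo uses; only the learning rate and batch size values come from the discussion above.

```python
import torch

num_accents = 8     # placeholder: number of accent classes in your csv
feature_dim = 512   # placeholder: dimensionality of your stored representations

# A simple linear probe head for illustration; the actual classifier may differ.
classifier = torch.nn.Linear(feature_dim, num_accents)

# learning_rate = 1e-03 as mentioned above (Adam is an assumption here).
optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-3)

def make_loader(dataset, low_gpu_memory=False):
    # batch_size = 16 or 32, depending on available GPU space
    return torch.utils.data.DataLoader(
        dataset, batch_size=16 if low_gpu_memory else 32, shuffle=True
    )
```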
Hope that answers your questions.

Best,
Archiki