About the training data

Hi, I am a newbie and I have two questions: 1) Is the path of the training data the "SRC_PATH" in generate_embeddings.py, where each directory name indicates the speaker_id? 2) Currently, I only use a short-utterance dataset such as LibriSpeech (utterances shorter than 10 s) for training. However, the paper uses two off-domain datasets for training, the 2000 NIST Speaker Recognition Evaluation and the ICSI Meeting Corpus, which both contain long recordings. I am wondering how to use them with this code. Thanks a lot!
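To make the second question more concrete, here is a minimal sketch of the kind of preprocessing I have in mind: cutting one long recording into short, fixed-length segments comparable to the utterances I train on now. This is not taken from the repository. The function name `split_into_segments` and its parameters are hypothetical, and it assumes the audio is already loaded as a 1-D sequence of samples containing a single speaker (for meeting data, the per-speaker regions would first have to be cut out using the corpus annotations).

```python
def split_into_segments(samples, sample_rate, segment_seconds=10.0, hop_seconds=None):
    """Cut a 1-D sample sequence into fixed-length segments.

    Hypothetical helper, not part of the repository.
    samples:         list or array of audio samples (single speaker assumed)
    sample_rate:     samples per second, e.g. 16000
    segment_seconds: length of each segment in seconds
    hop_seconds:     step between segment starts; defaults to segment_seconds
                     (no overlap). A smaller hop gives overlapping segments.
    A trailing remainder shorter than one segment is dropped.
    """
    seg_len = int(round(segment_seconds * sample_rate))
    hop = seg_len if hop_seconds is None else int(round(hop_seconds * sample_rate))
    if seg_len <= 0 or hop <= 0:
        raise ValueError("segment_seconds and hop_seconds must be positive")

    segments = []
    # Only start positions that leave room for a full segment are used.
    for start in range(0, len(samples) - seg_len + 1, hop):
        segments.append(samples[start:start + seg_len])
    return segments
```

Each resulting segment could then be saved under the directory of its speaker, if that is indeed the layout "SRC_PATH" expects. Whether this matches how the paper prepared the two corpora is exactly what I am unsure about.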

taylorlu / Speaker-Diarization
