Provide example for inference in Python

manojpamk / pytorch_xvectors

Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196

MIT License


Hi, and many thanks for this nice work. I'm trying to integrate this code into my Python project to obtain embeddings from a given WAV file. From the source files I can easily see how you apply the network and extract the embeddings. However, the nnet3 egs format that is read as input has to be computed by Kaldi. Is there an option to preprocess the file with a pure Python library? Could you document the exact shape of the MFCCs that the models expect? That way I could implement the feature extraction with librosa or a similar tool.
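
To make the question concrete, here is roughly the kind of pure-Python front end I have in mind, written with NumPy only. It is just a sketch under assumptions: the 16 kHz sample rate, 30 coefficients, 30 mel bins, 25 ms window and 10 ms shift are my guesses and are not confirmed by the repository, and the function names (`mfcc`, `mel_filterbank`) are my own. It also does not reproduce Kaldi-specific details such as dithering, pre-emphasis or its default window, so the values will not match Kaldi's output exactly.

```python
import numpy as np


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(num_mels, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale from 0 Hz to Nyquist."""
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), num_mels + 2))
    fbank = np.zeros((num_mels, fft_freqs.size))
    for m in range(num_mels):
        left, center, right = edges[m], edges[m + 1], edges[m + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fbank[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fbank


def mfcc(signal, sample_rate=16000, num_ceps=30, num_mels=30,
         frame_ms=25.0, shift_ms=10.0):
    """Return MFCCs with shape (num_frames, num_ceps).

    All default values are assumptions for illustration, not the
    configuration the pretrained models were trained with.
    """
    signal = np.asarray(signal, dtype=float)
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    shift = int(round(sample_rate * shift_ms / 1000.0))
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")

    # Slice the waveform into overlapping frames and apply a Hamming window.
    num_frames = 1 + (signal.size - frame_len) // shift
    idx = np.arange(frame_len)[None, :] + shift * np.arange(num_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)

    # Power spectrum, zero-padded to the next power of two.
    n_fft = 1 << (frame_len - 1).bit_length()
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2

    # Log mel energies, floored to avoid log(0).
    fbank = mel_filterbank(num_mels, n_fft, sample_rate)
    log_mel = np.log(np.maximum(power @ fbank.T, 1e-10))

    # Unnormalized DCT-II over the mel axis, keeping the first num_ceps terms.
    n = np.arange(num_mels)
    k = np.arange(num_ceps)
    basis = np.cos(np.pi / num_mels * (n[None, :] + 0.5) * k[:, None])
    return log_mel @ basis.T


if __name__ == "__main__":
    # One second of a synthetic 440 Hz tone stands in for a real WAV file.
    t = np.arange(16000) / 16000.0
    feats = mfcc(np.sin(2.0 * np.pi * 440.0 * t))
    print(feats.shape)
```

With these assumed settings, one second of audio gives a `(98, 30)` array, one row per frame. What I'm missing is whether the models expect this layout, how many coefficients they use, and whether any mean normalization is applied before the features reach the network.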

Thank you in advance

Provide example for inference in Python #7