resemble-ai / Resemblyzer

A python package to analyze and compare voices with deep learning
Apache License 2.0

Embedding is mostly zero #15

Closed: dodobyte closed this issue 4 years ago

dodobyte commented 4 years ago

I plotted the embedding vector and it's mostly zeros. Is this expected?

I want to use the embedding in another project. I also plotted their example embeddings and those seem to be distributed significantly better.

Here's the test code:

from pathlib import Path
from resemblyzer import VoiceEncoder, preprocess_wav
import numpy as np, matplotlib.pyplot as plt

wav = preprocess_wav(Path("367-130732-0005.flac"))

encoder = VoiceEncoder()
embed = encoder.embed_utterance(wav)

plt.plot(embed, 'bo')
plt.show()

Am I doing something wrong? Thanks.

CorentinJ commented 4 years ago

Yes, this is expected: the embeddings are sparse because of the ReLU at the end of the model. That doesn't make them worse, although I did remove that ReLU in the development branch I'm working on. Don't worry about it.
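
To illustrate why a final ReLU produces mostly-zero embeddings without hurting comparisons, here is a rough NumPy-only sketch. It does not use Resemblyzer's actual model: `relu_l2_embed` and `zero_fraction` are hypothetical helpers, the 256-dimensional size and the ReLU-then-L2-normalise step are assumptions, and the "speakers" are just random vectors shifted negative so that most values get clipped.

```python
import numpy as np

def relu_l2_embed(raw):
    """Hypothetical stand-in for a model's last step: ReLU, then L2-normalise."""
    activated = np.maximum(raw, 0.0)  # negative values are clipped to exactly zero
    return activated / np.linalg.norm(activated)

def zero_fraction(embed, tol=1e-8):
    """Fraction of entries that are (numerically) zero."""
    return float(np.mean(np.abs(embed) <= tol))

rng = np.random.default_rng(0)
dim = 256  # assumed embedding size

# Fake pre-activation vectors for two "speakers", shifted negative so that
# the ReLU zeroes out most dimensions.
speaker_a = rng.normal(size=dim) - 1.0
speaker_b = rng.normal(size=dim) - 1.0

# Two noisy "utterances" of speaker A and one of speaker B.
embed_a1 = relu_l2_embed(speaker_a + 0.1 * rng.normal(size=dim))
embed_a2 = relu_l2_embed(speaker_a + 0.1 * rng.normal(size=dim))
embed_b = relu_l2_embed(speaker_b)

# For unit-length vectors, the dot product is the cosine similarity.
same = float(embed_a1 @ embed_a2)
diff = float(embed_a1 @ embed_b)

print(f"zero fraction: {zero_fraction(embed_a1):.2f}")
print(f"same-speaker similarity: {same:.3f}")
print(f"different-speaker similarity: {diff:.3f}")
```

In this toy setup the vectors are sparse and non-negative, yet the same-speaker pair still scores higher than the different-speaker pair. Assuming `embed_utterance` returns a NumPy array, the same `zero_fraction(embed)` check could be applied to the `embed` from the test code above to measure how sparse a real embedding is.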