Open sanena opened 2 years ago
Sure, here is the sample code:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# the variable spk is the list of speaker labels for the embeddings, for example: ['p363-syn', 'p363-ori', 'p363-syn', ...],
# in the same order as the variable embeds
# create a spk2int mapping dictionary (sorted so each speaker gets the same color on every run)
mapping = {v: str(i) for i, v in enumerate(sorted(set(spk)))}
# the variable embeds below is the list of embeddings from the synthesized utterances, while org_embeds are from the genuine utterances
tsne_embed_mix = TSNE(n_components=2).fit_transform(np.concatenate([embeds, org_embeds]))
fig = plt.figure(figsize=(8, 8))
ax = fig.add_subplot(1, 1, 1)
# widen the x-axis on the left by 25 units (this does not depend on the speaker, so set it once)
ax.set_xlim(np.min(tsne_embed_mix[:, 0]) - 25, np.max(tsne_embed_mix[:, 0]))
for i in mapping.keys():
    indexes = np.where(np.array(spk) == i)[0]
    # since we just concatenate embeds and org_embeds, and the two share the same order of spk labels,
    # the org_indexes for spk i are the same indexes shifted by len(spk)
    org_indexes = indexes + len(spk)
    # same color per speaker; 'x' marks synthesized utterances, '^' marks genuine ones
    ax.scatter(tsne_embed_mix[indexes, 0], tsne_embed_mix[indexes, 1], c='C' + mapping[i], s=15, label=f'{i} (synthesized)', marker='x')
    ax.scatter(tsne_embed_mix[org_indexes, 0], tsne_embed_mix[org_indexes, 1], c='C' + mapping[i], s=15, label=f'{i} (genuine)', marker='^')
ax.legend()
ax.grid()
plt.show()
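In case the index bookkeeping is the confusing part, here is a minimal sketch of it in plain Python, without t-SNE or plotting. The helper name `paired_indexes` is made up for illustration and is not part of the code above. It assumes, as the comments above do, that the genuine embeddings are stacked directly after the synthesized ones and follow the same speaker order.

```python
from collections import defaultdict


def paired_indexes(spk, offset=None):
    """Map each speaker label to (synthesized_rows, genuine_rows).

    Assumption: the stacked array is [embeds; org_embeds], and both halves
    follow the order of `spk`, so genuine rows are shifted by len(spk).
    """
    if offset is None:
        offset = len(spk)
    rows = defaultdict(list)
    for row, label in enumerate(spk):
        rows[label].append(row)
    return {
        label: (idx, [r + offset for r in idx])
        for label, idx in rows.items()
    }


# Toy labels, not real data: three synthesized utterances from two speakers.
spk = ['p363', 'p364', 'p363']
pairs = paired_indexes(spk)
for label, (synth_rows, genuine_rows) in pairs.items():
    print(label, synth_rows, genuine_rows)
```

The returned lists can be used directly to index the rows of the 2-D t-SNE output, the same way `indexes` and `org_indexes` are used in the scatter calls.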
Hi, could you tell me how to produce the speaker embedding visualization with t-SNE?