SSAST for embedding model

Hi Yuan,

We're reaching out again regarding our Bachelor's thesis on speaker recognition. We're having trouble implementing SSAST as an embedding model trained with a contrastive objective such as triplet loss. Since speaker recognition is an open-set problem, where the set of speakers isn't known in advance, the model has to output an embedding rather than class scores, so we need to choose a suitable embedding dimension. We're also considering adjusting the multilayer perceptron (MLP) head to accommodate this. During your work on SSAST, did you come across any insights that could lead to recommendations for us?

Thanks,
Andrin
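
P.S. To make the question concrete, here is a minimal sketch of the objective we have in mind, written in plain Python and independent of the SSAST code. The function names, the margin of 0.2, and the toy 4-dimensional vectors are our own assumptions for illustration only. In practice the embeddings would come from the modified MLP head, and the loss would probably be PyTorch's `torch.nn.TripletMarginLoss`.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so distances depend only on direction."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Compute max(0, d(a, p) - d(a, n) + margin) on L2-normalized embeddings."""
    a, p, n = (l2_normalize(v) for v in (anchor, positive, negative))
    return max(0.0, euclidean(a, p) - euclidean(a, n) + margin)


# Toy 4-dimensional embeddings (made-up numbers, not model outputs).
anchor = [1.0, 0.0, 0.0, 0.0]
positive = [0.9, 0.1, 0.0, 0.0]  # stands in for the same speaker as the anchor
negative = [0.0, 1.0, 0.0, 0.0]  # stands in for a different speaker

# Negative is already much farther than the positive, so the loss is zero.
easy = triplet_loss(anchor, positive, negative)
# Swapping the roles violates the margin, so the loss is positive.
hard = triplet_loss(anchor, negative, positive)
print(easy, hard)
```

The embedding dimension we are asking about would be the length of these vectors. Nothing in the loss itself fixes that length, which is why we are unsure how to choose it.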

YuanGongND / ssast

SSAST for embedding model #34