Open CamellIyquitous opened 1 month ago
Hello, could you tell me how to use the wavlm_large_finetune.pth model?
I tried to load it the same way as in the official WavLM example, but that does not work for this model. The official usage is shown below:

    from transformers import Wav2Vec2FeatureExtractor, WavLMForXVector
    from datasets import load_dataset
    import torch

    dataset = load_dataset("hf-internal-testing/librispeech_asr_demo", "clean", split="validation")
    feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained('microsoft/wavlm-base-sv')
    model = WavLMForXVector.from_pretrained('microsoft/wavlm-base-sv')

    # dataset[:2]["audio"] is a list of dicts, so take the "array" field of each item
    audio = [sample["array"] for sample in dataset[:2]["audio"]]
    # pad the two utterances to a common length so they can be batched
    inputs = feature_extractor(audio, padding=True, return_tensors="pt")
    embeddings = model(**inputs).embeddings
    embeddings = torch.nn.functional.normalize(embeddings, dim=-1).cpu()

    # compare the two speaker embeddings
    cosine_sim = torch.nn.CosineSimilarity(dim=-1)
    similarity = cosine_sim(embeddings[0], embeddings[1])
    threshold = 0.86  # the optimal threshold is dataset-dependent
    if similarity < threshold:
        print("Speakers are not the same!")
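I downloaded wavlm_large_finetune.pth to a local path and substituted that path into the official code, but the model could not be loaded. How should wavlm_large_finetune.pth be loaded and used correctly? Thank you!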
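A possible first diagnostic step, offered only as a sketch: `from_pretrained` expects a Hub model ID or a local directory in the layout written by `save_pretrained` (including a `config.json`), whereas a lone `.pth` file is usually an object serialized with `torch.save`. The two helpers below, `classify_model_path` and `list_checkpoint_keys`, are hypothetical names introduced for illustration and are not part of any library. The first only inspects the filesystem. The second assumes the file is a dictionary-style PyTorch checkpoint, which the issue does not confirm.

```python
from pathlib import Path


def classify_model_path(path):
    """Guess how a local model path is meant to be loaded (hypothetical helper)."""
    p = Path(path)
    if p.is_dir() and (p / "config.json").is_file():
        # Directory layout written by save_pretrained(); from_pretrained() can read it.
        return "transformers_directory"
    if p.is_file() and p.suffix in {".pth", ".pt", ".bin"}:
        # A single serialized file, usually produced by torch.save().
        return "raw_torch_checkpoint"
    return "unknown"


def list_checkpoint_keys(path, limit=10):
    """Return up to `limit` top-level keys of a torch.save() checkpoint.

    Assumption: the file holds a dict (for example a state dict). Anything
    else yields an empty list.
    """
    import torch  # imported lazily so classify_model_path works without PyTorch

    checkpoint = torch.load(path, map_location="cpu")
    if not isinstance(checkpoint, dict):
        return []
    return list(checkpoint.keys())[:limit]
```

If `classify_model_path("wavlm_large_finetune.pth")` reports `"raw_torch_checkpoint"`, the file cannot simply be passed to `WavLMForXVector.from_pretrained`. In that case the key names returned by `list_checkpoint_keys` may hint at which model class, from the repository that published the checkpoint, the weights belong to.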