Closed xiangzai0115 closed 1 year ago
Sure! Take any example from the repository that uses the Inference class (e.g. [1], [2]) and add the option skip_conversion=True. For example:
from pyannote.audio import Inference, Model

model = Model.from_pretrained("powerset_pretrained.ckpt")
# skip_conversion=True keeps the raw powerset output (no multilabel conversion)
inf = Inference(model, step=2.5, skip_conversion=True)
result = inf(my_audio_file)
result
This would give you something like this as output.
Note that this is the output of the LogSoftmax layer. To obtain "probabilities":
import numpy as np
result.data = np.exp(result.data)
result
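As a sanity check, the exponentiated LogSoftmax outputs should sum to 1 over classes at each frame. A minimal NumPy sketch, using dummy data in place of the real result.data (shapes and values here are illustrative assumptions, not actual Inference output):

import numpy as np

# Dummy log-probabilities shaped like result.data: (num_frames, num_classes).
rng = np.random.default_rng(0)
logits = rng.normal(size=(5, 7))
log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))  # log-softmax

probs = np.exp(log_probs)  # same conversion as result.data = np.exp(result.data)
assert np.allclose(probs.sum(axis=-1), 1.0)  # each frame's probabilities sum to 1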
Finally, if you want the logits, you can remove the LogSoftmax with

import torch
model.activation = torch.nn.Identity()

and run the Inference as usual, with skip_conversion=True.
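To see why this works: torch.nn.Identity simply passes its input through unchanged, so substituting it for the LogSoftmax leaves the raw logits untouched. A minimal sketch with a random tensor standing in for the model's pre-activation output (an assumption for illustration, not the real model):

import torch

# The final activation normally applied by the model
log_softmax = torch.nn.LogSoftmax(dim=-1)
# Replacing it with Identity disables the normalization
identity = torch.nn.Identity()

x = torch.randn(3, 4)  # stand-in for pre-activation model output
assert torch.equal(identity(x), x)           # Identity returns logits unchanged
assert not torch.equal(log_softmax(x), x)    # LogSoftmax would have transformed them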
Cool, that's wonderful!!
Thanks!
I'm closing the issue for now, but please do reopen it if any details are missing or if you encounter problems.
Hi,
Thanks for this amazing work! Is there any way to get speaker posteriors from a local EEND model?
Cheers, Xiang