To evaluate speaker diarization, I am trying to integrate a DER evaluation script, which requires the time stamps at which each active speaker is speaking, from start_time to end_time (as per the RTTM format).

As per my experiments: for a 60-second audio file, I get 75 similarities (the count depends on the number of partial utterances, also called wav splits), but consecutive partial utterances overlap one another. Suppose two overlapping partial utterances have different speaker labels from the similarity matrix (one is speaker_A and the other is speaker_B). How do we then get the exact time stamp (in seconds) at which the speaker change occurred?
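To make the setup concrete, here is a minimal sketch of one common convention: assign each overlapping window its predicted label, and when two consecutive windows disagree, place the speaker change at the midpoint of their overlap. The window length (`win=1.6` s) and hop (`hop=0.8` s) below are assumptions for illustration, not values from my pipeline, and `windows_to_segments` / `to_rttm` are hypothetical helper names.

```python
# Sketch: turn per-window speaker labels into RTTM-style segments.
# Assumption: window i covers [i*hop, i*hop + win] seconds.

def windows_to_segments(labels, win=1.6, hop=0.8):
    """Merge per-window labels into (speaker, start, end) segments.

    When two consecutive windows disagree, place the speaker change
    at the midpoint of their overlap, i.e. halfway between the start
    of the later window and the end of the earlier one.
    """
    segments = []
    start = 0.0
    for i in range(1, len(labels)):
        if labels[i] != labels[i - 1]:
            # midpoint of the overlap between window i-1 and window i
            change = (i * hop + (i - 1) * hop + win) / 2
            segments.append((labels[i - 1], start, change))
            start = change
    segments.append((labels[-1], start, (len(labels) - 1) * hop + win))
    return segments


def to_rttm(segments, file_id="audio"):
    """Format segments as RTTM SPEAKER lines (start time and duration)."""
    return "\n".join(
        f"SPEAKER {file_id} 1 {s:.2f} {e - s:.2f} <NA> <NA> {spk} <NA> <NA>"
        for spk, s, e in segments
    )


labels = ["speaker_A"] * 3 + ["speaker_B"] * 2
print(to_rttm(windows_to_segments(labels)))
```

With the assumed window/hop values, the A-to-B change above lands at 2.8 s (the midpoint of the overlap between windows 2 and 3). This is only one convention; another option is to resolve the overlap at the frame level before merging.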
Any help would be appreciated :)