pyannote / pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
http://pyannote.github.io
MIT License

'SpeechActivityDetection' object has no attribute '_binarize' #385

Closed SamChen closed 4 years ago

SamChen commented 4 years ago

I am trying to load a tuned pipeline, following the provided tutorial, with the following code:

sad_scores="exp/base_models/sad/train/senior_exp.SpeakerDiarization.spk_diarization.train/validate_detection_fscore/senior_exp.SpeakerDiarization.spk_diarization.development/"

scd_scores="exp/base_models/scd/train/senior_exp.SpeakerDiarization.spk_diarization.train/validate_segmentation_fscore/senior_exp.SpeakerDiarization.spk_diarization.development/"

embedding="exp/base_models/emb/train/senior_exp.SpeakerDiarization.spk_diarization.train/validate_diarization_fscore/senior_exp.SpeakerDiarization.spk_diarization.development/"

method = "affinity_propagation"

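# Import the module that defines the SpeakerDiarization pipeline.
from pyannote.audio.pipeline import speaker_diarization

pipeline = speaker_diarization.SpeakerDiarization(sad_scores=sad_scores,
                                                  scd_scores=scd_scores,
                                                  embedding=embedding,
                                                  method=method)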

I have already fine-tuned the SAD, SCD, and embedding models by following the fine-tuning tutorial.

However, when I run it on a test audio file, I get the following error message:

AttributeError                            Traceback (most recent call last)
<ipython-input-16-0a6fc04a84e9> in <module>
----> 1 diarization = pipeline(test_file)

~/miniconda3/envs/pytorch/lib/python3.7/site-packages/pyannote/audio/pipeline/speaker_diarization.py in __call__(self, current_file)
    152 
    153         # segmentation into speech turns
--> 154         speech_turns = self.speech_turn_segmentation(current_file)
    155 
    156         # some files are only partially annotated and therefore one cannot

~/miniconda3/envs/pytorch/lib/python3.7/site-packages/pyannote/audio/pipeline/speech_turn_segmentation.py in __call__(self, current_file)
    121 
    122         # speech regions
--> 123         sad = self.speech_activity_detection(current_file).get_timeline()
    124 
    125         scd = self.speaker_change_detection(current_file)

~/miniconda3/envs/pytorch/lib/python3.7/site-packages/pyannote/audio/pipeline/speech_activity_detection.py in __call__(self, current_file)
    157             speech_prob = SlidingWindowFeature(data, sad_scores.sliding_window)
    158 
--> 159         speech = self._binarize.apply(speech_prob)
    160 
    161         speech.uri = current_file.get("uri", None)

~/miniconda3/envs/pytorch/lib/python3.7/site-packages/pyannote/pipeline/pipeline.py in __getattr__(self, name)
     86         msg = "'{}' object has no attribute '{}'".format(
     87             type(self).__name__, name)
---> 88         raise AttributeError(msg)
     89 
     90     def __setattr__(self, name, value):

AttributeError: 'SpeechActivityDetection' object has no attribute '_binarize' 

The code that I ran is:

test_file = {'uri': spkid, "annotation": "temp/temp.rttm", 'audio': selected_df.iloc[0].audio_path}
diarization = pipeline(test_file)

Can someone help me solve the problem?

marlon-br commented 4 years ago

Hi, were you able to address the issue?

SamChen commented 4 years ago

@marlon-br I could not solve this. I have switched to Kaldi instead.

hbredin commented 4 years ago

Somehow I missed this issue...

Pipelines have internal hyper-parameters (such as detection or clustering thresholds) that need to be tuned and instantiated before they can be used.
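To illustrate why an un-instantiated pipeline can fail with this kind of AttributeError, here is a minimal toy sketch. It does not use pyannote at all: `ToyPipeline`, its `threshold` hyper-parameter, and its `instantiate` method are hypothetical stand-ins, and the assumption (not confirmed by the traceback alone) is that a helper such as `_binarize` only gets created once hyper-parameter values are supplied.

```python
class ToyPipeline:
    """Hypothetical stand-in for a pipeline with one tunable hyper-parameter.

    This is NOT pyannote's implementation, only a sketch of the mechanism.
    """

    def __getattr__(self, name):
        # Python calls __getattr__ only when normal attribute lookup fails,
        # e.g. for `_binarize` before `instantiate` has created it.
        msg = "'{}' object has no attribute '{}'".format(
            type(self).__name__, name)
        raise AttributeError(msg)

    def instantiate(self, params):
        # Store the hyper-parameter value (assumed name: "threshold") ...
        self.threshold = params["threshold"]
        # ... and only now build the helper that depends on it.
        self._binarize = lambda scores: [s > self.threshold for s in scores]
        return self

    def __call__(self, scores):
        # Fails with AttributeError if `instantiate` was never called.
        return self._binarize(scores)


pipeline = ToyPipeline()
try:
    pipeline([0.2, 0.8])
except AttributeError as error:
    # Same message shape as in the traceback above.
    print(error)

# After supplying hyper-parameter values, the call goes through.
pipeline.instantiate({"threshold": 0.5})
print(pipeline([0.2, 0.8]))
```

In this sketch the first call fails because `_binarize` does not exist yet, and the second succeeds once `instantiate` has been given a value for the threshold.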

See this tutorial explaining how to tune and apply the diarization pipeline.

If you want to do it through the API, I recommend using the load_params method:

pipeline = SpeakerDiarization(...)
pipeline.load_params('/path/to/params.yml')
diarization = pipeline(test_file)

hbredin commented 4 years ago

Closing this issue. Please re-open if needed.