Closed picheny-nyu closed 3 years ago
This might be a duplicate of #20.
If not, can you please provide a minimal reproducible example?
Yes, it is the same issue. I tried to fix it by adding the lines:
# hypothesis processing
filled = hypothesis
coverage = filled.support()
between the lines
reference_partition = self._partition(filled, coverage)
hypothesis_partition = self._partition(hypothesis, coverage)
in the SegmentationCoverage class. I am not sure that is right, but at least it gave me better-looking values for purity. If I am wrong, can you explain why?
I find it difficult to help you without a minimal reproducible example.
Can you please provide one?
reference = Annotation()
reference[...] = ...
...
hypothesis = Annotation()
hypothesis[...] = ...
...
print(purity(reference, hypothesis))
print(coverage(reference, hypothesis))
I will, sorry. I have been busy with a couple of other things the last few days, but I should be able to provide this by Monday.
I have a similar problem here, and it would be great if you could help fix it:
Here is an example:
from pyannote.core import Annotation, Segment
from pyannote.metrics.segmentation import SegmentationPurity, SegmentationCoverage
ref = Annotation()
ref[Segment(0, 3)] = 'A'
hyp = Annotation()
hyp[Segment(2, 4)] = 'a'
purity = SegmentationPurity()
coverage = SegmentationCoverage()
print(coverage(ref, hyp))  # returns 1.0 but I expected 0.33
print(purity(ref, hyp))  # returns 1.0 but I expected 0.5
Thanks, Cathy
Did you have a look at issue #20?
There is a whole discussion there trying to explain the behavior of those metrics.
Thanks for your quick reply! Now I understand that segmentation purity and coverage can be applied to full partitions of the file. I'm still kind of confused about the purity and coverage calculation.
Let's take this for example:
from pyannote.core import Annotation, Segment
from pyannote.metrics.diarization import DiarizationPurity, DiarizationCoverage
reference = Annotation()
reference[Segment(0, 3)] = 'A'
reference[Segment(5, 7)] = 'B'
hypothesis = Annotation()
hypothesis[Segment(2, 4)] = 'a'
hypothesis[Segment(4, 7)] = 'b'
diarizationPurity = DiarizationPurity()
diarizationCoverage = DiarizationCoverage()
print(diarizationPurity(reference, hypothesis))
print(diarizationCoverage(reference, hypothesis))
I would expect the purity to be 3/5 = 0.6, where 3 is the intersection between ref and hyp and 5 is the total speech duration in hyp, and the coverage to be 3/5 = 0.6, where 3 is the intersection and 5 is the total speech duration in ref. But it returns 1.0 for both purity and coverage. Is this expected?
Got it. Thanks for your reply! Great project👍
The way diarization purity and coverage are implemented in pyannote.metrics makes them focus only on the "speech" regions common to both reference and hypothesis.
Therefore, it starts by removing the following regions from the evaluation...
- [0 --> 2] because there is no speech in hypothesis
- [3 --> 5] because there is no speech in reference
... and only then computes purity and coverage.
The main motivation is to not mix speech detection errors (for which pyannote.metrics.detection metrics should be used) and speaker confusion errors. I agree that there are other ways to compute purity and coverage and I'd likely consider a PR adding these alternative implementations to pyannote.metrics.
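To make the difference between the two readings concrete, here is a minimal pure-Python sketch. It is not the pyannote.metrics implementation: the helper names (overlap, purity, coverage) and the common_speech_only flag are hypothetical and exist only for this illustration. Annotations are modelled as lists of ((start, end), label) pairs, and segments within one annotation are assumed not to overlap each other. With common_speech_only=True the denominator is restricted to time where both reference and hypothesis contain speech, as described above. With common_speech_only=False the denominator is the full speech duration, which is the definition the question assumed.

```python
from collections import defaultdict


def overlap(a, b):
    """Duration of the intersection of two (start, end) intervals."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def purity(reference, hypothesis, common_speech_only=True):
    """Illustrative purity; NOT the pyannote.metrics implementation.

    Both arguments are lists of ((start, end), label) pairs. Segments
    within one annotation are assumed not to overlap each other, and the
    two annotations are assumed to share at least some speech.
    """
    # shared[hyp_label, ref_label] = time during which both labels are active
    shared = defaultdict(float)
    for ref_seg, ref_label in reference:
        for hyp_seg, hyp_label in hypothesis:
            shared[hyp_label, ref_label] += overlap(ref_seg, hyp_seg)

    # For each hypothesis label, keep its largest overlap with one reference label
    best = defaultdict(float)
    for (hyp_label, _), duration in shared.items():
        best[hyp_label] = max(best[hyp_label], duration)

    if common_speech_only:
        # Only time where both annotations contain speech is evaluated
        denominator = sum(shared.values())
    else:
        # Every second of hypothesis speech is evaluated
        denominator = sum(end - start for (start, end), _ in hypothesis)
    return sum(best.values()) / denominator


def coverage(reference, hypothesis, common_speech_only=True):
    """Coverage is purity with the roles of the two annotations swapped."""
    return purity(hypothesis, reference, common_speech_only)


reference = [((0, 3), 'A'), ((5, 7), 'B')]
hypothesis = [((2, 4), 'a'), ((4, 7), 'b')]

print(purity(reference, hypothesis))                              # 1.0
print(coverage(reference, hypothesis))                            # 1.0
print(purity(reference, hypothesis, common_speech_only=False))    # 0.6
print(coverage(reference, hypothesis, common_speech_only=False))  # 0.6
```

In this sketch, once [0 --> 2] and [3 --> 5] are excluded, only [2 --> 3] (A with a) and [5 --> 7] (B with b) remain, so each hypothesis label matches exactly one reference label and both values are 1.0. Counting the excluded regions in the denominator gives the 3/5 the question expected.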
Description
I have a situation in which the entire hypothesis is returned as a single segment. This results in both a purity of 1.0 and a coverage of 1.0, which does not seem right. If I understand the code correctly, in segmentation.py, when the method _partition(self, timeline, coverage) is executed, "coverage" is basically the reference labelling. So if timeline is a single segment, the call "return partition.crop(coverage, mode='intersection').relabel_tracks()" crops the timeline to exactly the reference segmentation, resulting in a purity of 1.0 and a coverage of 1.0. Maybe my understanding is faulty, but I could really use some help here.
Thanks, Michael Picheny