Closed C-ra-zy-97 closed 1 year ago
That is quite strange. I have not used the callhome1_spk2 set, but with callhome1 I obtained the attached files at this step in the recipe. You can perhaps use them and check what you obtain. Besides, I have not calculated the overlap ratio for any of the sets, but rather the percentage of time in the recordings when there is overlap, as stated in Table 1. Also, I used callhome1 (all amounts of speakers, not only 2). Still, I would expect the overlap ratios between the real and estimated sets to be similar.
diff_spk_overlap.txt diff_spk_pause.txt diff_spk_pause_vs_overlap.txt newspk_samespk_pause_distribution_overlap_distribution.txt overlaps_info.txt same_spk_pause.txt
Thanks for the very quick reply. There are indeed some differences. I have attached the related files below; could you help me debug this? If you do not have time, could you send me the rttm file for CALLHOME? Then I can check whether there is some difference. callhome1_spkall_rttm.txt diff_spk_overlap.txt diff_spk_pause_vs_overlap.txt diff_spk_pause.txt newspk_samespk_pause_distribution_overlap_distribution.txt overlaps_info.txt same_spk_pause.txt
I am afraid I cannot share the rttm because it does not have a free license :/ You could run the whole data generation pipeline using these statistics, but I understand that takes time. My recommendation is to plot your files and mine as distributions so that you can get an idea of whether they differ too much.
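A comparison like the one suggested above could be sketched as follows: load two single-column text files of pause/overlap lengths (the attachments have one value per line) and overlay their histograms. This is only an illustrative sketch; the file names are placeholders for your files and the attached reference ones.

```python
# Hypothetical sketch: overlay the distributions from two single-column
# text files (one value per line) to see whether they differ too much.
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

def compare_distributions(path_a, path_b, out_png="comparison.png"):
    a = np.loadtxt(path_a).ravel()
    b = np.loadtxt(path_b).ravel()
    plt.hist(a, bins=50, density=True, alpha=0.5, label="mine")
    plt.hist(b, bins=50, density=True, alpha=0.5, label="reference")
    plt.xlabel("length (s)")
    plt.legend()
    plt.savefig(out_png)

# Example (placeholder file names):
# compare_distributions("same_spk_pause.txt", "same_spk_pause_reference.txt")
```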
Actually, the CALLHOME dataset itself is not public, but the rttm file is and can be downloaded from http://www.openslr.org/resources/10/sre2000-key.tar.gz. Besides, I have provided my callhome1 rttm file above. Can you help me check it?
Sorry, I did not notice you had shared the rttms in the previous message. I have run a diff between the rttms I used and these. Besides trailing 0's, the differences in timings are only a few segments due to rounding (see the attached picture), and this line
SPEAKER iait 1 350.72 0.00 <NA> <NA> iait_A <NA>
was in your file, but it has zero length, so it should not matter.
Overall, the rttms are the same
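A comparison like the one described above could be sketched like this: parse both RTTM files, normalize the number formatting (trailing zeros), drop zero-length segments, and report segments present in only one file. The function names are illustrative; field positions follow the standard RTTM layout (type, file, channel, onset, duration, ..., speaker, ...).

```python
# Hypothetical sketch: compare two RTTM files ignoring formatting-only
# differences (trailing zeros) and zero-length segments.
def normalized_segments(path, decimals=2):
    """Return a set of (file, speaker, onset, duration) tuples."""
    segs = set()
    with open(path) as fh:
        for line in fh:
            f = line.split()
            onset = round(float(f[3]), decimals)
            dur = round(float(f[4]), decimals)
            if dur == 0.0:
                continue  # e.g. "SPEAKER iait 1 350.72 0.00 ..." is ignored
            segs.add((f[1], f[7], onset, dur))
    return segs

def rttm_diff(path_a, path_b):
    """Segments only in a, and segments only in b."""
    a, b = normalized_segments(path_a), normalized_segments(path_b)
    return a - b, b - a
```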
The distribution: [plot attached]
By the way, can you share the code that you used to calculate the overlap ratio in the paper?
As I said, it is NOT the overlap ratio but the percentage of overlap over the length of the file. The code is here: https://github.com/BUTSpeechFIT/diarization_utils/blob/main/compute_stats.py You would need to sum the percentages for the categories of 2, 3 and 4-or-more simultaneous speakers to get the "overlap" I report in the table.
I hope this helps
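The computation described above could be sketched as follows: discretize the RTTM segments into short frames, count how many speakers are active in each frame, and report the percentage of speech-covered time with 2, 3, and 4-or-more simultaneous speakers; their sum is the "overlap" figure. This is a simplified illustration, not the linked compute_stats.py itself, and all names are assumptions.

```python
# Hypothetical sketch: percentage of time with 2, 3, and 4+ simultaneous
# speakers, computed over 10 ms frames from (start, duration) segments.
from collections import defaultdict

def overlap_percentages(segments, frame=0.01):
    """segments: list of (start, duration) tuples in seconds."""
    counts = defaultdict(int)  # frame index -> number of active speakers
    for start, dur in segments:
        first = int(round(start / frame))
        last = int(round((start + dur) / frame))
        for f in range(first, last):
            counts[f] += 1
    total = (max(counts) + 1) if counts else 0  # frames up to last speech
    if total == 0:
        return {"2": 0.0, "3": 0.0, "4+": 0.0, "overlap": 0.0}
    two = sum(1 for c in counts.values() if c == 2)
    three = sum(1 for c in counts.values() if c == 3)
    four = sum(1 for c in counts.values() if c >= 4)
    pct = lambda n: 100.0 * n / total
    return {"2": pct(two), "3": pct(three), "4+": pct(four),
            "overlap": pct(two + three + four)}

# Two speakers overlapping from 1.0 s to 2.0 s over a 3 s conversation
# gives roughly 33% overlap:
stats = overlap_percentages([(0.0, 2.0), (1.0, 2.0)])
```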
Thank you so much. I'm using your statistics to generate data. I'll let you know the results once I'm done generating them
When I use your statistics, the overlap ratio of the simulated data is correct. Pretty strange.
So basically the difference is between using Callhome part 1 or only the 2-speaker files of Callhome part 1, correct? I do not remember having analyzed the statistics of those two sets, but maybe they differ substantially (even though one is a subset of the other).
Sorry for the conclusion I jumped to the other day. Today, I repeated all experiments from scratch. Actually, the simulated datasets generated from the call1_spkall and call1_spk2 statistics are similar. I deleted all intermediate results and regenerated all the files, and the final result is correct. The other day, I ran into some errors in specific steps and re-ran them many times; there may have been some problem in that process. Anyway, thank you very much for your prompt reply. Great work, thanks for contributing the code!
Hi fnlandini, why are there negative pauses in same_spk_pause.txt? It is weird. I also noticed that the code allows negative intervals between segments of the same speaker. (https://github.com/BUTSpeechFIT/EEND_dataprep/files/10909638/same_spk_pause.txt)
Hi @someonefighting Yes, there should not be negative pauses there. I suspect there could be some error in the annotations or the code, but it would need a more careful analysis. I might be able to look into it at some point, but not right now. You are right that the code of conv_generator could be updated to discard negative values.
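The fix discussed above could be sketched like this: when collecting pauses between consecutive segments of the same speaker, discard negative values, which indicate overlapping annotations for that speaker. This is an illustrative sketch, not the actual conv_generator code.

```python
# Hypothetical sketch: collect same-speaker pauses, skipping negative ones
# (negative pauses mean two annotated segments of one speaker overlap).
def same_speaker_pauses(segments):
    """segments: list of (start, duration) tuples for a single speaker."""
    segments = sorted(segments)
    pauses = []
    for (s1, d1), (s2, _) in zip(segments, segments[1:]):
        pause = s2 - (s1 + d1)
        if pause < 0:
            continue  # overlapping same-speaker segments: do not record
        pauses.append(pause)
    return pauses

# Example: the second and third segments overlap, so only the first
# gap (0.5 s) is kept:
same_speaker_pauses([(0.0, 1.0), (1.5, 1.0), (2.0, 0.5)])  # -> [0.5]
```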
Anyway, thanks for your great job!
Recently, I used the v1 recipe to generate simulated data from SRE & Switchboard, using callhome1_spk2 to estimate the statistics (overlap ratio: 13.529%). However, after the simulation, I found that the overlap ratio of the simulated dataset is 20.134%. How should I debug this problem?