Closed by zahrakhanjani128 6 months ago
Hi, files like PA_E_1035160.flac and PA_E_1018196.flac were replayed and recorded with different devices and in different rooms. The speech content in the two files is therefore the same, but the recordings are not identical. You can verify this by comparing the waveform values.
The metadata can be found at https://github.com/asvspoof-challenge/2021#evaluation-tools-using-the-full-set-of-keys-and-meta-labels
For the two files above:
...
PA_0022 PA_E_1035160 R6 M1 d4 r4 m1 s4 c4 spoof notrim hidden
...
PA_0022 PA_E_1018196 R2 M2 d4 r2 m2 s3 c3 spoof notrim hidden
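To see at a glance how the two metadata rows differ, here is a minimal sketch that compares them field by field. The helper names `parse_trial` and `differing_fields` are hypothetical, and the sketch assumes only that the rows are whitespace-separated; it compares fields by position and makes no claim about what each column means.

```python
def parse_trial(line):
    """Split one whitespace-separated metadata row into its fields."""
    return line.split()


def differing_fields(row_a, row_b):
    """Return (position, value_a, value_b) for every field that differs."""
    fields_a = parse_trial(row_a)
    fields_b = parse_trial(row_b)
    return [
        (index, a, b)
        for index, (a, b) in enumerate(zip(fields_a, fields_b))
        if a != b
    ]


# The two rows quoted above, copied verbatim.
row_a = "PA_0022 PA_E_1035160 R6 M1 d4 r4 m1 s4 c4 spoof notrim hidden"
row_b = "PA_0022 PA_E_1018196 R2 M2 d4 r2 m2 s3 c3 spoof notrim hidden"

for index, a, b in differing_fields(row_a, row_b):
    print(f"field {index}: {a} vs {b}")
```

Fields that are not printed, such as the first one (PA_0022) and the last three, are identical in both rows.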
Many thanks for the clarification.
Hello, we noticed that some samples in the dataset look duplicated. Are they exact copies of each other? For example, PA_E_1035160.flac and PA_E_1018196.flac, among others. Did you use some form of oversampling? Is there a specific reason for these duplicates? Thank you!