nii-yamagishilab / PartialSpoof

BSD 3-Clause "New" or "Revised" License

Absolute time label of fake span #2

Closed ductuantruong closed 8 months ago

ductuantruong commented 9 months ago

Hi,

Thank you for sharing your amazing work. I am trying to use it in my research. I looked at your data and saw that the fake labels are at the segment level. May I ask whether you also have absolute time labels for the fake spans (e.g., from 1.2 seconds to 2.8 seconds) in each utterance?

Thank you for your support!

zlin0 commented 8 months ago

@ductuantruong Sorry for the late reply. I've been extremely busy recently >< Are you referring to database_vad.tar.gz? After uncompressing it, you will see:

===./database/vad/{train,dev,eval} ===
This vad folder contains the timestamp annotation for each set. 
For each <uttid>.vad file, the format of each line is: 
<start_time> <end_time> <label>

<start_time> and <end_time> are in seconds.
<label> includes: '0' for spoof, '1' for bona fide, and '2' for non-speech

Please let me know if this answers your question or if there is any specific annotation you need. Thanks!
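
For anyone who wants the fake spans as absolute times, the format described above can be read with a few lines of Python. This is only a minimal sketch, not code from the repository: it assumes the three fields are whitespace-separated and that lines are sorted by time, and the function names `parse_vad` and `spoof_spans` as well as the sample contents are made up for illustration.

```python
# Label codes as described in the reply above.
SPOOF, BONAFIDE, NONSPEECH = "0", "1", "2"


def parse_vad(text):
    """Parse the contents of one <uttid>.vad file into (start, end, label) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        start, end, label = parts
        rows.append((float(start), float(end), label))
    return rows


def spoof_spans(rows, tol=1e-6):
    """Return (start, end) spans labeled spoof, merging back-to-back spoof rows."""
    spans = []
    for start, end, label in rows:
        if label != SPOOF:
            continue
        if spans and abs(spans[-1][1] - start) < tol:
            # This row starts where the previous spoof span ended: extend it.
            spans[-1] = (spans[-1][0], end)
        else:
            spans.append((start, end))
    return spans


# Made-up file contents, for illustration only (not taken from the dataset).
sample = "0.00 1.20 1\n1.20 2.00 0\n2.00 2.80 0\n2.80 3.10 2\n"
print(spoof_spans(parse_vad(sample)))  # [(1.2, 2.8)]
```

For a real file, the text could be read with something like `open(path).read()` and passed to `parse_vad`.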

ductuantruong commented 8 months ago

Thank you for your reply and for sharing this amazing work! No worries about the late response. I have found the timestamp labels with your guidance.

zlin0 commented 7 months ago

@ductuantruong Hi, I just realized that your question was about the timestamp annotations. To avoid any confusion, I want to clarify that I have provided two types of timestamp annotations, which differ in how nonspeech regions are categorized. There are three types of nonspeech to consider: (i) nonspeech from bona fide sources, (ii) nonspeech from spoofed sources, and (iii) the concatenated part (the nonspeech used for overlap-add).

Based on this, the annotations are as follows:

  1. spoof, bona fide: In this type (used in all my currently published papers), I treat (i) as bona fide, and (ii) and (iii) as spoof. Files: database_segment_labels_v1.2.tar.gz, PS_data.tar.gz

  2. spoof, bona fide, nonspeech: Here nonspeech includes (i), (ii), and (iii). File: database_vad.tar.gz
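
The relation between the two annotation types can be written down as a small lookup. This is a sketch of the mapping as described in this thread, not code from the repository. The thread does not say whether the nonspeech type (i, ii, or iii) can be recovered from the released files, so it is passed in explicitly here; the names `to_two_class`, `"bonafide_source"`, `"spoof_source"`, and `"concatenation"` are hypothetical.

```python
# How each nonspeech type is treated in the two-class annotation (type 1 above).
NONSPEECH_TO_TWO_CLASS = {
    "bonafide_source": "bonafide",  # (i) nonspeech from bona fide sources
    "spoof_source": "spoof",        # (ii) nonspeech from spoofed sources
    "concatenation": "spoof",       # (iii) nonspeech used for overlap-add
}


def to_two_class(label, nonspeech_kind=None):
    """Map a three-class label to the two-class scheme described in the thread."""
    if label in ("spoof", "bonafide"):
        return label  # speech regions keep their label
    if label == "nonspeech":
        # Raises KeyError if the nonspeech type is missing or unknown.
        return NONSPEECH_TO_TWO_CLASS[nonspeech_kind]
    raise ValueError(f"unknown label: {label!r}")


print(to_two_class("nonspeech", "concatenation"))  # spoof
```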