inconsistency between code and technical report

The report states:

3.2. Diarization pipeline

To perform the diarization, each input recording is first split into speech segments according to the oracle VAD and the segments shorter than 0.1 s are discarded. From these segments, x-vectors are extracted every 0.25 s from overlapping sub-segments of 1.5 s (or less than 1.5 s for the last sub-segments or shorter segments). The x-vectors are centered, whitened and length normalized (Garcia-Romero and Espy-Wilson, 2011) (which is also done for the PLDA training data).

However, in predict.py: https://github.com/BUTSpeechFIT/VBx/blob/57466e6e245d5cdfe2e88ee6503702ace3ffdd03/VBx/predict.py#L168 i.e. only segments shorter than 0.01 s are discarded, not 0.1 s as the report says.

https://github.com/BUTSpeechFIT/VBx/blob/57466e6e245d5cdfe2e88ee6503702ace3ffdd03/VBx/predict.py#L89-L90 i.e. x-vectors are extracted every 0.24 s from overlapping sub-segments of 1.44 s, not every 0.25 s from sub-segments of 1.5 s.
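
To make the mismatch concrete, here is a small sketch comparing the two parameter sets. It is not taken from predict.py: the function names are hypothetical, the 10 ms frame shift (so that 144 and 24 frames give 1.44 s and 0.24 s) is my assumption, and the handling of the last, shorter sub-segment is a simplification of what the report describes.

```python
# Illustrative only: names and tail handling are assumptions, not VBx code.
FRAME_SHIFT_MS = 10  # assumed frame shift; gives 144 frames -> 1.44 s

# Values quoted in this issue, in milliseconds.
REPORT = {"min_segment_ms": 100, "window_ms": 1500, "hop_ms": 250}
CODE = {"min_segment_ms": 10, "window_ms": 1440, "hop_ms": 240}


def frames_to_ms(n_frames, frame_shift_ms=FRAME_SHIFT_MS):
    """Convert a frame count to milliseconds."""
    return n_frames * frame_shift_ms


def is_kept(duration_ms, min_segment_ms):
    """A segment is discarded if it is shorter than the minimum length."""
    return duration_ms >= min_segment_ms


def count_subsegments(duration_ms, window_ms, hop_ms):
    """Count sub-segments, including one shorter sub-segment at the end
    when the full windows do not reach the end of the segment."""
    if duration_ms <= window_ms:
        return 1
    n_full = (duration_ms - window_ms) // hop_ms + 1
    covered_ms = (n_full - 1) * hop_ms + window_ms
    return n_full + (1 if covered_ms < duration_ms else 0)


if __name__ == "__main__":
    for name, cfg in (("report", REPORT), ("code", CODE)):
        n = count_subsegments(10_000, cfg["window_ms"], cfg["hop_ms"])
        kept = is_kept(50, cfg["min_segment_ms"])
        print(f"{name}: {n} x-vectors for a 10 s segment, "
              f"50 ms segment kept: {kept}")
```

Under these assumptions the two settings give a different number of x-vectors per segment, and a 50 ms segment is dropped with the report's threshold but processed with the code's threshold.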
