Closed mohsen-goodarzi closed 4 months ago
Look into the following CutSet methods: trim_to_supervisions, trim_to_supervision_groups, trim_to_alignments.
For extending the cut there is extend_by; on a related note, there is also CutSet.merge_supervisions.
See if any of these help with your case.
Thanks for the fast reply.
This is the way I did it:
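desired_cut_len = 5  # target window length in seconds
# Window, merge supervisions per window, drop those starting before the window, trim.
tmp_cuts = (
    long_cuts.cut_into_windows(desired_cut_len)
    .merge_supervisions()
    .filter_supervisions(lambda s: s.start >= 0.0)
    .trim_to_supervisions()
)
short_cuts = []
for cut in tmp_cuts:
    # Extend the cut so that it covers its whole supervision.
    if cut.duration < cut.supervisions[0].duration:
        cut.duration = cut.supervisions[0].duration
    short_cuts.append(cut)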
There is an extend_by method in CutSet, but it didn't help me because in my case (i.e. extending cuts to cover their supervisions) the amount of extension is different for every cut.
Anyway, the above snippet did the job for me.
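For anyone who wants to see the idea without Lhotse objects, here is a minimal plain-Python sketch of the same logic. It is not Lhotse code: the function name word_aligned_segments and the (start, end) tuple representation are hypothetical, and it assumes the words do not overlap each other. Each word is assigned to the window in which it starts, and each segment runs from its first word's start to its last word's end, so the overrun past the nominal window end differs from segment to segment.

```python
def word_aligned_segments(words, window):
    """Group (start, end) word intervals into segments of roughly `window` seconds.

    A word belongs to the window in which it starts. Each segment spans from
    its first word's start to its last word's end, so no word is split or
    duplicated. Assumes the words do not overlap each other.
    """
    groups = {}
    for start, end in sorted(words):
        groups.setdefault(int(start // window), []).append((start, end))

    segments = []
    for index, group in sorted(groups.items()):
        seg_start, seg_end = group[0][0], group[-1][1]
        # How far the segment runs past the nominal end of its window.
        overrun = max(0.0, seg_end - (index + 1) * window)
        segments.append((seg_start, seg_end, round(overrun, 6)))
    return segments


# Hypothetical word alignments in seconds.
words = [(1.0, 1.5), (4.6, 5.4), (6.0, 6.5), (9.8, 10.3), (11.0, 11.2)]
segments = word_aligned_segments(words, window=5.0)
for seg_start, seg_end, overrun in segments:
    print(f"segment {seg_start:.1f}-{seg_end:.1f}, overrun {overrun:.1f}")
```

The overrun column is the reason a single fixed extension amount is not enough here: every segment needs its own value.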
I have a dataset that contains long audio files (around 1 hour each). I also have word-level transcriptions with time alignments, and I treat every word as a supervision segment. I want to cut the recordings into shorter segments of a specific length (e.g. 5 sec). I don't care if the resulting cuts are not exactly 5 sec long, as long as no cut boundary falls in the middle of a word. How can I do this with Lhotse?
If I call cuts.cut_into_windows(5, keep_excessive_supervisions=True), some supervision segments are duplicated in both adjacent cuts. If I call cuts.cut_into_windows(5, keep_excessive_supervisions=False), those supervision segments are lost! The only solution I came up with is to use the first option, then loop over the cuts, extend each cut's duration to cover its supervisions, and filter out supervisions with a negative start. Is this the best solution? Is there a built-in method to extend a cut to cover its supervisions? Any kind of help is appreciated.
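To make the trade-off concrete, here is a small plain-Python model of the two behaviours described above. It does not call Lhotse and is not its implementation: Word, windows, and assign are hypothetical names, and keep_partial only mimics the effect I observe from keep_excessive_supervisions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


def windows(total, size):
    """Yield (start, end) for consecutive fixed-size windows covering [0, total)."""
    start = 0.0
    while start < total:
        yield start, min(start + size, total)
        start += size


def assign(words, total, size, keep_partial):
    """Return the word texts kept in each window.

    keep_partial=True keeps every word that overlaps the window;
    keep_partial=False keeps only words that lie fully inside it.
    """
    result = []
    for w_start, w_end in windows(total, size):
        if keep_partial:
            kept = [w.text for w in words if w.start < w_end and w.end > w_start]
        else:
            kept = [w.text for w in words if w.start >= w_start and w.end <= w_end]
        result.append(kept)
    return result


# "wonderful" straddles the 5-second boundary.
words = [Word("hello", 1.0, 1.5), Word("wonderful", 4.6, 5.4), Word("world", 6.0, 6.5)]
with_partial = assign(words, total=10.0, size=5.0, keep_partial=True)
without_partial = assign(words, total=10.0, size=5.0, keep_partial=False)
print(with_partial)     # the straddling word shows up in both windows
print(without_partial)  # the straddling word shows up in neither window
```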