Open yuguochencuc opened 4 months ago
You can do that by composing some groupby's and append, e.g.
from lhotse.cut.set import CutSet, append_cuts
cuts = CutSet.from_file(...)
speakers = cuts.speakers
# One long cut per speaker (there are more efficient ways of doing this)
long_cuts_per_speaker = {
    s: append_cuts(cuts.filter(lambda c: c.supervisions[0].speaker == s))
    for s in speakers
}
Of course you can also add silence in between with cut.pad(cut.duration + pause_duration).append(another_cut), mix noises in with .mix(noise_cut), etc.
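As for the "more efficient ways" mentioned in the comment: filtering the whole set once per speaker rescans it every time, while a single pass that buckets by speaker avoids that. Below is a minimal sketch of just the grouping step. It uses a stand-in `Utt` dataclass instead of real lhotse cuts, and both `Utt` and `group_by_speaker` are hypothetical names made up for illustration, not part of lhotse.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Utt:
    """Stand-in for a cut (assumption): only the fields this sketch needs."""
    id: str
    speaker: str
    duration: float  # seconds


def group_by_speaker(utts: Iterable[Utt]) -> Dict[str, List[Utt]]:
    """Bucket utterances per speaker in one pass, preserving input order."""
    groups: Dict[str, List[Utt]] = defaultdict(list)
    for u in utts:
        groups[u.speaker].append(u)
    return dict(groups)


utts = [
    Utt("a-1", "spk_a", 2.5),
    Utt("b-1", "spk_b", 4.0),
    Utt("a-2", "spk_a", 3.0),
]
groups = group_by_speaker(utts)
```

With real cuts the key would presumably be c.supervisions[0].speaker, as in the snippet above, and each resulting list could then be passed to append_cuts.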
Thank you very much for your reply! May I ask whether I can control the length of each merged sentence? For example, if I need to merge several short sentences into segments of 10-20 s, should I use cut.duration to decide which cuts to merge?
---- Replied Message ---- From: Piotr. Date: 02/29/2024 23:54. To: lhotse-speech/lhotse. Cc: Guochen Yu, Author. Subject: Re: [lhotse-speech/lhotse] Function to merge short sentences into a long sentence (Issue #1293)
Yes. You can also look at the CutConcatenate transform, which now supports a max_duration argument.
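If you would rather control the merged length by hand using cut.duration, one option is a simple greedy pass: keep adding utterances to the current group until the next one would push it over the limit. The sketch below is only an illustration of that idea and makes no claim about how CutConcatenate works internally. `Utt` and `pack_by_duration` are hypothetical names, and `Utt` is a stand-in for a real cut that carries only an id and a duration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Utt:
    """Stand-in for a cut (assumption): only id and duration in seconds."""
    id: str
    duration: float


def pack_by_duration(
    utts: Iterable[Utt], max_duration: float = 20.0, gap: float = 0.0
) -> List[List[Utt]]:
    """Greedily group utterances so each group's total stays <= max_duration.

    `gap` is the silence (in seconds) counted between neighbours in a group.
    An utterance longer than max_duration ends up alone in its own group.
    """
    groups: List[List[Utt]] = []
    current: List[Utt] = []
    current_dur = 0.0
    for u in utts:
        extra = u.duration + (gap if current else 0.0)
        if current and current_dur + extra > max_duration:
            # Next utterance does not fit: close the group and start a new one.
            groups.append(current)
            current, current_dur = [], 0.0
            extra = u.duration
        current.append(u)
        current_dur += extra
    if current:
        groups.append(current)
    return groups


utts = [Utt(f"utt-{i}", d) for i, d in enumerate([6.0, 7.0, 5.0, 9.0, 4.0, 12.0])]
groups = pack_by_duration(utts, max_duration=20.0)
totals = [sum(u.duration for u in g) for g in groups]
# Enforce the lower bound of the 10-20 s range by dropping short groups.
kept = [g for g, t in zip(groups, totals) if t >= 10.0]
```

Each group in `kept` would then correspond to one list of cuts to hand to append_cuts, run per speaker if the merged sentences should stay single-speaker.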
Hello, I would like to ask whether lhotse has a function for concatenating multiple sentences in a CutSet that come from the same speaker, or that have similar ids, into one long sentence. I have extracted a jsonl.gz manifest and want to merge the sentences of speakers with similar ids into one long sentence.