Open Aijohc opened 2 months ago
Regarding the first issue, it looks like I haven't updated CutPairsSampler properly with the latest changes. I'll take a look.
Regarding the other question, you might want to use DynamicCutSampler or DynamicBucketingSampler instead; if you give them more than one CutSet, they act as CutPairsSampler (and support triples, quadruples, and so on as well). In fact, CutPairsSampler should be deprecated at this point.
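For anyone landing here, a minimal sketch of what that replacement could look like. The helper name `iter_paired_batches` and the default values are made up for illustration; the only Lhotse call it relies on is `DynamicCutSampler` with two CutSets passed positionally, as described above.

```python
def iter_paired_batches(source_cuts, target_cuts, max_duration=100.0, shuffle=False):
    """Yield (source_batch, target_batch) pairs from two aligned CutSets.

    Hypothetical helper, not part of Lhotse. It assumes the i-th cut in
    source_cuts corresponds to the i-th cut in target_cuts.
    """
    # Imported lazily so this module can be loaded without Lhotse installed.
    from lhotse.dataset import DynamicCutSampler

    # Passing more than one CutSet makes the sampler yield tuples of
    # mini-batches (one CutSet per input) instead of a single CutSet.
    sampler = DynamicCutSampler(
        source_cuts,
        target_cuts,
        max_duration=max_duration,  # illustrative value, in seconds
        shuffle=shuffle,
    )
    for source_batch, target_batch in sampler:
        yield source_batch, target_batch
```

With lazily opened manifests this would be called as `iter_paired_batches(source_cuts, target_cuts)`, where both arguments come from `CutSet.from_files()`. DynamicBucketingSampler takes multiple CutSets the same way, so `buffer_size` and `quadratic_duration` can be passed there when bucketing is needed.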
Thank you for your answer! So, does that mean DynamicCutSampler can completely replace CutPairsSampler? I will give it a try. Thanks!
Yes, it can.
Hello, and thank you for the excellent work with Lhotse's data management features!
I encountered a bug when using CutPairsSampler. When I load my source_cuts and target_cuts using CutSet.from_files() (with a list of .jsonl.gz files), the expected StopIteration exception is not raised correctly at the end of the dataset iteration. Instead, I encounter a different error:
CutPairsSampler
Cuts
lhotse version: 1.26.0
I believe this could be an issue with how the end of the dataset is handled when iterating over CutPairsSampler. Could you please investigate this? Thanks again for your hard work!
Additional Question:
I also have a question regarding the CutPairsSampler. Is it possible to specify parameters like buffer_size and quadratic_duration, similar to the DynamicBucketingSampler? These parameters are very important when working with the DynamicBucketingSampler, and I noticed they are not directly available in CutPairsSampler. Could you consider supporting such parameters? Thank you!