k2-fsa / icefall

https://k2-fsa.github.io/icefall/
Apache License 2.0

`cut_id` does not match original `utt_id` #620

Closed wgb14 closed 11 months ago

wgb14 commented 2 years ago

Following https://github.com/k2-fsa/icefall/pull/522#issuecomment-1207366968, we tried to introduce utt_id into the recognition results in the decoding script, to make comparison easier: https://github.com/k2-fsa/icefall/blob/a66e74b92f2f41855e0422420dfb06ebe7c6889f/egs/librispeech/ASR/pruned_transducer_stateless5/decode.py#L567 However, cut_id is not the original utt_id from the dataset. In particular, after cut_set.trim_to_supervisions(), cut_id becomes a random value. In my experiments, it should be:

utt_ids = [cut.supervisions[0].id for cut in batch["supervisions"]["cut"]]

pzelasko commented 2 years ago

Good point, maybe we should change trim to supervisions behavior in Lhotse to adopt the supervision ID instead. Could you make a PR?

wgb14 commented 2 years ago

> Good point, maybe we should change trim to supervisions behavior in Lhotse to adopt the supervision ID instead. Could you make a PR?

Agreed. I'll open a PR for this.

But this won't help recipes like librispeech, since they do not call cut_set.trim_to_supervisions().
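
For illustration only, here is a minimal sketch of a decode-side workaround along the lines of the one-liner above: take the supervision ID when a cut carries exactly one supervision, and fall back to the cut ID otherwise. The `Supervision` and `Cut` classes are hypothetical stand-ins so the snippet runs without Lhotse, and `utt_id_for` and `utt_ids_for_batch` are made-up helper names, not part of icefall or Lhotse. Whether the fallback is the right choice for multi-supervision cuts is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Supervision:
    """Hypothetical stand-in for a Lhotse supervision; only `id` is modeled."""
    id: str


@dataclass
class Cut:
    """Hypothetical stand-in for a Lhotse cut; only `id` and `supervisions` are modeled."""
    id: str
    supervisions: List[Supervision] = field(default_factory=list)


def utt_id_for(cut: Cut) -> str:
    """Return the supervision ID if the cut has exactly one supervision.

    Otherwise fall back to the cut ID (an assumption made for this sketch).
    """
    if len(cut.supervisions) == 1:
        return cut.supervisions[0].id
    return cut.id


def utt_ids_for_batch(batch: Dict) -> List[str]:
    """Collect one ID per cut, using the batch layout from the comment above."""
    return [utt_id_for(cut) for cut in batch["supervisions"]["cut"]]


# Example batch: the first cut mimics a trimmed cut whose own ID is arbitrary,
# the second has two supervisions and so keeps its cut ID.
example_batch = {
    "supervisions": {
        "cut": [
            Cut("arbitrary-cut-id", [Supervision("utt-1")]),
            Cut("cut-2", [Supervision("utt-2a"), Supervision("utt-2b")]),
        ]
    }
}
example_ids = utt_ids_for_batch(example_batch)
```

With real Lhotse cuts the same attribute access (`cut.supervisions[0].id`, `cut.id`) is what the one-liner earlier in the thread relies on; the dataclasses here only mimic that shape.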