Fix for small segments

Pranjalya commented 7 months ago

Patch

Fix for small segments, when the audio duration is less than max_seg_len
Fallback for generate_segment_batched in case seq_len and seq_metadata are not provided

BBC-Esq commented 6 months ago

I like it!

Sembiance commented 5 months ago

Great fix. Without it, WhisperS2T is useless for short-duration audio.

HIGHLY recommend merging this pull request :)

shashikg commented 4 months ago

Hi @Pranjalya @Sembiance! Can you describe the problem here, or link an issue related to short-duration audio?

Pranjalya commented 2 months ago

Hey @shashikg, the issue was in the loop that splits the audio into segments, in the case where the original audio's duration is < 1s. Setting the end timestamp to int(audio_duration) truncates it to 0, and range() with an end of 0 produces an empty range, so no segment is generated. Using math.ceil instead rounds the duration up to the next integer, so the segment's timestamps are recorded. The bug is also potentially dangerous if someone uses indexing to map the audio segments, because parts of the audio go missing.
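
To make the truncation problem concrete, here is a minimal sketch of the behaviour described above. It is not the actual WhisperS2T code: the function names (segment_starts_truncated, segment_starts_ceil, segment_bounds) are hypothetical, and the sketch assumes max_seg_len is an integer number of seconds used as the step of the range.

```python
import math


def segment_starts_truncated(audio_duration, max_seg_len):
    # Hypothetical illustration of the buggy pattern: int() truncates
    # toward zero, so any duration below 1s gives an end of 0.
    return list(range(0, int(audio_duration), max_seg_len))


def segment_starts_ceil(audio_duration, max_seg_len):
    # Hypothetical illustration of the fix: math.ceil() rounds up, so a
    # trailing fraction of a second still falls inside the range.
    return list(range(0, math.ceil(audio_duration), max_seg_len))


def segment_bounds(audio_duration, max_seg_len):
    # Pair each start with an end clipped to the real audio duration.
    return [
        (start, min(start + max_seg_len, audio_duration))
        for start in segment_starts_ceil(audio_duration, max_seg_len)
    ]


if __name__ == "__main__":
    print(segment_starts_truncated(0.6, 30))  # no segments for a 0.6s clip
    print(segment_starts_ceil(0.6, 30))       # one segment starting at 0
    print(segment_bounds(60.4, 30))           # the 60s-60.4s tail is kept
```

Under these assumptions the same truncation can also drop the tail of longer audio: with a 60.4s file and a 30s step, the truncated version stops at 60 and never emits the segment starting at 60s, which is the kind of missing part that breaks index-based mapping.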

andriken commented 1 week ago

what will "max_seg_len" do?

shashikg / WhisperS2T

Fix for small segments #57