Issue with Audio Segmentation When Video Contains Long Periods of Silence

Shadows97 commented 5 months ago

There is an issue with the current audio segmentation approach in PySubtitle when processing videos that contain long periods of silence. The split_on_silence function from pydub may not handle these long silent segments effectively, leading to incomplete or inaccurate transcription and subtitle generation.

Steps to Reproduce:

Use a video file that contains long periods of silence (e.g., 5-10 seconds or more).
Run the audio_to_text function to convert the audio to text.
Observe the generated VTT file and note that the transcription may stop prematurely or miss segments of the video.

Expected Behavior:

The audio segmentation should handle long periods of silence more effectively, ensuring that the entire video is processed and transcribed accurately.

Actual Behavior:

The transcription process may stop prematurely or miss segments of the video when long periods of silence are encountered.

Possible Solution:

Adjust the parameters of the split_on_silence function to better handle long periods of silence.
Implement a custom segmentation approach that can detect and handle long silent segments more effectively.
Consider adding a fallback mechanism to ensure that the entire video is processed, even if long silent segments are present.
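A minimal sketch of the fallback idea, assuming the pipeline already has a list of non-silent (start_ms, end_ms) ranges such as the one pydub.silence.detect_nonsilent returns. The names plan_chunks, vtt_timestamp and max_chunk_ms are hypothetical and not part of PySubtitle. Keeping absolute offsets, instead of only the chunks that split_on_silence hands back, lets the VTT cue times stay aligned with the video even when long silent stretches are skipped.

```python
def plan_chunks(nonsilent_ranges, total_ms, max_chunk_ms=30_000):
    """Turn detected speech ranges into (start_ms, end_ms) chunks to transcribe.

    Hypothetical helper: nonsilent_ranges is a list of (start_ms, end_ms) pairs.
    """
    # Fallback: if detection found nothing (for example because the silence
    # threshold was too high), cover the whole file instead of producing no output.
    if not nonsilent_ranges:
        nonsilent_ranges = [(0, total_ms)]

    chunks = []
    for start, end in nonsilent_ranges:
        end = min(end, total_ms)
        # Cap the chunk length so one long unbroken range is still processed in pieces.
        while end - start > max_chunk_ms:
            chunks.append((start, start + max_chunk_ms))
            start += max_chunk_ms
        if end > start:
            chunks.append((start, end))
    return chunks


def vtt_timestamp(ms):
    """Format an absolute offset in milliseconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


if __name__ == "__main__":
    # Speech at the start and the end of a 6.5 s clip, with 6 s of silence between.
    for start, end in plan_chunks([(0, 200), (6200, 6500)], total_ms=6500):
        print(vtt_timestamp(start), "-->", vtt_timestamp(end))
```

Note that pydub's split_on_silence defaults to silence_thresh=-16 dBFS and min_silence_len=1000 ms, so quiet recordings can be classified as entirely silent unless the threshold is lowered. That is one situation the fallback branch above is meant to catch.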

Additional Context:

This issue affects the accuracy and completeness of the generated subtitles, especially for videos with significant silent segments. Improving the segmentation approach will enhance the overall reliability of PySubtitle.

hcm444 commented 5 months ago

We can implement a custom segmentation method that detects and splits the audio based on a threshold duration of silence rather than relying solely on split_on_silence from pydub. Something like:

def custom_split_on_silence(sound, min_silence_len=500, silence_thresh=-40):
    ...
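One way that signature could be filled in, as a sketch rather than the project's actual implementation. It assumes sound behaves like a pydub AudioSegment: len(sound) is the duration in milliseconds, slicing is by milliseconds, and .dBFS gives the loudness (negative infinity for pure silence). The frame_ms parameter, the find_nonsilent_ranges helper, and the choice to return (start_ms, chunk) pairs are all assumptions added for illustration.

```python
import math


def find_nonsilent_ranges(levels, frame_ms, min_silence_len=500, silence_thresh=-40):
    """Return (start_ms, end_ms) speech ranges from per-frame loudness values in dBFS.

    A silent run ends the current range only if it lasts at least
    min_silence_len ms; shorter pauses stay inside the surrounding range.
    """
    ranges = []
    start = None       # first frame of the range being built
    last_loud = None   # most recent frame above the threshold
    min_frames = math.ceil(min_silence_len / frame_ms)

    for i, level in enumerate(levels):
        if level > silence_thresh:
            if start is None:
                start = i
            last_loud = i
        elif start is not None and i - last_loud >= min_frames:
            # Enough consecutive silent frames: close the range at the last loud frame.
            ranges.append((start * frame_ms, (last_loud + 1) * frame_ms))
            start = None

    if start is not None:
        ranges.append((start * frame_ms, (last_loud + 1) * frame_ms))
    return ranges


def custom_split_on_silence(sound, min_silence_len=500, silence_thresh=-40, frame_ms=50):
    """Split sound into (start_ms, chunk) pairs, skipping long silent stretches."""
    # Measure loudness in fixed frames; an all-zero frame reports -inf dBFS.
    levels = [sound[i:i + frame_ms].dBFS for i in range(0, len(sound), frame_ms)]
    ranges = find_nonsilent_ranges(levels, frame_ms, min_silence_len, silence_thresh)
    # Keep each chunk's absolute start so subtitle times can be computed later.
    return [(start, sound[start:min(end, len(sound))]) for start, end in ranges]
```

Returning the start offset with each chunk means a long silence only creates a gap between cues instead of shifting every later timestamp. With pydub installed, a call might look like custom_split_on_silence(AudioSegment.from_file("video.mp4")), where the file name is a placeholder.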

Shadows97 commented 5 months ago

possible

Shadows97 / PySubtitle

Issue with Audio Segmentation When Video Contains Long Periods of Silence #4