Electrical buzzing when a WAV file is read, streamed, and played in VLC on Windows #18

JiadiLee commented 3 months ago

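Thanks to the author for open-sourcing the code! On Linux, I read a WAV file whose sample rate is not 16000 Hz with soundfile, resample it to 16000 Hz with resampy, take 640 samples per frame, and push the stream at 25 fps. When I pull the stream and play it in VLC on Windows, the background noise is very loud and there is an obvious electrical buzz. What could be causing this? The code is as follows: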

import numpy as np
import resampy
import soundfile as sf

# Load the WAV file and convert the samples to float32.
stream, sample_rate = sf.read("./output.wav")
stream = stream.astype(np.float32)
if stream.ndim > 1:
    print('[WARN] audio has ', stream.shape[1], ' channels, only use the first.')
    stream = stream[:, 0]
if sample_rate != 16000:
    print('[WARN] audio sample rate is: ', sample_rate, ', resampling into 16000')
    wav = resampy.resample(x=stream, sr_orig=sample_rate, sr_new=16000)
else:
    wav = stream  # already 16000 Hz, so no resampling is needed
...
# Take the next 640 samples; at the end of the file, send what is left and wrap around.
if start_idx + 640 > wav.shape[0]:
    audio_frame = wav[start_idx:wav.shape[0]]
    start_idx = 0
else:
    audio_frame = wav[start_idx:start_idx + 640]
    start_idx = start_idx + 640
streamer.stream_frame_audio(audio_frame)
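
One detail in the snippet above: when the file length is not a multiple of 640, the last frame is shorter than 640 samples. Whether the streamer tolerates short frames is not stated in this thread, so the following is only a sketch of one way to keep every frame the same length by zero-padding the tail. The helper name split_into_frames is hypothetical and not part of python_rtmpstream.

```python
import numpy as np

def split_into_frames(wav: np.ndarray, frame_len: int = 640) -> np.ndarray:
    """Split a 1-D signal into rows of frame_len samples, zero-padding the tail."""
    n_frames = -(-wav.shape[0] // frame_len)  # ceiling division
    padded = np.zeros(n_frames * frame_len, dtype=wav.dtype)
    padded[:wav.shape[0]] = wav
    return padded.reshape(n_frames, frame_len)

# 1500 samples -> two full frames plus one frame with 220 real samples.
frames = split_into_frames(np.ones(1500, dtype=np.float32))
print(frames.shape)  # (3, 640)
```

Each row of the result could then be passed to streamer.stream_frame_audio in turn.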

JiadiLee commented 3 months ago

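I changed the audio stream to 50 fps: for every video frame pushed, I push two audio frames with for _ in range(2). With this change the output audio sounds mostly normal.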
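
The chunk size that goes with this change follows from simple arithmetic. Here is a small sketch, assuming 16000 Hz audio and 25 fps video as described in this thread; the function name samples_per_chunk is made up for illustration.

```python
def samples_per_chunk(sample_rate: int, video_fps: int, chunks_per_frame: int) -> int:
    """Audio samples per chunk so that audio keeps pace with video."""
    chunks_per_second = video_fps * chunks_per_frame
    if sample_rate % chunks_per_second != 0:
        raise ValueError("sample rate does not divide evenly into chunks")
    return sample_rate // chunks_per_second

print(samples_per_chunk(16000, 25, 1))  # 640, one audio frame per video frame
print(samples_per_chunk(16000, 25, 2))  # 320, two audio frames per video frame (50 fps audio)
```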

jinqinn commented 2 months ago

@JiadiLee Could we discuss this? I'm running into the same problem.

JiadiLee commented 2 months ago

> @JiadiLee Could we discuss this? I'm running into the same problem.

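Below is a snippet of my streaming code. My video is 25 fps, the audio sample rate is 16000 Hz, and the audio is read with librosa. For each encoded video frame I encode two audio frames, so the audio chunk size has to be 320 samples, i.e. 16000 / 25 / 2.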

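import cv2
import numpy as np

if not ori_frames.empty():
    frame = ori_frames.get()
    # Convert the OpenCV BGR frame to RGB before pushing it.
    streamer.stream_frame(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    # Push two 320-sample audio chunks per video frame (16000 / 25 / 2 = 320).
    for _ in range(2):
        ori_wav = np.zeros(320)  # 320 zero samples (silence) in this snippet
        streamer.stream_frame_audio(ori_wav)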
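
The snippet above pushes zeros as audio. As a rough sketch of how real samples might be cut into the two 320-sample chunks that belong to a given video frame, something like the following could work. It assumes the audio is already a 1-D 16000 Hz array; the helper audio_chunks_for_frame is hypothetical, and zero-padding past the end of the audio is my own choice.

```python
import numpy as np

def audio_chunks_for_frame(wav, frame_idx, chunk=320, chunks_per_frame=2):
    """Return the audio chunks that accompany video frame number frame_idx."""
    start = frame_idx * chunk * chunks_per_frame
    chunks = []
    for i in range(chunks_per_frame):
        seg = wav[start + i * chunk : start + (i + 1) * chunk]
        if seg.shape[0] < chunk:
            # Past the end of the audio: fill the remainder with silence.
            seg = np.pad(seg, (0, chunk - seg.shape[0]))
        chunks.append(seg.astype(np.float32))
    return chunks

wav = np.arange(1000, dtype=np.float32)  # stand-in for real audio samples
chunks = audio_chunks_for_frame(wav, frame_idx=1)
print([c.shape[0] for c in chunks])  # [320, 320]
```

Inside the loop, each element of chunks would take the place of ori_wav.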
