lipku / python_rtmpstream

A Python library for pushing real-time RTMP audio/video streams
MIT License
68 stars, 20 forks

Reading a WAV file and streaming it produces a buzzing noise when played in VLC on Windows #18

Closed JiadiLee closed 3 months ago

JiadiLee commented 3 months ago

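Thanks to the author for open-sourcing the code! On Linux, I read a WAV file whose sample rate is not 16000 Hz with soundfile, resample it with resampy, take 640 samples per frame, and push the stream at 25 fps. When I pull the stream and play it in VLC on Windows, the background noise is very loud and there is an obvious electrical buzzing sound. What could be causing this? The code is as follows: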

import numpy as np
import resampy
import soundfile as sf

stream, sample_rate = sf.read("./output.wav")
stream = stream.astype(np.float32)
if stream.ndim > 1:
    print('[WARN] audio has ', stream.shape[1], ' channels, only use the first.')
    stream = stream[:, 0]
wav = stream  # used unchanged when the file is already at 16000 Hz
if sample_rate != 16000:
    print('[WARN] audio sample rate is: ', sample_rate, ', resampling into 16000')
    wav = resampy.resample(x=stream, sr_orig=sample_rate, sr_new=16000)
...    
if (start_idx) + 640 > wav.shape[0]:
    audio_frame = (wav[start_idx:wav.shape[0]])
    start_idx = 0
else:
    audio_frame = (wav[start_idx:start_idx + 640])
    start_idx = start_idx + 640
streamer.stream_frame_audio(audio_frame)
JiadiLee commented 3 months ago

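I changed the audio stream to 50 fps: each time a video frame is pushed, two audio frames are pushed as well, using for _ in range(2). With this change the output audio sounds fairly normal.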
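The arithmetic behind this change can be sketched as follows. This is only an illustration of the numbers mentioned in this thread; `samples_per_audio_chunk` is a hypothetical helper and not part of python_rtmpstream.

```python
def samples_per_audio_chunk(sample_rate: int, video_fps: int, chunks_per_video_frame: int) -> int:
    """Return how many audio samples go into each audio chunk.

    Hypothetical helper: it only divides one second of audio evenly
    across all audio chunks pushed during that second.
    """
    chunks_per_second = video_fps * chunks_per_video_frame
    if sample_rate % chunks_per_second != 0:
        raise ValueError("sample_rate is not evenly divisible into chunks")
    return sample_rate // chunks_per_second


# One audio frame per video frame at 25 fps (the original, noisy setup).
original_chunk = samples_per_audio_chunk(16000, 25, 1)

# Two audio frames per video frame, i.e. audio at 50 fps (the working setup).
working_chunk = samples_per_audio_chunk(16000, 25, 2)

print(original_chunk, working_chunk)
```

With 16000 Hz audio, the first call gives 640 samples (40 ms) per chunk and the second gives 320 samples (20 ms) per chunk.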

jinqinn commented 2 months ago

@JiadiLee Would you mind discussing this with me? I have run into the same problem.

JiadiLee commented 2 months ago

> @JiadiLee Would you mind discussing this with me? I have run into the same problem.

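Below is a snippet of my streaming code. My video is 25 fps and the audio sample rate is 16000 Hz; the audio is read with librosa. Two audio frames are encoded for every video frame, so the audio chunk size has to be set to 320 samples, i.e. 16000 / 25 / 2.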

if not ori_frames.empty():
    frame = ori_frames.get()
    streamer.stream_frame(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    for _ in range(2):
        ori_wav = np.zeros(320)
        streamer.stream_frame_audio(ori_wav)
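
To push real audio in place of the zero-filled array, the 16000 Hz waveform would first have to be cut into 320-sample chunks. Below is a minimal sketch of one way to do that with NumPy. `split_into_chunks` is a hypothetical helper, not part of python_rtmpstream. Zero-padding the final partial chunk and casting to float32 are assumptions, not requirements confirmed in this thread.

```python
import numpy as np


def split_into_chunks(wav: np.ndarray, chunk_size: int = 320) -> np.ndarray:
    """Return a (n_chunks, chunk_size) float32 array built from a mono waveform.

    Assumption: the last partial chunk is padded with zeros (silence)
    so that every chunk has exactly chunk_size samples.
    """
    wav = np.asarray(wav, dtype=np.float32)
    if wav.size == 0:
        return np.empty((0, chunk_size), dtype=np.float32)
    remainder = wav.shape[0] % chunk_size
    if remainder:
        wav = np.pad(wav, (0, chunk_size - remainder))
    return wav.reshape(-1, chunk_size)


# Stand-in for a resampled mono waveform of 1000 samples.
demo_wav = np.arange(1000, dtype=np.float32)
chunks = split_into_chunks(demo_wav, chunk_size=320)
print(chunks.shape)
```

In the loop above, two consecutive rows of `chunks` would then be passed to `streamer.stream_frame_audio` for each video frame, in place of `np.zeros(320)`.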