google-deepmind / dmvr

NaN values when extracting audio using ffmpeg #11

Closed AlvinKimata closed 1 year ago

AlvinKimata commented 1 year ago

I'm extracting raw audio from an .mp4 file and storing it as a float32 NumPy array. The resulting array contains NaN values, along with extremely small and extremely large values.

When I then try to compute a mel spectrogram from this array, I get the following error:

TypeError: array([nan, nan, nan, ..., nan, nan, nan], dtype=float32) has type numpy.ndarray, but expected one of: int, long, float

Below is the code snippet used for extracting audio using ffmpeg.

cmd = (
    ffmpeg
    .input(video_path, ss=start, t=end-start)
    .output("pipe:", ac=1, ar=sampling_rate, format="s32le")
)

audio, _ = cmd.run(capture_stdout=True, quiet=True)

audio = np.frombuffer(audio, np.float32)

What could be the issue?
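
One likely cause, offered as a hedged reading of the snippet rather than a confirmed diagnosis: `format="s32le"` asks ffmpeg for signed 32-bit little-endian integer PCM, while `np.frombuffer(audio, np.float32)` reinterprets those same bytes as IEEE 754 floats. Integer bit patterns read that way usually come out as NaN, huge, or vanishingly small values. The sketch below reproduces the effect with synthetic bytes (no ffmpeg call) and decodes them with a matching dtype. The helper name `decode_s32le` and the scaling to roughly [-1, 1] are illustrative assumptions, not part of dmvr.

```python
import numpy as np


def decode_s32le(raw: bytes) -> np.ndarray:
    """Decode signed 32-bit little-endian PCM bytes into float32 samples.

    Hypothetical helper for illustration: reads the bytes with the dtype
    that matches ffmpeg's "s32le" output, then scales by 2**31 so the
    samples land in roughly [-1, 1].
    """
    samples = np.frombuffer(raw, dtype="<i4")
    return samples.astype(np.float32) / np.float32(2**31)


# Synthetic stand-in for ffmpeg's stdout: a few int32 PCM samples.
pcm = np.array([0, 1, -1, 2130706432], dtype="<i4")
raw = pcm.tobytes()

# Mismatched dtype: the integer bit patterns are reinterpreted as floats.
wrong = np.frombuffer(raw, dtype="<f4")

# Matching dtype, then conversion to float32.
right = decode_s32le(raw)

print("reinterpreted as float32:", wrong)
print("decoded as int32, scaled:", right)
```

An alternative that should keep the original `np.frombuffer(audio, np.float32)` line valid is to request float PCM from ffmpeg instead, by passing `format="f32le"` in `.output(...)`, so that the byte layout and the NumPy dtype agree.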