wav file: Unknown frame size for input audio format: pcm_s16le

skeydan commented 1 year ago

Hi,

I am currently evaluating the use of av as the main backend for https://github.com/mlverse/torchaudio. Unfortunately, I get this error when reading a .wav file (http://www.physics.uio.no/pow/wavbirds/chaffinch.wav):

w <- av::read_audio_bin(wav)
Unknown frame size for input audio format: pcm_s16le

ffmpeg itself does not seem to have problems with the file, and neither do sox or tuneR. av_media_info says:

$duration
[1] 42.28

$video
NULL

$audio
  channels sample_rate     codec frames bitrate     layout
1        2       44100 pcm_s16le     NA 1411200 2 channels

I'm not an audio expert, and I am confused by what exactly frame size is in ffmpeg (see e.g. the answer here: https://stackoverflow.com/questions/55113331/ffmpeg-confused-with-the-concept-of-audio-frame-size-and-its-calculation, which honestly I don't quite understand) ...

I also found frame size equated with pkt_size (https://www.reddit.com/r/ffmpeg/comments/hnoxo1/frame_size_in_bytes/), and when I run ffprobe -show_frames, I see that the very last frame is shorter than all others:

ffprobe -show_frames chaffinch.wav | grep pkt_size

pkt_size=4096
pkt_size=4096
pkt_size=3472

Maybe this could explain the error I get?
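For reference, the numbers above are at least consistent with plain uncompressed PCM. Here is a quick sanity check in R, assuming 2 bytes per sample for pcm_s16le (signed 16-bit). The variable names are only illustrative and do not come from av or ffmpeg:

```r
# Values taken from the av_media_info and ffprobe output above.
channels <- 2
sample_rate <- 44100
bytes_per_sample <- 2  # assumption: pcm_s16le stores 16 bits per sample

# Bytes needed for one sample across all channels.
block_align <- channels * bytes_per_sample

# Uncompressed PCM bitrate in bits per second.
bitrate <- sample_rate * block_align * 8
print(bitrate)  # 1411200, matching the bitrate reported by av_media_info

# Samples per channel in a full packet and in the final packet.
full_packet_samples <- 4096 / block_align
last_packet_samples <- 3472 / block_align
print(c(full_packet_samples, last_packet_samples))  # 1024 868
```

Under that assumption, a full packet holds 1024 samples per channel and the last one 868, which is what you would expect if the file simply ends partway through a packet. I have not checked whether this is related to the message.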

JPalmerK commented 1 year ago

I'm having the same issue. I'm using it to create spectrograms, but I need to remove the DC offset first, so I can't skip directly to read_audio_fft. It still loads the data, but I'm concerned about whether it's doing so correctly... :/
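To make the DC offset step concrete, here is a minimal sketch of what I mean. It assumes the samples are interleaved by channel (L, R, L, R, ...), which I have not verified for read_audio_bin, and remove_dc_offset is a hypothetical helper, not part of av:

```r
# Hypothetical helper: subtract each channel's mean from that channel.
# Assumes `samples` is interleaved by channel (an unverified assumption
# about the layout of the data being loaded).
remove_dc_offset <- function(samples, channels = 1) {
  # Reshape to one row per channel; columns are time steps.
  m <- matrix(as.numeric(samples), nrow = channels)
  # rowMeans(m) has one value per channel and is recycled down each column.
  m - rowMeans(m)
}

# Synthetic two-channel example: channel 1 is 1, 3, 5 and channel 2 is 10, 20, 30.
x <- c(1, 10, 3, 20, 5, 30)
centered <- remove_dc_offset(x, channels = 2)
print(rowMeans(centered))  # 0 0
```

The result is a channels-by-frames matrix with zero mean per channel, so this only gives sensible output if the loaded samples really are laid out the way I assume.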

jeroen commented 1 year ago

I think this message is ignorable. Does your code work otherwise?

Looking at the results, this value is only used internally to determine the buffer size. If the file doesn't include the frame size, the code falls back on a safe guess, but this should not affect the results.

We can actually check whether the results are correct: the package includes an older (slower) internal implementation of the same function, which should not display this error. So you can verify that both give the same results:

w1 <- av::read_audio_bin(wav)
w2 <- av:::read_audio_bin_old(wav)
all.equal(w1, w2)

FYI the old version looks like this:

https://github.com/ropensci/av/blob/4ca3e6edec096b763222e496144d10e48902c3e5/R/fft.R#L80-L94
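If you want a slightly more detailed report than all.equal gives, something like the sketch below could be used. compare_audio is a hypothetical helper, not part of av, and it assumes both results carry "channels" and "sample_rate" attributes. The example runs on synthetic vectors so it does not need the package:

```r
# Hypothetical helper: compare two audio vectors and their metadata.
# Assumes both carry "channels" and "sample_rate" attributes.
compare_audio <- function(a, b) {
  same_length <- length(a) == length(b)
  list(
    same_length = same_length,
    # Only compare element-wise when the lengths match.
    same_values = same_length && all(a == b),
    same_channels = isTRUE(all.equal(attr(a, "channels"), attr(b, "channels"))),
    same_sample_rate = isTRUE(all.equal(attr(a, "sample_rate"), attr(b, "sample_rate")))
  )
}

# Synthetic stand-ins for the two results.
a <- structure(c(1L, 2L, 3L, 4L), channels = 2L, sample_rate = 44100L)
b <- a
res <- compare_audio(a, b)
print(all(unlist(res)))  # TRUE
```

With the real data you would call compare_audio(w1, w2) on the two vectors from the comparison above.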

skeydan commented 1 year ago

Thanks, you're right! It does work correctly.

ropensci / av

wav file: Unknown frame size for input audio format: pcm_s16le #47