Streampunk / beamcoder

Node.js native bindings to FFmpeg.
GNU General Public License v3.0

AAC encoding issues #45

Closed ccantill closed 4 years ago

ccantill commented 4 years ago

I'm trying to write a script that takes any video file with an H.264 stream and creates an MP4 file, transcoding the audio to AAC (using aformat and aresample filters to convert the frames to the correct format first) and copying the H.264 stream into the MP4 container unchanged. However, I'm running into some issues with the AAC encoding. For context, this is what I'm currently working with: https://github.com/pith-media-server/ffmiddleware/blob/master/src/demo/transcodeManually.ts

The first issue is that when I set the profile to 'LC' in the encoder properties, I immediately get a SIGSEGV upon creating the encoder. Setting it to 1 instead (the value of FF_PROFILE_AAC_LOW in avcodec.h) doesn't crash. However, it doesn't seem to actually be set, because if I inspect the resulting file afterwards by opening it with the demuxer, the profile is -1:

      "codecpar": {
        "type": "CodecParameters",
        "codec_type": "audio",
        "codec_id": 86018,
        "name": "aac",
        "codec_tag": "mp4a",
        "format": "fltp",
        "bit_rate": 251238,
        "bits_per_coded_sample": 16,
        "profile": -1,
        "channel_layout": "stereo",
        "channels": 2,
        "sample_rate": 48000
      }

Additionally, neither channels nor channel_layout is stored properly (I had set them to 8 channels and 7.1, the same as the input), and when playing the file back the audio is garbled and the timing of the video is all off (the video plays really fast, and then the audio continues with blank video). The total duration of the video file is right, though. What am I doing wrong?

scriptorian commented 4 years ago

Hi,

I'm guessing that the AAC encoding problem causing the garbled audio is that the encoder requires data to be submitted in frames of (typically) 1024 samples. In the past I have written a 'dicer' buffer to achieve this. You will find an example in beamstreams.js (line 28 onwards) if that is any help.

As for the encoder creation crash, I have found that the AAC codec doesn't provide an array of supported profiles as normally expected, and the code doesn't handle this correctly, hence the crash. I'll fix the crash in a future update, but using the numeric value (1 in this case) avoids the problem and worked for me: using your creation code, the profile was correct when read back from the encoder and in the codecpar of the resulting muxer stream.

I'm afraid you will also run into the problem pointed out in #35 - I haven't done the work to support the bitstream filters. Good luck!
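The 1024-sample requirement described above can be illustrated with a small re-chunking buffer. This is a minimal sketch, not the dicer from beamstreams.js: the `SampleDicer` class and its method names are hypothetical, it handles a single plane of float samples held in a `Float32Array`, and a real planar stream would need one such buffer per channel plus pts bookkeeping.

```javascript
// Hypothetical helper: re-chunks arbitrary-length runs of float samples
// into fixed-size frames (1024 samples is the usual AAC frame size).
class SampleDicer {
  constructor(frameSize = 1024) {
    this.frameSize = frameSize;
    this.pending = new Float32Array(0); // samples not yet emitted
  }

  // Append samples and return every complete frame now available.
  push(samples) {
    const joined = new Float32Array(this.pending.length + samples.length);
    joined.set(this.pending, 0);
    joined.set(samples, this.pending.length);

    const frames = [];
    let offset = 0;
    while (joined.length - offset >= this.frameSize) {
      frames.push(joined.slice(offset, offset + this.frameSize));
      offset += this.frameSize;
    }
    this.pending = joined.slice(offset); // keep the remainder for later
    return frames;
  }

  // At end of stream, pad the remainder with silence so it can be encoded.
  flush() {
    if (this.pending.length === 0) return [];
    const last = new Float32Array(this.frameSize); // zero-filled
    last.set(this.pending, 0);
    this.pending = new Float32Array(0);
    return [last];
  }
}
```

Each decoded or filtered plane would be passed to `push`, and every returned chunk copied into its own encoder frame; `flush` is called once after the last input.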

ccantill commented 4 years ago

Thanks for the hint. That dicer helps a little bit. On one stream it at least renders an audio stream that somewhat resembles the original (but with distortion). On another stream it gives me the same problems I had when I tried to transcode to AAC using the pipeline config: after the first couple of frames it starts spewing "[aac @ 0x6d6fc0] Queue input is backward in time" and "[aac @ 0x6d6fc0] Input contains (near) NaN/+-Inf".

So then I tried adding an asetnsamples=n=1024:p=1 filter instead, which works better; the resulting frames are all 1024 samples and there are no errors with either input stream. The sound is perfect this time, but the video is still sped up.
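For reference, the whole audio chain mentioned in this thread (aformat, aresample, asetnsamples) can be written as a single FFmpeg filter description. The sketch below only builds that string; `buildAacPrepSpec` is a hypothetical helper, the default values are taken from the codecpar dump earlier in the thread, and how the string is handed to beamcoder's filterer is not shown here.

```javascript
// Hypothetical helper: builds an FFmpeg audio filter description that
// constrains frames to what the AAC encoder expects and fixes the frame size.
function buildAacPrepSpec({ sampleFmt = 'fltp', sampleRate = 48000,
                            channelLayout = 'stereo', frameSize = 1024 } = {}) {
  return [
    // Constrain the output sample format and channel layout.
    `aformat=sample_fmts=${sampleFmt}:channel_layouts=${channelLayout}`,
    // Resample to the encoder's sample rate.
    `aresample=${sampleRate}`,
    // Emit exactly frameSize samples per frame; p=1 pads the last frame with zeroes.
    `asetnsamples=n=${frameSize}:p=1`
  ].join(',');
}

console.log(buildAacPrepSpec());
// aformat=sample_fmts=fltp:channel_layouts=stereo,aresample=48000,asetnsamples=n=1024:p=1
```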

EDIT: I found the issue with the incorrect profile in codecpar when opening the resulting file. Copying the extradata property from the encoder to the codecpar in the stream fixes it.
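A likely reason the extradata matters: for AAC, the encoder's extradata normally carries the AudioSpecificConfig, which is where a reader of the file finds the audio object type. The sketch below decodes the basic two-byte form to show how the object type relates to FFmpeg's profile numbers. `parseAudioSpecificConfig` is a hypothetical helper, not part of beamcoder, and it deliberately rejects the extended forms of the structure.

```javascript
// Hypothetical helper: decodes the first two bytes of an AAC
// AudioSpecificConfig (basic form only: object type < 31, no explicit rate).
const SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000,
                      24000, 22050, 16000, 12000, 11025, 8000, 7350];

function parseAudioSpecificConfig(extradata) {
  if (extradata.length < 2) throw new Error('extradata too short');
  const bits = (extradata[0] << 8) | extradata[1];
  const objectType = bits >> 11;            // 5 bits, 2 = AAC LC
  const freqIndex = (bits >> 7) & 0x0f;     // 4 bits
  const channelConfig = (bits >> 3) & 0x0f; // 4 bits
  if (objectType === 31 || freqIndex === 15) {
    throw new Error('extended AudioSpecificConfig not handled in this sketch');
  }
  return {
    objectType,
    // FFmpeg's FF_PROFILE_AAC_* values are the object type minus one,
    // so AAC LC (object type 2) maps to FF_PROFILE_AAC_LOW = 1.
    ffmpegProfile: objectType - 1,
    sampleRate: SAMPLE_RATES[freqIndex],
    channelConfig
  };
}

// 0x11 0x90 describes AAC LC, 48000 Hz, channel configuration 2 (stereo).
console.log(parseAudioSpecificConfig(Buffer.from([0x11, 0x90])));
```

Running a check like this on the encoder's extradata before muxing is a quick way to confirm that the profile set at creation time actually reached the encoder.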

scriptorian commented 4 years ago

After my comment yesterday I tried to use my scratch mp4 maker and found the same NaN/Inf problem. I eventually tracked it down to a line of code in the dicer that presumably is getting different data with the updated version of FFmpeg. I have fixed that now and it all seems to work as expected, with clean audio. asetnsamples=n=1024:p=1 is a good choice to replace the hand-rolled dicer - I hadn't found that filter when I was last playing here, but I have used it successfully since.

The crash at the end is annoying. I had a brief look at the code, and apart from a cryptic comment next to it I can't get much idea of what is causing it to go wrong. I had a quick look at your latest code and wondered if it could be caused by a missing await on the recodeAndWrite call at the end, so that the flush might be getting in before the last encode.
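The ordering problem caused by a missing await can be reproduced without FFmpeg. In this sketch the encoder is a stand-in that only records the order in which calls complete; `makeFakeEncoder` is hypothetical, and the body of `recodeAndWrite` is a placeholder that shares only its name with the function in the linked script.

```javascript
// Stand-in encoder that records the order in which calls complete.
function makeFakeEncoder(log) {
  const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
  return {
    async encode(frame) { await delay(10); log.push(`encode ${frame}`); },
    async flush() { log.push('flush'); }
  };
}

// Placeholder for the thread's recodeAndWrite; only the call order matters here.
async function recodeAndWrite(encoder, frame) {
  await encoder.encode(frame);
}

async function withoutAwait() {
  const log = [];
  const enc = makeFakeEncoder(log);
  const pending = recodeAndWrite(enc, 'last'); // bug: not awaited
  await enc.flush();                           // flush overtakes the encode
  await pending;                               // only so the demo can report
  return log;
}

async function withAwait() {
  const log = [];
  const enc = makeFakeEncoder(log);
  await recodeAndWrite(enc, 'last');           // encode finishes first
  await enc.flush();
  return log;
}

withoutAwait().then((log) => console.log(log)); // [ 'flush', 'encode last' ]
withAwait().then((log) => console.log(log));    // [ 'encode last', 'flush' ]
```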

ccantill commented 4 years ago

Yup, it's the await. Just found it too. Still looking into the video timing issue. I'm logging the pts for each packet upon writing and it seems to look fine, so I'm not sure yet what's going wrong.

ccantill commented 4 years ago

Found the issue with the timing as well now. Apparently the timebase is forcefully set to [1,16000] by the muxer. Compensating for that fixes it. Strange that the stream resulting from muxer.newStream doesn't reflect that new timebase correctly, but hardcoding it to [1,16000] works.
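Compensating for a different muxer time base amounts to rescaling each packet's pts and dts before writing. The sketch below shows the arithmetic only: `rescaleTs` is a hypothetical helper in the spirit of FFmpeg's av_rescale_q, the source time bases are illustrative values not stated in the thread, and plain JavaScript numbers are assumed to be large enough for the timestamps involved.

```javascript
// Hypothetical helper in the spirit of FFmpeg's av_rescale_q: converts a
// timestamp from one time base to another. Time bases are [num, den] pairs.
function rescaleTs(ts, from, to) {
  // ts * (from.num / from.den) / (to.num / to.den), rounded to nearest.
  return Math.round((ts * from[0] * to[1]) / (from[1] * to[0]));
}

const srcTimeBase = [1, 1000];  // illustrative input time base (milliseconds)
const muxTimeBase = [1, 16000]; // the value reported in the thread

console.log(rescaleTs(40, srcTimeBase, muxTimeBase));  // 640
console.log(rescaleTs(1001, [1, 24000], muxTimeBase)); // 667
```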

scriptorian commented 4 years ago

Great - I'm glad it's all working now.

ccantill commented 4 years ago

Me too, thanks for the assistance!

felicemarra commented 3 years ago

I'm sorry. I'm encoding AAC frames from raw PCM and raw video frames, and I receive the same error message: "Input contains (near) NaN/+-Inf". I tried to make an example that reproduces the issue with a freshly allocated frame of 1024 samples. If you run it, you get this error; not always, but often. I don't understand what I'm doing wrong. Can you help me?

      const beamcoder = require('beamcoder');

      let encParamsAudio = {
        name: 'aac',
        time_base: [1, 48000],
        sample_fmt: 'fltp',
        sample_rate: 48000,
        bit_rate: 192000,
        channel_layout: 'stereo',
        channels: 2
      };

      async function run() {
        let encoderAudio = await beamcoder.encoder(encParamsAudio);
        for (var i = 0; i < 200; i++) {
          let destFrameAudio = beamcoder.frame({
            channels: 2,
            nb_samples: 1024,
            format: 'fltp',
            channel_layout: 'stereo',
            sample_rate: 48000,
            pkt_size: 1024 * 4 * 2
          }).alloc();
          let packetsAudio = await encoderAudio.encode(destFrameAudio);
        }
      }

      run();
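One possible cause, which this thread does not confirm, is that a freshly allocated frame buffer may hold arbitrary bytes; interpreted as 32-bit floats, some bit patterns are NaN or infinite, which would fit an error that appears often but not always. The sketch below uses plain Node.js Buffers to show how a plane could be checked and cleared before encoding. `hasNonFinite` is a hypothetical helper, little-endian floats are assumed, and whether beamcoder's alloc() zeroes its memory is an assumption to verify.

```javascript
// Hypothetical helper: true if a buffer of little-endian 32-bit floats
// (one 'fltp' plane) contains NaN or +/-Infinity.
function hasNonFinite(plane) {
  for (let offset = 0; offset + 4 <= plane.length; offset += 4) {
    if (!Number.isFinite(plane.readFloatLE(offset))) return true;
  }
  return false;
}

const bytesPerPlane = 1024 * 4; // 1024 samples of 4-byte floats
// Simulate leftover memory: all-ones bytes decode as NaN.
const plane = Buffer.alloc(bytesPerPlane, 0xff);

console.log(hasNonFinite(plane)); // true
plane.fill(0);                    // silence: every sample becomes 0.0
console.log(hasNonFinite(plane)); // false
```

If the check reports non-finite values on the buffers of a newly allocated frame, filling each plane with zeroes (or with real sample data) before calling encode would be the thing to try.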