accord-net / framework

Machine learning, computer vision, statistics and general scientific computing for .NET
http://accord-framework.net
GNU Lesser General Public License v2.1

Audio Not Syncing with WebCams #1753

Open mkillebr opened 5 years ago

mkillebr commented 5 years ago


We are trying to capture video and audio from various webcams, but the audio is always out of sync with the video. I realize we are probably missing something, but I can't figure out what it is. We pass a timestamp with each video frame and the video is recorded correctly, but when we do the same with the audio frames, it is only called once.


        private async void StartRecording(VideoCaptureDevice device)
        {

            var tempFileName = {TEMP LOC};

            var videoFramesPerSecond = device.VideoResolution.MaximumFrameRate;
            var videoFrameRate = _videoCaptureDeviceHasVariableRate ? videoFramesPerSecond * 2 : videoFramesPerSecond;
            var videoBitRate = ((device.VideoResolution.FrameSize.Width * device.VideoResolution.FrameSize.Height) * videoFramesPerSecond);

            videoFileWriter = new VideoFileWriter
            {
                Width = device.VideoResolution.FrameSize.Width,
                Height = device.VideoResolution.FrameSize.Height,
                FrameRate = videoFrameRate,
                VideoCodec = VideoCodec.H264,
                BitRate = videoBitRate,
                AudioCodec = AudioCodec.Aac,
                AudioBitRate = 320000,
                SampleRate = 44100,
                AudioLayout = AudioLayout.Mono,
                FrameSize = 4096
            };

            videoFileWriter.VideoOptions["x264opts"] = "no-mbtree:sliced-threads:sync-lookahead=0";
            videoFileWriter.VideoOptions["preset"] = "ultrafast";
            videoFileWriter.VideoOptions["tune"] = "zerolatency";

            StartAudioCapture(_audioSourceSelected);

            videoFileWriter.Open(_tempFilePath);
        }

        private void Video_NewFrame(object sender, Accord.Video.NewFrameEventArgs eventArgs)
        {
            DateTime currentFrameTime = eventArgs.CaptureFinished;
            var frameImage = eventArgs.Frame;

            var duration = DateTime.UtcNow - this.lastFrameReceived;
            double currentFrameRate = 0d;
            if (duration.TotalSeconds > 0.0)
            {
                currentFrameRate = 1.0 / duration.TotalSeconds;

                if (this.frameCounts.Count == 10)
                {
                    this.frameCounts.RemoveAt(0);
                }
                this.frameCounts.Add(currentFrameRate);
            }

            this.lastFrameReceived = DateTime.UtcNow;

            if (IsRecording && !_isStoppingRecording)
            {
                // Write to disk
                lock (locker)
                {
                    //if (var != null &&
                    //    this.currentImageRotation != RotateFlipType.RotateNoneFlipNone)
                    //{
                    //    // Make sure the image is the correct rotation.
                    //    var.RotateFlip(this.currentImageRotation);
                    //}

                    TimeSpan timestamp = currentFrameTime - recordingStartTime;
                    //videoFileWriter?.WriteVideoFrame((Bitmap)frameImage.Clone(), timestamp);
                    videoFileWriter?.WriteVideoFrame(frameImage, timestamp);
                }
            }
        }

        private void StartAudioCapture(AudioDeviceInfo audioDeviceInfo)
        {
            StopAudioCapture();

            _audioCaptureDevice = new AudioCaptureDevice(audioDeviceInfo)
            {
                DesiredFrameSize = 4096,
                SampleRate = 44100,
                Format = SampleFormat.Format16Bit
            };

            // Subscribe to both events before starting, so an error raised during start-up is not missed.
            _audioCaptureDevice.NewFrame += Audio_NewFrame;
            _audioCaptureDevice.AudioSourceError += Audio__AudioSourceError;

            _audioCaptureDevice.Start();
        }

        private void Audio__AudioSourceError(object sender, AudioSourceErrorEventArgs e)
        {
            // throw new NotImplementedException();
        }

        private void StopAudioCapture()
        {
            if (_audioCaptureDevice != null && _audioCaptureDevice.IsRunning)
            {
                _audioCaptureDevice.Stop();
                var b = _audioCaptureDevice.IsRunning;
            }
        }

        private void Audio_NewFrame(object sender, Accord.Audio.NewFrameEventArgs eventArgs)
        {
            var currentFrameTime = DateTime.Now;
            // Null-safe check: the writer is accessed with ?. below, so it may be null here as well.
            if (IsRecording && videoFileWriter?.IsOpen == true && !_isStoppingRecording)
            {
                lock (locker)
                {
                    var timestamp = currentFrameTime - recordingStartTime;
                    videoFileWriter?.WriteAudioFrame(eventArgs.Signal, timestamp);
                    //videoFileWriter?.WriteAudioFrame(eventArgs.Signal);
                }
            }
        }
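
The audio handler above stamps each frame with the wall-clock time at which the callback happened to run. A commonly used alternative is to derive the audio timestamp from the number of samples already written, so that consecutive frames are always spaced by exactly their own duration. The sketch below only illustrates that arithmetic. `AudioSampleClock` is a hypothetical helper, not part of Accord.NET, and it assumes that audio starts at the same instant as the recording and that no frames are dropped.

```csharp
using System;

// Hypothetical helper (not an Accord.NET type): converts a running count of
// samples per channel into a timestamp for the next audio frame.
public sealed class AudioSampleClock
{
    private readonly int sampleRate;  // samples per second, e.g. 44100 as configured above
    private long samplesWritten;      // samples per channel handed to the writer so far

    public AudioSampleClock(int sampleRate)
    {
        if (sampleRate <= 0)
            throw new ArgumentOutOfRangeException(nameof(sampleRate));
        this.sampleRate = sampleRate;
    }

    // Returns the timestamp of the frame about to be written, then advances
    // the clock by that frame's length in samples.
    public TimeSpan Next(int samplesInFrame)
    {
        if (samplesInFrame < 0)
            throw new ArgumentOutOfRangeException(nameof(samplesInFrame));

        // A tick is 100 ns. Integer arithmetic keeps the result free of
        // accumulated floating-point error.
        long ticks = samplesWritten * TimeSpan.TicksPerSecond / sampleRate;
        samplesWritten += samplesInFrame;
        return TimeSpan.FromTicks(ticks);
    }
}
```

With the settings above (44100 Hz, 4096 samples per frame), the first call to `Next(4096)` returns zero and each later call advances by roughly 92.9 ms. In the handler, the value returned by `Next` would be passed to `WriteAudioFrame` in place of `currentFrameTime - recordingStartTime`, using the per-channel sample count of the incoming frame. Whether the device actually delivers 4096 samples per frame is an assumption that would need checking.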

mkillebr commented 5 years ago

Code.txt

This might be a little easier to read. The formatting did not go through with the code above.

mkillebr commented 5 years ago

Here is some of the output. I wonder if the problem is that the audio frames are being written at the wrong time:

Output #0, mp4, to '7e04f785-ba9c-4e1b-b3a3-e6e31201a99a.mp4':
  Metadata:
    encoder : Lavf57.56.100
    Stream #0:0: Video: h264, 1 reference frame ([33][0][0][0] / 0x0021), yuv420p, 1600x896 (0x0), q=2-31, 43008 kb/s, 15360 tbn
    Stream #0:1: Audio: aac (LC) ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, delay 1024, 64 kb/s
[mp4 @ 000002714b14aa40] Application provided invalid, non monotonically increasing dts to muxer in stream 1: 26084 >= 26084
[aac @ 000002714b144580] Queue input is backward in time
[libx264 @ 000002714b149a80] frame I:13 Avg QP:16.92 size:228475
[libx264 @ 000002714b149a80] frame P:143 Avg QP:19.90 size: 82810
[libx264 @ 000002714b149a80] mb I I16..4: 100.0% 0.0% 0.0%
[libx264 @ 000002714b149a80] mb P I16..4: 10.9% 0.0% 0.0% P16..4: 86.5% 0.0% 0.0% 0.0% 0.0% skip: 2.6%
[libx264 @ 000002714b149a80] final ratefactor: 15.52
[libx264 @ 000002714b149a80] coded y,uvDC,uvAC intra: 68.4% 99.8% 97.9% inter: 63.6% 89.8% 64.1%
[libx264 @ 000002714b149a80] i16 v,h,dc,p: 22% 27% 39% 12%
[libx264 @ 000002714b149a80] i8c dc,h,v,p: 60% 16% 14% 10%
[libx264 @ 000002714b149a80] kb/s:45575.38
[aac @ 000002714b144580] Qavg: 175.208
[aac @ 000002714b144580] 2 frames left in the queue on closing
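
The muxer warning for stream 1 (`26084 >= 26084`) means two audio packets reached the muxer with the same dts. One way to test whether the application's own timestamps are responsible is to check each one before it is handed to the writer. The sketch below is a minimal guard for that purpose. `MonotonicTimestampGuard` is a hypothetical helper, not an Accord.NET or FFmpeg type, and a repeated dts can also originate inside the encoder, so a clean result here would not rule out other causes.

```csharp
using System;

// Hypothetical helper: remembers the last accepted timestamp for one stream
// and reports whether a new timestamp is strictly later.
public sealed class MonotonicTimestampGuard
{
    private TimeSpan? last;  // null until the first timestamp is accepted

    // Returns true and records the timestamp when it is strictly greater than
    // the previously accepted one. Returns false, leaving the state unchanged,
    // when it repeats or goes backwards.
    public bool TryAccept(TimeSpan timestamp)
    {
        if (last.HasValue && timestamp <= last.Value)
            return false;

        last = timestamp;
        return true;
    }
}
```

A separate guard instance would be kept for the audio and video streams. Logging every rejected timestamp, rather than silently dropping the frame, would show whether the values computed in `Audio_NewFrame` ever repeat or run backwards.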