mrousavy / react-native-vision-camera

📸 A powerful, high-performance React Native Camera library.
https://react-native-vision-camera.com
MIT License

🐛 iOS pause audio/video output incorrect #2790

Open xHeinrich opened 1 month ago

xHeinrich commented 1 month ago

What's happening?

When taking a video and pausing/resuming it on iOS, the audio and video end up chopped into separate segments rather than one continuous video. For example:

  1. Start recording and say "one two three".
  2. Pause recording and say "four five six".
  3. Resume recording and say "seven eight nine".
  4. Stop recording.

Example video output:

https://github.com/mrousavy/react-native-vision-camera/assets/7674587/c50c74ab-23be-404e-a81f-22128227475a

Reproducible Code

Full repo with a minimal reproduction: https://github.com/xHeinrich/vision-camera-reproduction

const camera = useRef<Camera>(null)

const device = useCameraDevice('back', {
    physicalDevices: [
        'ultra-wide-angle-camera',
        'wide-angle-camera',
        'telephoto-camera'
    ]
})

const [targetFps, setTargetFps] = useState(30)

const filters = [
    { fps: targetFps },
    { videoStabilizationMode: 'auto' },
    {
        videoResolution: {
            width: 1280,
            height: 720
        },
    },
    {
        photoResolution: {
            width: 1280,
            height: 720
        },
    }
];

const format = useCameraFormat(device, filters)
const [torchOn, setTorchOn] = useState<'off' | 'on'>('off')
const onError = useCallback((error: any) => {
    console.error(error)
}, [])

<Camera
    device={device}
    zoom={device.neutralZoom}
    ref={camera}
    format={format}
    enableZoomGesture={true}
    exposure={0}
    style={StyleSheet.absoluteFill}
    isActive={isActive}
    torch={torchOn}
    orientation={'portrait'}
    audio={micPermissionState.hasPermission}
    photo={permissionState.hasPermission}
    video={permissionState.hasPermission}
    onError={onError}
/>

camera.current!.takePhoto({
    qualityPrioritization: 'speed',
    enableShutterSound: false
}).then((file: PhotoFile) => {
}).catch((error) => {
    reject(error)
})

// sleep 3 seconds

camera.current.startRecording({
    flash: 'on',
    fileType: 'mp4',
    videoCodec: 'h264',
    videoBitRate: 5, // 5 Mbps target, affected by target fps
    onRecordingFinished: (video: VideoFile) => {
    },
    onRecordingError: (error) => console.error(error)
})

// sleep 3 seconds

await camera.current.pauseRecording()

// sleep 3 seconds

await camera.current.resumeRecording()

// sleep 3 seconds

await camera.current.stopRecording()

Relevant log output

13:23:26.080: [info] :camera_with_flash: VisionCamera.didSetProps(_:): Updating 27 props: [onInitialized, cameraId, position, enableBufferCompression, preview, onStarted, onCodeScanned, collapsable, top, right, isActive, video, onViewReady, onError, onStopped, enableFrameProcessor, format, orientation, left, bottom, audio, enableZoomGesture, exposure, torch, photo, onShutter, zoom]
13:23:26.081: [info] :camera_with_flash: VisionCamera.configure(_:): configure { ... }: Waiting for lock...
13:23:26.082: [info] :camera_with_flash: VisionCamera.configure(_:): configure { ... }: Updating CameraSession Configuration... Difference(inputChanged: true, outputsChanged: true, videoStabilizationChanged: true, orientationChanged: true, formatChanged: true, sidePropsChanged: true, torchChanged: true, zoomChanged: true, exposureChanged: true, audioSessionChanged: true, locationChanged: true)
13:23:26.082: [info] :camera_with_flash: VisionCamera.configureDevice(configuration:): Configuring Input Device...
13:23:26.082: [info] :camera_with_flash: VisionCamera.configureDevice(configuration:): Configuring Camera com.apple.avfoundation.avcapturedevice.built-in_video:7...
13:23:26.086: [info] :camera_with_flash: VisionCamera.configureDevice(configuration:): Successfully configured Input Device!
13:23:26.086: [info] :camera_with_flash: VisionCamera.configureOutputs(configuration:): Configuring Outputs...
13:23:26.086: [info] :camera_with_flash: VisionCamera.configureOutputs(configuration:): Adding Photo output...
13:23:26.088: [info] :camera_with_flash: VisionCamera.configureOutputs(configuration:): Adding Video Data output...
13:23:26.088: [info] :camera_with_flash: VisionCamera.configureOutputs(configuration:): Successfully configured all outputs!
13:23:26.089: [info] :camera_with_flash: VisionCamera.configureFormat(configuration:device:): Configuring Format (2112x1188 | 1280x720@60.0 (ISO: 34.0..3264.0))...
13:23:26.089: [info] :camera_with_flash: VisionCamera.configureFormat(configuration:device:): Successfully configured Format!
13:23:26.090: [info] :camera_with_flash: VisionCamera.getPixelFormat(for:): Available Pixel Formats: ["420v", "420f", "BGRA", "&8v0", "-8v0", "&8f0", "-8f0", "&BGA", "-BGA"], finding best match... (pixelFormat="yuv", enableHdr={false}, enableBufferCompression={true})
13:23:26.090: [info] :camera_with_flash: VisionCamera.getPixelFormat(for:): Using PixelFormat: -8f0...
13:23:26.485: [info] :camera_with_flash: VisionCamera.onCameraStarted(): Camera started!
13:23:26.485: [info] :camera_with_flash: VisionCamera.onSessionInitialized(): Camera initialized!
13:23:26.486: [info] :camera_with_flash: VisionCamera.configure(_:): Beginning AudioSession configuration...
13:23:26.486: [info] :camera_with_flash: VisionCamera.configureAudioSession(configuration:): Configuring Audio Session...
13:23:26.486: [info] :camera_with_flash: VisionCamera.configureAudioSession(configuration:): Adding Audio input...
13:23:26.487: [info] :camera_with_flash: VisionCamera.configure(_:): Beginning Location Output configuration...
13:23:26.490: [info] :camera_with_flash: VisionCamera.configureAudioSession(configuration:): Adding Audio Data output...
13:23:26.491: [info] :camera_with_flash: VisionCamera.configure(_:): Committed AudioSession configuration!
13:23:26.495: [info] :camera_with_flash: VisionCamera.configure(_:): Finished Location Output configuration!
13:23:44.113: [info] :camera_with_flash: VisionCamera.takePhoto(options:promise:): Capturing photo...
13:23:47.175: [info] :camera_with_flash: VisionCamera.startRecording(options:onVideoRecorded:onError:): Starting Video recording...
13:23:47.177: [info] :camera_with_flash: VisionCamera.startRecording(options:onVideoRecorded:onError:): Will record to temporary file: /private/var/mobile/Containers/Data/Application/0F78673F-DB67-4F14-8017-D356A019D118/tmp/ReactNative/01121C78-1A01-4C95-9FFF-38CC93C40AF1.mp4
13:23:47.186: [info] :camera_with_flash: VisionCamera.startRecording(options:onVideoRecorded:onError:): Enabling Audio for Recording...
13:23:47.186: [info] :camera_with_flash: VisionCamera.activateAudioSession(): Activating Audio Session...
13:23:47.194: [info] :camera_with_flash: VisionCamera.initializeAudioWriter(withSettings:format:): Initializing Audio AssetWriter with settings: ["AVSampleRateKey": 48000, "AVNumberOfChannelsKey": 1, "AVFormatIDKey": 1633772320]
13:23:47.194: [info] :camera_with_flash: VisionCamera.updateCategory(_:mode:options:): Changing AVAudioSession category from AVAudioSessionCategoryPlayAndRecord -> AVAudioSessionCategoryPlayAndRecord
13:23:47.357: [info] :camera_with_flash: VisionCamera.updateCategory(_:mode:options:): AVAudioSession category changed!
13:23:47.362: [info] :camera_with_flash: VisionCamera.didSetProps(_:): Updating 1 props: [torch]
13:23:47.362: [info] :camera_with_flash: VisionCamera.configure(_:): configure { ... }: Waiting for lock...
13:23:47.773: [info] :camera_with_flash: VisionCamera.activateAudioSession(): Audio Session activated!
13:23:47.785: [info] :camera_with_flash: VisionCamera.initializeAudioWriter(withSettings:format:): Initialized Audio AssetWriter.
13:23:47.795: [info] :camera_with_flash: VisionCamera.recommendedVideoSettings(forOptions:): Using codec AVVideoCodecType(_rawValue: avc1)...
13:23:47.795: [info] :camera_with_flash: VisionCamera.recommendedVideoSettings(forOptions:): Setting Video Bit-Rate from 14358528.0 bps to 5000000.0 bps...
13:23:47.795: [info] :camera_with_flash: VisionCamera.initializeVideoWriter(withSettings:): Initializing Video AssetWriter with settings: ["AVVideoCompressionPropertiesKey": ["AverageNonDroppableFrameRate": 30, "Priority": 80, "RealTime": 1, "ExpectedFrameRate": 60, "MaxAllowedFrameQP": 41, "H264EntropyMode": CABAC, "MaxKeyFrameIntervalDuration": 1, "AverageBitRate": 5000000, "AllowFrameReordering": 0, "MinAllowedFrameQP": 15, "QuantizationScalingMatrixPreset": 3, "ProfileLevel": H264_High_AutoLevel], "AVVideoWidthKey": 720, "AVVideoHeightKey": 1280, "AVVideoCodecKey": avc1]
13:23:47.819: [info] :camera_with_flash: VisionCamera.initializeVideoWriter(withSettings:): Initialized Video AssetWriter.
13:23:47.819: [info] :camera_with_flash: VisionCamera.start(clock:): Starting Asset Writer(s)...
13:23:48.441: [info] :camera_with_flash: VisionCamera.start(clock:): Asset Writer(s) started!
13:23:48.442: [info] :camera_with_flash: VisionCamera.start(clock:): Started RecordingSession at time: 127794.095554541
13:23:48.442: [info] :camera_with_flash: VisionCamera.startRecording(options:onVideoRecorded:onError:): RecordingSesssion started in 1266.5345ms!
13:23:48.443: [info] :camera_with_flash: VisionCamera.configure(_:): configure { ... }: Updating CameraSession Configuration... Difference(inputChanged: false, outputsChanged: false, videoStabilizationChanged: false, orientationChanged: false, formatChanged: false, sidePropsChanged: false, torchChanged: true, zoomChanged: false, exposureChanged: false, audioSessionChanged: false, locationChanged: false)
13:24:00.905: [info] :camera_with_flash: VisionCamera.stop(clock:): Requesting stop at 127806.559768166 seconds for AssetWriter with status "writing"...
13:24:00.988: [info] :camera_with_flash: VisionCamera.appendBuffer(_:clock:type:): Successfully appended last audio Buffer (at 127806.56072916667 seconds), finishing RecordingSession...
13:24:00.988: [info] :camera_with_flash: VisionCamera.finish(): Stopping AssetWriter with status "writing"...
13:24:01.008: [info] :camera_with_flash: VisionCamera.startRecording(options:onVideoRecorded:onError:): RecordingSession finished with status completed.
13:24:01.008: [info] :camera_with_flash: VisionCamera.deactivateAudioSession(): Deactivating Audio Session...
13:24:01.013: [info] :camera_with_flash: VisionCamera.deactivateAudioSession(): Audio Session deactivated!
13:24:01.034: [info] :camera_with_flash: VisionCamera.didSetProps(_:): Updating 1 props: [torch]
13:24:01.034: [info] :camera_with_flash: VisionCamera.configure(_:): configure { ... }: Waiting for lock...
13:24:01.034: [info] :camera_with_flash: VisionCamera.configure(_:): configure { ... }: Updating CameraSession Configuration... Difference(inputChanged: false, outputsChanged: false, videoStabilizationChanged: false, orientationChanged: false, formatChanged: false, sidePropsChanged: false, torchChanged: true, zoomChanged: false, exposureChanged: false, audioSessionChanged: false, locationChanged: false)

Camera Device

{
 "id": "com.apple.avfoundation.avcapturedevice.built-in_video:7",
 "formats": [],
 "hasFlash": true,
 "name": "Back Triple Camera",
 "minExposure": -8,
 "neutralZoom": 2,
 "physicalDevices": [
  "ultra-wide-angle-camera",
  "wide-angle-camera",
  "telephoto-camera"
 ],
 "supportsFocus": true,
 "supportsRawCapture": false,
 "isMultiCam": true,
 "minZoom": 1,
 "minFocusDistance": 2,
 "maxZoom": 61.875,
 "maxExposure": 8,
 "supportsLowLightBoost": false,
 "sensorOrientation": "landscape-right",
 "position": "back",
 "hardwareLevel": "full",
 "hasTorch": true
}

Device

iPhone 13 Pro

VisionCamera Version

4.0.0

Can you reproduce this issue in the VisionCamera Example app?

Yes, I can reproduce the same issue in the Example app here

Additional information

thanhtungkhtn commented 3 weeks ago

Any update on this? I'm facing a similar issue.

"react": "18.2.0",
"react-native": "0.74.1",
"react-native-vision-camera": "^4.0.3"

xHeinrich commented 3 weeks ago

I think it must be some timing issue with the asset writer, but I don't know enough Swift to find the actual issue.

qper228 commented 3 days ago

same here

lee-byeoksan commented 1 day ago

This is my naive workaround patch for 3.9.2, only tested on an iPhone 12 Pro. I'm not an expert on iOS or Swift. Even though we pause recording, the captureSession's clock keeps running (we cannot stop the captureSession because we still need to show the camera preview to the user). AVAssetWriter seems to only consider the timestamps recorded in the CMSampleBuffers, so the idea is to adjust the timestamps in the buffers.

Here is a demo video of the same test the author did:

https://github.com/mrousavy/react-native-vision-camera/assets/14037793/13a6d5dc-4422-4b30-abe7-2d84040b5ef5

diff --git a/ios/Core/CameraSession+Video.swift b/ios/Core/CameraSession+Video.swift
index 00ff941b1d4cee15323f1f960a19a14613acab01..69e57e4092d99104793b994e9273a37dd301c18f 100644
--- a/ios/Core/CameraSession+Video.swift
+++ b/ios/Core/CameraSession+Video.swift
@@ -157,11 +157,12 @@ extension CameraSession {
   func pauseRecording(promise: Promise) {
     CameraQueues.cameraQueue.async {
       withPromise(promise) {
-        guard self.recordingSession != nil else {
+        guard let recordingSession = self.recordingSession else {
           // there's no active recording!
           throw CameraError.capture(.noRecordingInProgress)
         }
         self.isRecording = false
+        try recordingSession.pause(clock: self.captureSession.clock)
         return nil
       }
     }
@@ -173,11 +174,12 @@ extension CameraSession {
   func resumeRecording(promise: Promise) {
     CameraQueues.cameraQueue.async {
       withPromise(promise) {
-        guard self.recordingSession != nil else {
+        guard let recordingSession = self.recordingSession else {
           // there's no active recording!
           throw CameraError.capture(.noRecordingInProgress)
         }
         self.isRecording = true
+        try recordingSession.resume(clock: self.captureSession.clock)
         return nil
       }
     }
diff --git a/ios/Core/RecordingSession.swift b/ios/Core/RecordingSession.swift
index 85e9c622573143bd38f0b0ab6f81ad2f40e03cc3..8c4836c97b562bbda362c14f314a0ce96f113d2a 100644
--- a/ios/Core/RecordingSession.swift
+++ b/ios/Core/RecordingSession.swift
@@ -33,6 +33,8 @@ class RecordingSession {

   private var startTimestamp: CMTime?
   private var stopTimestamp: CMTime?
+  private var pauseTimestamp: CMTime?
+  private var pauseTimestampOffset: CMTime?

   private var lastWrittenTimestamp: CMTime?

@@ -67,7 +69,12 @@ class RecordingSession {
           let startTimestamp = startTimestamp else {
       return 0.0
     }
-    return (lastWrittenTimestamp - startTimestamp).seconds
+
+    if let pauseTimestampOffset = pauseTimestampOffset {
+      return (lastWrittenTimestamp - startTimestamp - pauseTimestampOffset).seconds
+    } else {
+      return (lastWrittenTimestamp - startTimestamp).seconds
+    }
   }

   init(url: URL,
@@ -158,6 +165,8 @@ class RecordingSession {
     // Start the sesssion at the given time. Frames with earlier timestamps (e.g. late frames) will be dropped.
     assetWriter.startSession(atSourceTime: currentTime)
     startTimestamp = currentTime
+    pauseTimestamp = nil
+    pauseTimestampOffset = nil
     ReactLogger.log(level: .info, message: "Started RecordingSession at time: \(currentTime.seconds)")

     if audioWriter == nil {
@@ -195,6 +204,56 @@ class RecordingSession {
     }
   }

+  /**
+   Record pause timestamp to calculate timestamp offset using the current time of the provided synchronization clock.
+   The clock must be the same one that was passed to start() method.
+   */
+  func pause(clock: CMClock) throws {
+    lock.wait()
+    defer {
+      lock.signal()
+    }
+
+    let currentTime = CMClockGetTime(clock)
+    ReactLogger.log(level: .info, message: "Pausing Asset Writer(s)...")
+
+    guard pauseTimestamp == nil else {
+      ReactLogger.log(level: .error, message: "pauseTimestamp is already non-nil")
+      return
+    }
+
+    pauseTimestamp = currentTime
+  }
+
+  /**
+   Update pause timestamp offset using the current time of the provided synchronization clock.
+   The clock must be the same one that was passed to start() method.
+   */
+  func resume(clock: CMClock) throws {
+    lock.wait()
+    defer {
+      lock.signal()
+    }
+
+    let currentTime = CMClockGetTime(clock)
+    ReactLogger.log(level: .info, message: "Resuming Asset Writer(s)...")
+
+    guard let pauseTimestamp = pauseTimestamp else {
+      ReactLogger.log(level: .error, message: "Tried resume but recording has not been paused")
+      return
+    }
+
+    let pauseOffset = currentTime - pauseTimestamp
+    self.pauseTimestamp = nil
+    if let currentPauseTimestampOffset = pauseTimestampOffset {
+      pauseTimestampOffset = currentPauseTimestampOffset + pauseOffset
+      ReactLogger.log(level: .info, message: "Current pause offset is \(pauseTimestampOffset!.seconds)")
+    } else {
+      pauseTimestampOffset = pauseOffset
+      ReactLogger.log(level: .info, message: "Current pause offset is \(pauseTimestampOffset!.seconds)")
+    }
+  }
+
   /**
    Appends a new CMSampleBuffer to the Asset Writer.
    - Use clock to specify the CMClock instance this CMSampleBuffer uses for relative time
@@ -238,12 +297,32 @@ class RecordingSession {
     }

     // 3. Actually write the Buffer to the AssetWriter
+    let buf: CMSampleBuffer
+    if let pauseTimestampOffset = pauseTimestampOffset {
+      // let newTime = timestamp - pauseTimestampOffset
+      var count: CMItemCount = 0
+      CMSampleBufferGetSampleTimingInfoArray(buffer, entryCount: 0, arrayToFill: nil, entriesNeededOut: &count)
+      var info = [CMSampleTimingInfo](repeating: CMSampleTimingInfo(duration: CMTimeMake(value: 0, timescale: 0), presentationTimeStamp: CMTimeMake(value: 0, timescale: 0), decodeTimeStamp: CMTimeMake(value: 0, timescale: 0)), count: count)
+      CMSampleBufferGetSampleTimingInfoArray(buffer, entryCount: count, arrayToFill: &info, entriesNeededOut: &count)
+
+      for i in 0..<count {
+        info[i].decodeTimeStamp = info[i].decodeTimeStamp - pauseTimestampOffset
+        info[i].presentationTimeStamp = info[i].presentationTimeStamp - pauseTimestampOffset
+      }
+
+      var out: CMSampleBuffer?
+      CMSampleBufferCreateCopyWithNewTiming(allocator: nil, sampleBuffer: buffer, sampleTimingEntryCount: count, sampleTimingArray: &info, sampleBufferOut: &out)
+      buf = out!
+    } else {
+      buf = buffer
+    }
     let writer = getAssetWriter(forType: bufferType)
     guard writer.isReadyForMoreMediaData else {
       ReactLogger.log(level: .warning, message: "\(bufferType) AssetWriter is not ready for more data, dropping this Frame...")
       return
     }
-    writer.append(buffer)
+    writer.append(buf)
+    ReactLogger.log(level: .info, message: "append \(bufferType) Buffer (at \(timestamp.seconds) seconds)...")
     lastWrittenTimestamp = timestamp

     // 4. If we failed to write the frames, stop the Recording

My concerns about this workaround are:

  1. Because only the latest pause and resume timestamps are considered, there could be a race condition due to out-of-order buffer processing (I guess this is rare).
  2. The only way I found to change the timestamps of a buffer is to copy it (sketched in isolation below), and I am not sure how much that affects performance.
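
For reference, here is the buffer retiming factored out into a standalone helper. This is only a minimal sketch: retimed(_:by:) is a hypothetical name and not part of VisionCamera, but the CoreMedia calls are the same ones the patch above uses.

import CoreMedia

/// Hypothetical helper: returns a copy of `buffer` whose timing entries are
/// shifted back by `offset` (the accumulated pause duration). A copy is needed
/// because the timing of an existing CMSampleBuffer cannot be mutated in place.
func retimed(_ buffer: CMSampleBuffer, by offset: CMTime) -> CMSampleBuffer? {
  // 1. Ask how many timing entries the buffer carries.
  var count: CMItemCount = 0
  CMSampleBufferGetSampleTimingInfoArray(buffer, entryCount: 0, arrayToFill: nil, entriesNeededOut: &count)

  // 2. Read the timing entries.
  var timingInfo = [CMSampleTimingInfo](repeating: CMSampleTimingInfo(), count: count)
  CMSampleBufferGetSampleTimingInfoArray(buffer, entryCount: count, arrayToFill: &timingInfo, entriesNeededOut: &count)

  // 3. Shift the presentation (and, if present, decode) timestamps back by the pause offset.
  for i in 0..<count {
    if timingInfo[i].presentationTimeStamp.isValid {
      timingInfo[i].presentationTimeStamp = timingInfo[i].presentationTimeStamp - offset
    }
    if timingInfo[i].decodeTimeStamp.isValid {
      timingInfo[i].decodeTimeStamp = timingInfo[i].decodeTimeStamp - offset
    }
  }

  // 4. Create a copy of the buffer that carries the adjusted timing.
  var copy: CMSampleBuffer?
  let status = CMSampleBufferCreateCopyWithNewTiming(allocator: kCFAllocatorDefault,
                                                     sampleBuffer: buffer,
                                                     sampleTimingEntryCount: count,
                                                     sampleTimingArray: &timingInfo,
                                                     sampleBufferOut: &copy)
  return status == noErr ? copy : nil
}

appendBuffer would then write retimed(buffer, by: pauseTimestampOffset) ?? buffer instead of the original buffer whenever a pause offset has accumulated, which is exactly what the if let pauseTimestampOffset branch in the patch does inline.
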
lee-byeoksan commented 1 day ago

I found a relevant PR, but it's closed: https://github.com/mrousavy/react-native-vision-camera/pull/1546