google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://ai.google.dev/edge/mediapipe
Apache License 2.0

OBS Virtual Camera not working with the Source Picker provided by control-utils package #4478

Open AliceHincu opened 1 year ago

AliceHincu commented 1 year ago

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

Yes

OS Platform and Distribution

Windows 10, React 18

MediaPipe version

@mediapipe/control_utils@0.6.1675466023, @mediapipe/drawing_utils@0.3.1675466124, @mediapipe/holistic_utils@0.5.1675471629

Bazel version

No response

Solution

Holistic

Programming Language and version

TypeScript

Describe the actual behavior

I want to connect my phone camera to my laptop and use it as a webcam. I converted the example code to TypeScript (provided below). The SourcePicker sees both cameras, but when I select the OBS one, the source doesn't change. I tried the same camera with Discord, and Discord showed the correct source, so OBS is not the cause. On the React side, I also tried the Webcam component from the react-webcam library, and that switched correctly (its code is included, commented out). Since the SourcePicker is provided by MediaPipe, I don't know whether the source not changing is caused on your side or mine.

Describe the expected behaviour

The video input should switch to the selected source; instead it stays on the laptop's front camera.

Standalone code/steps you may have used to try to get what you need

For the OBS setup, start the virtual camera and check that it works with Discord; for the phone you can install the DroidCamOBS plugin (see the plugin's site for setup). Honestly, if you have any other camera available, you can probably reproduce this with that one; I don't think the OBS camera itself matters much.
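
As a sanity check that the browser itself exposes the OBS device (plain Web API, nothing MediaPipe-specific), a sketch along these lines should list "OBS Virtual Camera" among the video inputs:

// Sanity-check sketch: list every video input the browser can see.
// Labels are only populated after camera permission has been granted once.
const listVideoInputs = async (): Promise<void> => {
  const probe = await navigator.mediaDevices.getUserMedia({ video: true });
  probe.getTracks().forEach((track) => track.stop()); // release the probe stream

  const devices = await navigator.mediaDevices.enumerateDevices();
  devices
    .filter((device) => device.kind === "videoinput")
    .forEach((device) => console.log(device.label, device.deviceId));
};

listVideoInputs();

This is the TypeScript conversion mentioned above: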

// Excerpt: the SourcePicker wiring (holistic, canvasDimensions, loading, etc. come from the elided control-panel setup).
export const Component = () => {
  // References for video capturing
  const videoElement = useRef<HTMLVideoElement>(null);
  // const cameraElement = useRef<Webcam>(null); // for the react-webcam variant
  const canvasElement = useRef<HTMLCanvasElement>(null);
  const controlsElement = useRef<HTMLDivElement>(null);

  const onResults = (results: any) => { console.log(results); };
  // const [deviceId, setDeviceId] = useState(""); // for the react-webcam variant

  // ControlPanel source code... This is for the Source Picker:
  new SourcePicker({
    onSourceChanged: async (name: string) => {
      // Reset because the pose gives better results when reset between source changes.
      holistic.reset();

      // Request a MediaStream from the new camera (the source name is used as the deviceId).
      const stream = await navigator.mediaDevices.getUserMedia({
        video: {
          deviceId: { exact: name },
        },
      });

      // Stop the old stream's tracks (if any), assign the new stream, and start playback.
      if (videoElement.current) {
        const oldStream = videoElement.current.srcObject as MediaStream | null;
        oldStream?.getTracks().forEach((track) => track.stop());
        videoElement.current.srcObject = stream;
        videoElement.current.play();
      }
      // setDeviceId(name); // only needed for the react-webcam variant
    },
    onFrame: async (input, size) => {
      // Match the output canvas to the incoming frame size, then run Holistic on the frame.
      const { width, height } = canvasDimensions(size);
      if (canvasElement.current) {
        canvasElement.current.width = width;
        canvasElement.current.height = height;
      }
      await holistic.send({ image: input });
    },
  }),
  // ...finish control panel code

  return (
    <div>
      <div className="container">
        <video ref={videoElement} className="input_video"></video>
        {/* <Webcam
          audio={false}
          ref={cameraElement}
          videoConstraints={{
            deviceId: deviceId ? { exact: deviceId } : undefined,
          }}
        /> */}
        <div className="canvas-container">
          <canvas ref={canvasElement} className="output_canvas" width="1280" height="720" />
        </div>
        <Spinner loading={loading}></Spinner>
      </div>
      <div ref={controlsElement} className="control-panel"></div>
    </div>
  );
};

Other info / Complete Logs

No response

AliceHincu commented 1 year ago

Update: I was mistaken, the react-webcam solution is not working either. It does change the video input shown to the user, but the onFrame callback is still receiving frames from the front camera.
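
If the SourcePicker keeps feeding onFrame from its own internal stream (an assumption; I haven't traced the control_utils internals), a possible workaround is to skip onFrame and pump frames into Holistic from the video element whose srcObject you control yourself, roughly like this (holistic and videoElement as in the snippet above):

// Workaround sketch: drive holistic.send() from our own <video> element so a
// source switch is actually picked up; bypasses SourcePicker's onFrame entirely.
const pumpFrames = async () => {
  // readyState >= 2 (HAVE_CURRENT_DATA) means a frame is available to sample.
  if (videoElement.current && videoElement.current.readyState >= 2) {
    await holistic.send({ image: videoElement.current });
  }
  requestAnimationFrame(pumpFrames);
};
requestAnimationFrame(pumpFrames);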

AliceHincu commented 1 year ago

@kuaashish @ayushgdev are there any updates on this?

AliceHincu commented 1 year ago

I eventually wrote my own code. It does not use the SourcePicker component, but it centers the video, puts the canvas on top of the video, and allows the user to switch between cameras. Here are all the components and the file hierarchy:

The Spinner component shows a loading spinner until the video is ready. CameraSelect is the dropdown from which the user chooses between the available cameras. PoseEstimator obtains the landmarks. VideoPlayer shows the video captured by the camera. VideoHandler has the VideoPlayer and CameraSelect components as children. CanvasHandler puts the canvas on top of the video, and canvas-utils.ts has a function for drawing on the canvas.
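
A plausible layout, inferred from the import paths in the snippets below:

src/
├── hooks/
│   └── useCamera.ts
├── utils/
│   └── pose-utils.ts
└── components/
    ├── PostureProcessing.tsx
    ├── video/
    │   ├── VideoHandler.tsx
    │   ├── VideoPlayer.tsx
    │   ├── PoseEstimator.tsx
    │   └── CameraSelect.tsx
    ├── canvas/
    │   ├── CanvasHandler.tsx
    │   └── canvas-utils.ts
    └── ui/
        └── Spinner.tsx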

// PostureProcessing.tsx
import React, { useRef } from "react";
import { VideoHandler } from "./video/VideoHandler";
import { CanvasHandler } from "./canvas/CanvasHandler";

export const PostureProcessing = () => {
  const poseResultsRef = useRef<any>(null);
  const videoRef = useRef<HTMLVideoElement>(null);

  const onFrameResult = (results: any) => {
    poseResultsRef.current = results;
  };

  const getPoseResults = () => poseResultsRef.current;

  return (
    <div className="App">
      <div
        style={{
          backgroundColor: "#596e73",
          display: "flex",
          justifyContent: "center",
          alignItems: "center",
          padding: "0 10%", // Padding 10% on left and right
          height: "100vh",
          boxSizing: "border-box", // Include padding in height calculation
          position: "relative",
        }}
      >
        <VideoHandler onFrameResult={onFrameResult} videoRef={videoRef}></VideoHandler>
        <CanvasHandler getPoseResults={getPoseResults} videoRef={videoRef} />
      </div>
    </div>
  );
};
// VideoHandler.tsx
import React from "react";
import CameraSelect from "./CameraSelect";
import useCamera from "../../hooks/useCamera";
import { VideoPlayer } from "./VideoPlayer";

interface VideoHandlerProps {
  onFrameResult: (results: any) => void; // receives MediaPipe pose results
  videoRef: React.RefObject<HTMLVideoElement>;
}

export const VideoHandler = ({ onFrameResult, videoRef }: VideoHandlerProps) => {
  const { devices, deviceId, setDeviceId, onUserMedia } = useCamera();

  return (
    <>
      <CameraSelect devices={devices} onChange={setDeviceId} />
      <VideoPlayer deviceId={deviceId} onUserMedia={onUserMedia} onFrameResult={onFrameResult} videoRef={videoRef} />
    </>
  );
};
// VideoPlayer.tsx
import React, { useEffect, useRef, useState } from "react";
import { PoseEstimator } from "./PoseEstimator";
import { Spinner } from "../ui/Spinner";

interface VideoPlayerProps {
  onUserMedia: () => void;
  deviceId: string;
  onFrameResult: (results: any) => void; // receives MediaPipe pose results
  videoRef: React.RefObject<HTMLVideoElement>;
}

// This component is responsible for getting the video feed
export const VideoPlayer = ({ onUserMedia, deviceId, onFrameResult, videoRef }: VideoPlayerProps) => {
  // const videoRef = useRef<HTMLVideoElement>(null);
  const [loading, setLoading] = useState(true); // hide the spinner once the stream is attached

  useEffect(() => {
    const getMedia = async () => {
      const constraints = {
        video: {
          aspectRatio: 4 / 3,
          facingMode: "user",
          width: { min: 256 },
          height: { min: 144 },
          deviceId: deviceId ? { exact: deviceId } : undefined,
        },
      };

      try {
        const stream = await navigator.mediaDevices.getUserMedia(constraints);
        if (videoRef.current) {
          videoRef.current.srcObject = stream;
          onUserMedia();
          setLoading(false);
        }
      } catch (err) {
        console.error("Error accessing media devices.", err);
      }
    };

    getMedia();

    return () => {
      // Stop all tracks of the current stream (if any) when switching devices or unmounting.
      const stream = videoRef.current?.srcObject as MediaStream | null;
      stream?.getTracks().forEach((track) => track.stop());
    };
  }, [deviceId, onUserMedia]);

  return (
    <div
      style={{
        display: "flex",
        justifyContent: "center",
        alignItems: "center",
        position: "absolute",
        top: 0,
        left: 0,
        width: "100%",
        height: "100%",
        zIndex: 1,
      }}
    >
      <video
        width="640"
        height="480"
        style={{
          objectFit: "cover", // Ensures the aspect ratio is maintained
        }}
        autoPlay
        playsInline
        muted
        ref={videoRef}
      />
      <PoseEstimator videoRef={videoRef} onFrameResult={onFrameResult} />
      <div className="container">
        <Spinner loading={loading}></Spinner>
      </div>
    </div>
  );
};
// PoseEstimator.tsx
import React, { useEffect } from "react";
import { Pose } from "@mediapipe/pose";

interface PoseEstimatorProps {
  videoRef: React.RefObject<HTMLVideoElement>;
  onFrameResult: (results: any) => void;
}

// Tells MediaPipe where to load its runtime assets from
// (lives in utils/pose-utils.ts in the layout above; inlined here for clarity).
const poseConfig = {
  locateFile: (file: string) => {
    return `https://cdn.jsdelivr.net/npm/@mediapipe/pose@latest/${file}`;
  },
};

// This component is responsible for getting the pose estimation
export const PoseEstimator = ({ videoRef, onFrameResult }: PoseEstimatorProps) => {
  useEffect(() => {
    const pose = new Pose(poseConfig);
    pose.setOptions({ modelComplexity: 1 });
    pose.onResults(onFrameResult);

    let animationFrameId: number;

    const processFrame = async () => {
      if (!videoRef.current) {
        return;
      }

      // Copy the current video frame onto an offscreen canvas and send it to Pose.
      const { current: video } = videoRef;
      const canvas = document.createElement("canvas");
      const context = canvas.getContext("2d");
      if (!context) {
        return;
      }

      canvas.width = video.videoWidth;
      canvas.height = video.videoHeight;
      context.drawImage(video, 0, 0, canvas.width, canvas.height);

      await pose.send({ image: canvas });

      animationFrameId = requestAnimationFrame(processFrame);
    };

    if (videoRef.current) {
      videoRef.current.onloadedmetadata = () => {
        if (videoRef.current) {
          videoRef.current.play();
          animationFrameId = requestAnimationFrame(processFrame);
        }
      };
    }

    return () => {
      // Stop the frame loop and free the solution's resources on unmount.
      cancelAnimationFrame(animationFrameId);
      pose.close();
    };
  }, [videoRef, onFrameResult]);

  return null;
};
// Spinner.tsx
import { CSSProperties } from "react";
import { MoonLoader } from "react-spinners";

interface SpinnerInterface {
  loading: boolean;
}

export const Spinner = ({ loading }: SpinnerInterface) => {
  const style: CSSProperties = { position: "fixed", top: "50%", left: "50%", transform: "translate(-50%, -50%)" };

  return (
    <div style={style}>
      <MoonLoader loading={loading} color="white" size={100}></MoonLoader>
      {loading ? <div style={{ display: "flex", justifyContent: "center", marginTop: "5px" }}>Loading</div> : null}
    </div>
  );
};
// CameraSelect.tsx
import React from "react";

interface CameraSelectProps {
  devices: { deviceId: string; label?: string }[];
  onChange: (deviceId: string) => void;
}

const CameraSelect = ({ devices, onChange }: CameraSelectProps) => {
  const cameras = devices.map((device, index) => (
    <option key={device.deviceId || index} value={device.deviceId}>
      {device.label || `Camera ${index + 1}`}
    </option>
  ));

  return (
    <select
      style={{
        position: "absolute",
        top: "10%",
        left: "50%",
        transform: "translateX(-50%)",
        zIndex: 1,
      }}
      onChange={(event) => onChange(event.target.value)}
    >
      {cameras}
    </select>
  );
};

export default CameraSelect;
// useCamera.ts
import { useState, useEffect, useCallback } from "react";

interface Device {
  deviceId: string;
  label?: string;
  kind: string;
}

const useCamera = () => {
  const [devices, setDevices] = useState<Device[]>([]);
  const [deviceId, setDeviceId] = useState<string>("");
  const [userMediaGranted, setUserMediaGranted] = useState<boolean>(false);

  const fetchDevices = useCallback(() => {
    navigator.mediaDevices.enumerateDevices().then((deviceList) => {
      setDevices(deviceList.filter((device) => device.kind === "videoinput"));
    });
  }, []);

  useEffect(() => {
    fetchDevices();
  }, [fetchDevices]);

  useEffect(() => {
    if (userMediaGranted) {
      fetchDevices();
    }
  }, [userMediaGranted, fetchDevices]);

  const onUserMedia = useCallback(() => {
    setUserMediaGranted(true);
  }, []);

  return { devices, deviceId, setDeviceId, onUserMedia };
};

export default useCamera;
// CanvasHandler.tsx
import React, { useEffect, useRef, useState } from "react";
import { drawOnCanvas } from "./canvas-utils";

interface CanvasHandlerProps {
  getPoseResults: () => any;
  videoRef: any;
}

export const CanvasHandler = ({ getPoseResults, videoRef }: CanvasHandlerProps) => {
  const canvasRef = useRef<HTMLCanvasElement>(null);
  const [context, setContext] = useState<CanvasRenderingContext2D | null>(null);

  useEffect(() => {
    if (!canvasRef.current) return;

    // Refs aren't reactive, so grab the 2D context once after mount,
    // when the canvas element has been attached.
    const newContext = canvasRef.current.getContext("2d");
    setContext(newContext);
  }, []);

  useEffect(() => {
    const handleResize = () => {
      if (canvasRef.current && videoRef.current) {
        canvasRef.current.width = videoRef.current.offsetWidth;
        canvasRef.current.height = videoRef.current.offsetHeight;
      }
    };

    handleResize();
    window.addEventListener("resize", handleResize);

    return () => {
      window.removeEventListener("resize", handleResize);
    };
  }, [canvasRef, videoRef]);

  useEffect(() => {
    if (!context) return;

    let animationFrameId: number; // Declare this variable to hold the requestAnimationFrame ID

    const drawFrame = () => {
      const poseResults = getPoseResults();

      if (!poseResults || !canvasRef.current) {
        animationFrameId = requestAnimationFrame(drawFrame);
        return;
      }

      drawOnCanvas(poseResults, context, canvasRef);

      animationFrameId = requestAnimationFrame(drawFrame);
    };

    animationFrameId = requestAnimationFrame(drawFrame);

    return () => {
      cancelAnimationFrame(animationFrameId); // Cancel the requestAnimationFrame using the ID
    };
  }, [getPoseResults, context]);

  return (
    <div
      style={{
        display: "flex",
        justifyContent: "center",
        alignItems: "center",
        position: "absolute",
        top: 0,
        left: 0,
        width: "100%",
        height: "100%",
        zIndex: 2,
      }}
    >
      <canvas
        style={{
          display: "block",
        }}
        ref={canvasRef}
      />
    </div>
  );
};
// canvas-utils.ts
import { drawConnectors, drawLandmarks } from "@mediapipe/drawing_utils";
import { POSE_CONNECTIONS, POSE_LANDMARKS_LEFT, POSE_LANDMARKS_RIGHT } from "@mediapipe/pose";

export const drawOnCanvas = (poseResults: any, context: CanvasRenderingContext2D, canvasRef: any) => {
  if (!context || !canvasRef.current) return;

  context.save();
  context.clearRect(0, 0, canvasRef.current.width, canvasRef.current.height);

  // poseLandmarks is undefined when no person is detected; clear the canvas and bail.
  if (!poseResults.poseLandmarks) {
    context.restore();
    return;
  }

  drawConnectors(context, poseResults.poseLandmarks, POSE_CONNECTIONS, {
    color: "white",
  });
  drawLandmarks(
    context,
    Object.values(POSE_LANDMARKS_LEFT).map((index) => poseResults.poseLandmarks[index]),
    { visibilityMin: 0.65, color: "white", fillColor: "rgb(255,138,0)" }
  );
  drawLandmarks(
    context,
    Object.values(POSE_LANDMARKS_RIGHT).map((index) => poseResults.poseLandmarks[index]),
    { visibilityMin: 0.65, color: "white", fillColor: "rgb(0,217,231)" }
  );
  context.restore();
};