google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://ai.google.dev/edge/mediapipe
Apache License 2.0

How to use PoseLandmarker and AudioClassifier simultaneously? #5219

Open niubitily opened 7 months ago

niubitily commented 7 months ago

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

None

OS Platform and Distribution

ArchLinux 6.7.3

MediaPipe Tasks SDK version

0.20230731

Task name (e.g. Image classification, Gesture recognition etc.)

tasks-vision, tasks-audio

Programming Language and version (e.g. C++, Python, Java)

Kotlin

Describe the actual behavior

The error occurs when calling createFromOptions for the second time.

Describe the expected behaviour

Use PoseLandmarker and AudioClassifier simultaneously.

Standalone code/steps you may have used to try to get what you need

        poseLandmarkerBgExecutor = Executors.newSingleThreadExecutor()
        poseLandmarkerBgExecutor.execute {
            poseLandmarkerHelper = PoseLandmarkerHelper(
                context = requireContext(),
                runningMode = RunningMode.LIVE_STREAM,
                minPoseDetectionConfidence = viewModel.currentMinPoseDetectionConfidence,
                minPoseTrackingConfidence = viewModel.currentMinPoseTrackingConfidence,
                minPosePresenceConfidence = viewModel.currentMinPosePresenceConfidence,
                currentDelegate = viewModel.currentDelegate,
                poseLandmarkerHelperListener = object: PoseLandmarkerHelper.LandmarkerListener {
                    override fun onPoseLandmarkerError(error: String, errorCode: Int) {
                        activity?.runOnUiThread {
                            Toast.makeText(requireContext(), error, Toast.LENGTH_SHORT).show()
                            if (errorCode == PoseLandmarkerHelper.GPU_ERROR) {
                                fragmentCameraBinding.bottomSheetLayout.spinnerDelegate.setSelection(
                                    PoseLandmarkerHelper.DELEGATE_CPU, false
                                )
                            }
                        }
                    }

                    override fun onPoseLandmarkerResults(resultBundle: PoseLandmarkerHelper.ResultBundle) {
                        activity?.runOnUiThread {
                            if (_fragmentCameraBinding != null) {
                                fragmentCameraBinding.bottomSheetLayout.inferenceTimeVal.text =
                                    String.format("%d ms", resultBundle.inferenceTime)

                                // Pass necessary information to OverlayView for drawing on the canvas
                                fragmentCameraBinding.overlay.setResults(
                                    resultBundle.results.first(),
                                    resultBundle.inputImageHeight,
                                    resultBundle.inputImageWidth,
                                    RunningMode.LIVE_STREAM
                                )

                                // Force a redraw
                                fragmentCameraBinding.overlay.invalidate()
                            }
                        }
                    }
                }
            )
        }

        audioClassifierBgExecutor = Executors.newSingleThreadExecutor()
        audioClassifierBgExecutor.execute {
            // Give the PoseLandmarker a head start before creating the audio task
            Thread.sleep(1000)
            audioClassifierHelper = AudioClassifierHelper(
                context = requireContext(),
                runningMode = com.google.mediapipe.tasks.audio.core.RunningMode.AUDIO_STREAM,
                classificationThreshold = 0.3f,
                overlap = 2,
                numOfResults = 2,
                audioClassifierListener = object: AudioClassifierHelper.ClassifierListener {
                    override fun onError(error: String) {
                        activity?.runOnUiThread {
                            Toast.makeText(requireContext(), error, Toast.LENGTH_SHORT).show()
                        }
                    }

                    override fun onResult(resultBundle: AudioClassifierHelper.ResultBundle) {
                        resultBundle.results[0].classificationResults().first()
                            .classifications()?.get(0)?.categories()?.let { categories ->
                                // Log each category with its index, name, and score
                                categories.forEachIndexed { i, category ->
                                    MyLogger.log("${LOG_TAG}.onResult--${i}, Category: ${category.categoryName()}, Probability: ${category.score()}")
                                }
                            }
                    }
                }
            )
        }

Other info / Complete Logs

The error message is as follows:
com.google.mediapipe.framework.MediaPipeException: not found: ValidatedGraphConfig Initialization failed. No registered object with name: mediapipe::tasks::audio::audio_classifier::AudioClassifierGraph; Unable to find Calculator "mediapipe.tasks.audio.audio_classifier.AudioClassifierGraph"
kuaashish commented 7 months ago

Hi @niubitily,

This message indicates that your application lacks the code for "mediapipe.tasks.audio.audio_classifier.AudioClassifierGraph".

Please review your dependencies to ensure the required code is linked correctly.

Would you mind verifying the dependency on 'com.google.mediapipe:tasks-vision'? For additional details, please visit: https://developers.google.com/mediapipe/solutions/vision/face_detector/android#dependencies.
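
For reference, a sketch of the dependency block that page describes; the version keyword here is a placeholder (latest.release resolves to the newest published artifact):

dependencies {
    // Vision tasks such as PoseLandmarker
    implementation 'com.google.mediapipe:tasks-vision:latest.release'
    // Audio tasks such as AudioClassifier
    implementation 'com.google.mediapipe:tasks-audio:latest.release'
}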

Thank you!!

niubitily commented 7 months ago

> Hi @niubitily,
>
> This message indicates that your application lacks the code for "mediapipe.tasks.audio.audio_classifier.AudioClassifierGraph".
>
> Please review your dependencies to ensure the required code is linked correctly.
>
> Would you mind verifying the dependency on 'com.google.mediapipe:tasks-vision'? For additional details, please visit: https://developers.google.com/mediapipe/solutions/vision/face_detector/android#dependencies.
>
> Thank you!!

I believe there might be a misunderstanding. When I use PoseLandmarker or AudioClassifier individually, everything works fine. These are the dependencies in my build.gradle (Module):

// MediaPipe vision tasks
implementation 'com.google.mediapipe:tasks-vision:0.20230731'

// MediaPipe audio tasks
implementation 'com.google.mediapipe:tasks-audio:0.20230731'

However, when I execute poseLandmarker = PoseLandmarker.createFromOptions(context, options) and then audioClassifier = AudioClassifier.createFromOptions(context, options), the second createFromOptions call fails. It does not matter which task I initialize first. I therefore suspect the two calls conflict over some shared resource.
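
For context, a minimal sketch of the failing sequence with the helper classes stripped away (the function name, model asset paths, and option values are illustrative assumptions, not taken from the project above):

import android.content.Context
import com.google.mediapipe.tasks.core.BaseOptions
import com.google.mediapipe.tasks.audio.audioclassifier.AudioClassifier
import com.google.mediapipe.tasks.vision.poselandmarker.PoseLandmarker

fun reproduce(context: Context) {
    // First task initializes fine (model asset names are placeholders).
    val poseOptions = PoseLandmarker.PoseLandmarkerOptions.builder()
        .setBaseOptions(
            BaseOptions.builder().setModelAssetPath("pose_landmarker.task").build()
        )
        .build()
    val poseLandmarker = PoseLandmarker.createFromOptions(context, poseOptions)

    // Second createFromOptions throws MediaPipeException:
    //   "No registered object with name:
    //    mediapipe::tasks::audio::audio_classifier::AudioClassifierGraph"
    // regardless of which task is created first.
    val audioOptions = AudioClassifier.AudioClassifierOptions.builder()
        .setBaseOptions(
            BaseOptions.builder().setModelAssetPath("yamnet.tflite").build()
        )
        .build()
    val audioClassifier = AudioClassifier.createFromOptions(context, audioOptions)
}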