Unity-Technologies / barracuda-release


AR Foundation + Barracuda Object Detection Unity 2021 #292

Open StormWeaver opened 2 years ago

StormWeaver commented 2 years ago

I have recently started learning ML and implementing it in Unity. I started working with the GitHub project Unity_Detection2AR and have done my best to update some deprecated parts to match Barracuda 2.0, trying to follow some of the guiding code from the IWorker examples.

After getting it running smoothly from the camera, I noticed that the actual Detect IEnumerator is taking a considerable amount of time, between 3 and 6 seconds per cycle. Where it gets held up is fairly clear and makes complete sense: it stalls at the WaitForCompletion(output) line.

I can improve performance by lowering IMAGE_SIZE (and with it ROW_COUNT), but even dropping to 160x160 px only gets it down to 1-2 seconds per cycle, with very poor recognition (as expected). I see other people complaining about runs that take mere milliseconds, and I can't help but wonder whether I am just handling the process incorrectly.

Assuming I am not doing anything heinously wrong (and if I am, any suggestions would be fantastic), what options might I have for squeezing out better runtimes?
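
In case it matters, the worker itself is created along these lines. This is a simplified sketch of my setup rather than the exact script: modelAsset is the NNModel field I assign in the Inspector, and ComputePrecompiled is simply the backend I have been defaulting to.

void Start()
    {
        // Simplified sketch of my worker setup (field names are mine, not from the sample project).
        // ComputePrecompiled targets GPU compute shaders; CSharpBurst would be the CPU fallback.
        Model model = ModelLoader.Load(modelAsset);
        worker = WorkerFactory.CreateWorker(WorkerFactory.Type.ComputePrecompiled, model);
    }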

Original Code

public IEnumerator Detect(Color32[] picture, System.Action<IList<BoundingBox>> callback)
    {
        using (var tensor = TransformInput(picture, IMAGE_SIZE, IMAGE_SIZE))
        {
            var inputs = new Dictionary<string, Tensor>();
            inputs.Add(INPUT_NAME, tensor);
            yield return StartCoroutine(worker.StartManualSchedule(inputs));
            //worker.Execute(inputs);
            var output = worker.PeekOutput(OUTPUT_NAME);
            Debug.Log("Output: " + output);
            var results = ParseOutputs(output, MINIMUM_CONFIDENCE);
            var boxes = FilterBoundingBoxes(results, 5, MINIMUM_CONFIDENCE);
            callback(boxes);
        }
    }

Personal Code

public IEnumerator Detect(Color32[] picture, System.Action<IList<BoundingBox>> callback)
    {
        using (var tensor = TransformInput(picture, IMAGE_SIZE, IMAGE_SIZE))
        {
            var inputs = new Dictionary<string, Tensor>();
            inputs.Add(INPUT_NAME, tensor);
            worker.StartManualSchedule(inputs); /*changed*/
            var output = worker.Execute(inputs).PeekOutput(OUTPUT_NAME);  /*changed*/
            yield return new WaitForCompletion(output);  /*changed*/
            var results = ParseOutputs(output, MINIMUM_CONFIDENCE);
            var boxes = FilterBoundingBoxes(results, 5, MINIMUM_CONFIDENCE);
            callback(boxes);
        }
    }
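
The alternative I keep coming back to is stepping the StartManualSchedule enumerator myself so the layers get spread over several frames, roughly as below. This is only a sketch based on my reading of the scheduled-execution sample in the docs: the layers-per-frame value is an arbitrary guess, and the trailing WaitForCompletion is my own addition to make sure the output is ready before I read it.

public IEnumerator DetectSpreadOverFrames(Color32[] picture, System.Action<IList<BoundingBox>> callback)
    {
        using (var tensor = TransformInput(picture, IMAGE_SIZE, IMAGE_SIZE))
        {
            var inputs = new Dictionary<string, Tensor>();
            inputs.Add(INPUT_NAME, tensor);

            // Each MoveNext() schedules one layer; every few layers, flush the queue and yield
            // so the main thread is not blocked for the whole network.
            var schedule = worker.StartManualSchedule(inputs);
            int layersPerFrame = 5;   // arbitrary; would need tuning per device
            int i = 0;
            bool hasMoreWork;
            do
            {
                hasMoreWork = schedule.MoveNext();
                if (++i % layersPerFrame == 0)
                {
                    worker.FlushSchedule();
                    yield return null;
                }
            } while (hasMoreWork);

            var output = worker.PeekOutput(OUTPUT_NAME);
            yield return new WaitForCompletion(output);   // wait until the result is actually readable
            var results = ParseOutputs(output, MINIMUM_CONFIDENCE);
            var boxes = FilterBoundingBoxes(results, 5, MINIMUM_CONFIDENCE);
            callback(boxes);
        }
    }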

I sincerely apologize if anything I am doing is entirely off base. I struggled to find recent sources that walk through training an object detection model all the way into Unity, so much of my understanding is hodge-podge. I appreciate any guidance I can get.