SciSharp / TensorFlow.NET

.NET Standard bindings for Google's TensorFlow for developing, training and deploying Machine Learning models in C# and F#.
https://scisharp.github.io/tensorflow-net-docs
Apache License 2.0

NullReferenceException thrown in Tensorflow.Binding.dll when trying to run EfficientDetD0 #1077

Open arjOman opened 1 year ago

arjOman commented 1 year ago

Description

 public class tfDetEngine : DetectionEngine
    {
        private readonly Session _session;

        public tfDetEngine(string modelPath, string labelPath) : base(modelPath, labelPath)
        {
            _session = Session.LoadFromSavedModel(modelPath);
        }

        public override IEnumerable<Detection> Detect(Tensor image)
        {

            using (_session)
            {
                try
                {
                    var tensors = _session?.eval(image);
                    if (tensors != null)
                    {
                        var boxesP = tensors["detection_boxes"].ToArray<float>();

                        var boxes = new float[boxesP.Length / 4][];
                        for (int i = 0; i < boxesP.Length; i += 4)
                        {
                            boxes[i / 4] = new float[] { boxesP[i], boxesP[i + 1], boxesP[i + 2], boxesP[i + 3] };
                        }

                        var scores = tensors["detection_scores"].ToArray<float>();
                        var classes = tensors["detection_classes"].ToArray<float>();

                        var detectedObjects = boxes.Select((box, i) => new Detection
                        {
                            Box = box,
                            Score = scores[i],
                            Class = classes[i]
                        });

                        return detectedObjects;
                    }

                    return new List<Detection>();
                }

                catch (NullReferenceException e)
                {
                    return new List<Detection>();
                }
            }
        }
    }

I keep getting a NullReferenceException thrown in Tensorflow.Binding.dll whenever I try to evaluate the model with an image tensor of shape (512, 512, 3).

AsakusaRinne commented 1 year ago

Could you please provide the stack trace of the exception?

arjOman commented 1 year ago

The stack trace goes here:

   at Tensorflow.Tensor._as_tf_output()
   at Tensorflow.BaseSession.eval(Tensor tensor)
   at Detection.Engines.tfDetEngine.Detect(Tensor image)

AsakusaRinne commented 1 year ago

Please check if the image is an EagerTensor. Using an EagerTensor under graph mode may cause this problem.

arjOman commented 1 year ago

The tensor is indeed not an EagerTensor: {tf.Tensor '' shape=(512, 512, 3) dtype=uint8}

AsakusaRinne commented 1 year ago

Could you please provide some steps to reproduce it? I don't know what DetectionEngine is.

arjOman commented 1 year ago

Here is the DetectionEngine class:

 public abstract class DetectionEngine
    {
        public DetectionEngine(string modelPath, string labelPath)
        {
            ModelPath = modelPath;
            LabelPath = labelPath;
        }
        public string ModelPath { get; set; }
        public string LabelPath { get; set; }
        public abstract IEnumerable<Detection> Detect(Tensor frame);
    }

Here is where I run inference:


        public void StartVideo()
        {
            var devices = new List<int>();
            while (devices.Count == 0)
            {
                devices = GetDevices();
            }

            Environment.SetEnvironmentVariable("TF_CPP_MIN_LOG_LEVEL", "2");  // Disable TensorFlow logs
            Environment.SetEnvironmentVariable("OMP_NUM_THREADS", "4");     // Set the number of threads

            new Thread(() => {
                engine = new tfDetEngine(@"C:\Users\softm\Downloads\efficientdet_d0_coco17_tpu-32\efficientdet_d0_coco17_tpu-32\saved_model", "");
            }).Start();

            foreach (int device in devices)
            {
                StartVideoDaemon(device);
            }
        }

        async Task StartVideoDaemon(int id)
        {
            await Task.Factory.StartNew(() =>
            {
                VideoCapture capture = new(id);
                Mat frame = new();
                Bitmap img;
                while (true)
                {
                    capture.Read(frame);
                    if (frame.Empty())
                    {
                        break;
                    }

                    Mat frameN = new Mat();
                    Cv2.Resize(frame, frameN, new Size(512, 512));

                    var detections = engine?.Detect(ToTensor(frameN));

                    using (MemoryStream memory = frameN.ToMemoryStream())
                    {
                        img = new Bitmap(memory);

                        var port = "port" + (id + 1);
                        Dispatcher.UIThread.InvokeAsync(() =>
                        {
                            var image = this.FindControl<Image>(port);
                            if (image != null)
                            {
                                image.Source = img;
                            }
                        });
                    }
                    Task.Delay(10);
                }
            });

        }

        public static unsafe Tensor ToTensor(Mat src)
        {
            NumSharp.Shape shape = (src.Height, src.Width, src.Type().Channels);
            SafeTensorHandle handle;
            var tensor = new Tensor(handle = c_api.TF_AllocateTensor(TF_DataType.TF_UINT8, shape.Dimensions.Select(d => (long)d).ToArray(), shape.NDim, (ulong)shape.Size));

            new UnmanagedMemoryBlock<byte>(src.DataPointer, shape.Size)
                .CopyTo((byte*)c_api.TF_TensorData(handle));

            return tensor;
        }

        List<int> GetDevices()
        {
            var devices = new List<int>();
            int i = 0;
            while (true)
            {
                var cap = new VideoCapture(i);
                if (cap.IsOpened())
                {
                    devices.Add(i);
                    i++;
                    continue;
                }
                break;
            }
            return devices;
        }

arjOman commented 1 year ago

I am using [EfficientDet D0 512x512](http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d0_coco17_tpu-32.tar.gz).

AsakusaRinne commented 1 year ago

OK, I'll try to reproduce it and get back to you later. Which version are you using?

arjOman commented 1 year ago

I am using the latest one; I just got it yesterday.

AsakusaRinne commented 1 year ago

The UnmanagedMemoryBlock and Dispatcher are not defined in the code. Is there any extra information?

arjOman commented 1 year ago

I am using Avalonia UI with it, which is where the Dispatcher comes from. UnmanagedMemoryBlock comes from NumSharp.

AsakusaRinne commented 1 year ago

The error is caused by incorrect usage. Note that TensorFlow.NET no longer depends on NumSharp; please use Tensorflow.NumPy instead. Here's an example without this problem:

using Microsoft.VisualStudio.TestTools.UnitTesting; // provides [TestClass] and [TestMethod]
using OpenCvSharp;
using System;
using System.Collections.Generic;
using System.Linq;
using Tensorflow;
using Tensorflow.NumPy;
using static Tensorflow.Binding;

namespace TensorFlowNET.UnitTest.Training
{
    public record class Detection(float[] Box, float Score, float Class);

    public abstract class DetectionEngine
    {
        public DetectionEngine(string modelPath, string labelPath)
        {
            ModelPath = modelPath;
            LabelPath = labelPath;
        }
        public string ModelPath { get; set; }
        public string LabelPath { get; set; }
        public abstract IEnumerable<Detection> Detect(NDArray frame);
    }

    public class tfDetEngine : DetectionEngine
    {
        private readonly Session _session;

        public tfDetEngine(string modelPath, string labelPath) : base(modelPath, labelPath)
        {
            _session = Session.LoadFromSavedModel(modelPath);
        }

        public override IEnumerable<Detection> Detect(NDArray image)
        {
            var graph = _session.graph;
            graph.as_default();
            try
            {
                var assigned = tf.convert_to_tensor(image, name: "input_image");
                bool a = tf.Context.executing_eagerly();
                var tensors = _session?.run(assigned);
                if (tensors != null)
                {
                    var boxesP = tensors["detection_boxes"].ToArray<float>();

                    var boxes = new float[boxesP.Length / 4][];
                    for (int i = 0; i < boxesP.Length; i += 4)
                    {
                        boxes[i / 4] = new float[] { boxesP[i], boxesP[i + 1], boxesP[i + 2], boxesP[i + 3] };
                    }

                    var scores = tensors["detection_scores"].ToArray<float>();
                    var classes = tensors["detection_classes"].ToArray<float>();

                    var detectedObjects = boxes.Select((box, i) => new Detection
                    (
                        Box: box,
                        Score: scores[i],
                        Class: classes[i]
                    ));

                    return detectedObjects;
                }

                return new List<Detection>();
            }

            catch (NullReferenceException e)
            {
                return new List<Detection>();
            }
            finally
            {
                graph.Exit();
            }
        }
    }
    [TestClass]
    public class ObjectDetection
    {
        tfDetEngine engine;

        [TestMethod]
        public void DetectionWithEfficientDet()
        {
            StartVideo();
        }

        public void StartVideo()
        {
            var devices = new List<int>();
            while (devices.Count == 0)
            {
                devices = GetDevices();
            }

            Environment.SetEnvironmentVariable("TF_CPP_MIN_LOG_LEVEL", "2");  // Disable TensorFlow logs
            Environment.SetEnvironmentVariable("OMP_NUM_THREADS", "4");     // Set the number of threads

            engine = new tfDetEngine(@"C:\Users\liu_y\Downloads\efficientdet_d0_coco17_tpu-32\saved_model", "");

            foreach (int device in devices)
            {
                StartVideoDaemon(device);
            }
        }

        void StartVideoDaemon(int id)
        {
            VideoCapture capture = new(id);
            Mat frame = new();
            System.Drawing.Bitmap img;
            while (true)
            {
                capture.Read(frame);
                if (frame.Empty())
                {
                    break;
                }

                Mat frameN = new Mat();
                Cv2.Resize(frame, frameN, new Size(512, 512));

                var detections = engine?.Detect(ToTensor(frameN));
            }

        }

        public unsafe NDArray ToTensor(Mat src)
        {
            Shape shape = (src.Height, src.Width, src.Type().Channels);
            SafeTensorHandle handle;
            var tensor = new Tensor(handle = c_api.TF_AllocateTensor(TF_DataType.TF_UINT8, shape.dims.Select(d => (long)d).ToArray(), shape.ndim, (ulong)shape.size));

            new Span<byte>((void*)src.DataPointer, (int)shape.size)
                .CopyTo(new Span<byte>((byte*)c_api.TF_TensorData(handle), (int)shape.size));

            return tensor.numpy();
        }

        List<int> GetDevices()
        {
            var devices = new List<int>();
            int i = 0;
            while (true)
            {
                var cap = new VideoCapture(i);
                if (cap.IsOpened())
                {
                    devices.Add(i);
                    i++;
                    continue;
                }
                break;
            }
            return devices;
        }
    }
}

However, it still fails when executing var boxesP = tensors["detection_boxes"].ToArray<float>(). I know little about object detection, so I can't figure out what is expected. Please check it again.

arjOman commented 1 year ago

_session.run still leaves me with a NullReferenceException. Here is the stack trace:

   at Tensorflow.BaseSession._call_tf_sessionrun(KeyValuePair`2[] feed_dict, TF_Output[] fetch_list, List`1 target_list)
   at Tensorflow.BaseSession._do_run(List`1 target_list, List`1 fetch_list, Dictionary`2 feed_dict)
   at Tensorflow.BaseSession._run(Object fetches, FeedItem[] feed_dict)
   at Tensorflow.BaseSession.run(Tensor fetche, FeedItem[] feed_dict)
   at Detection.Engines.tfDetEngine.Detect(NDArray image)

AsakusaRinne commented 1 year ago

Are you using exactly the code I listed before?

arjOman commented 1 year ago

yes, still doesn't work

arjOman commented 1 year ago

Could it be that I have an installation issue? I might have a broken installation of Tensorflow. How can I test that TensorFlow.net will run using a simpler model?

AsakusaRinne commented 1 year ago

> Could it be that I have an installation issue? I might have a broken installation of Tensorflow. How can I test that TensorFlow.net will run using a simpler model?

Please refer to https://github.com/SciSharp/SciSharp-Stack-Examples/tree/master/src/TensorFlowNET.Examples and try one or two of the simple models there.

AsakusaRinne commented 1 year ago

> yes, still doesn't work

I can't reproduce it at this time. Is there any further information?

arjOman commented 1 year ago

I think I have issues with my TensorFlow installation or something. Everything should work fine. I am using SciSharp.Tensorflow.Redist as my laptop doesn't have a GPU but my PC does. What steps should I follow to make sure that my initial setup is correct? I suspect my setup is faulty, as you have a perfectly running TensorFlow.NET implementation on your end.

AsakusaRinne commented 1 year ago

Sorry, my bad. My local environment is actually on the master branch. Could you please try again with the master branch? Please clone it and add a project reference, but keep tensorflow.redist installed. We'll publish a nightly release if the master branch fixes this problem.

arjOman commented 1 year ago

Pardon, which repo do I have to clone now?

AsakusaRinne commented 1 year ago

> Pardon, which repo do I have to clone now?

Please clone this repo and reference the src/TensorFlowNET.Core/tensorflow.net.csproj

arjOman commented 1 year ago

But there is no tensorflow.net.csproj there, only Tensorflow.Binding.csproj.

AsakusaRinne commented 1 year ago

Yup, that's it. I got it wrong.

arjOman commented 1 year ago

I tried it with the source code, and it gave me the same error with a little more detail. When debugging, I noticed that _session.status is null. Here is the stack trace:

   at Tensorflow.Status.op_Implicit(Status status) in C:\Users\softm\Downloads\TensorFlow.NET\src\TensorFlowNET.Core\Status\Status.cs:line 102
   at Tensorflow.BaseSession._call_tf_sessionrun(KeyValuePair`2[] feed_dict, TF_Output[] fetch_list, List`1 target_list) in C:\Users\softm\Downloads\TensorFlow.NET\src\TensorFlowNET.Core\Sessions\BaseSession.cs:line 216
   at Tensorflow.BaseSession._do_run(List`1 target_list, List`1 fetch_list, Dictionary`2 feed_dict) in C:\Users\softm\Downloads\TensorFlow.NET\src\TensorFlowNET.Core\Sessions\BaseSession.cs:line 205
   at Tensorflow.BaseSession._run(Object fetches, FeedItem[] feed_dict) in C:\Users\softm\Downloads\TensorFlow.NET\src\TensorFlowNET.Core\Sessions\BaseSession.cs:line 133
   at Tensorflow.BaseSession.run(Tensor fetche, FeedItem[] feed_dict) in C:\Users\softm\Downloads\TensorFlow.NET\src\TensorFlowNET.Core\Sessions\BaseSession.cs:line 57

arjOman commented 1 year ago

Just curious, what should I pass for the feed_dict parameter?

AsakusaRinne commented 1 year ago

Sorry for the confusion. I've fixed the null reference error on status. Could you please pull the newest master branch and try again?

AsakusaRinne commented 1 year ago

> feed_dict

feed_dict is an optional argument that supplies values for the graph's input tensors (for example, placeholders) when the session runs.
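As a rough illustration of how feed_dict is typically passed, here is a minimal sketch that is independent of the EfficientDet model. The class name FeedDictDemo and the tensors x and y are made up for this example, and it assumes graph mode is enabled through tf.compat.v1.disable_eager_execution(); it is not taken from the code in this thread.

```csharp
using Tensorflow;
using Tensorflow.NumPy;
using static Tensorflow.Binding;

// Hypothetical demo class; not part of TensorFlow.NET or of this issue's code.
public static class FeedDictDemo
{
    public static NDArray Run()
    {
        // Assumption: placeholders and Session.run need graph mode.
        tf.compat.v1.disable_eager_execution();

        // x has no value of its own; one must be fed when the session runs.
        var x = tf.placeholder(tf.float32);
        var y = tf.multiply(x, tf.constant(3.0f));

        using var sess = tf.Session();

        // Each FeedItem pairs an input tensor with the value to use for this run.
        return sess.run(y, new FeedItem(x, 2.0f));
    }
}
```

For a loaded model, the same pattern would presumably pair the model's input tensor with the image and fetch the output tensors, but the tensor names depend on the model's signature and are not shown in this thread.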

arjOman commented 1 year ago

Now I get "Invalid slice notation: 'detection_boxes"

AsakusaRinne commented 1 year ago

> Now I get "Invalid slice notation: 'detection_boxes"

Yes, that's the same behavior I see in my local environment. Since I don't know much about object detection, could you please explain what you intend with var boxesP = tensors["detection_boxes"].ToArray<float>();?

arjOman commented 1 year ago

I couldn't convert boxes to a float array directly because the value is nullable. I also couldn't just write ToArray<float[]>(). The result should really be a two-dimensional array, so the extra for loop converts the flat float[] into a float[][].
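The conversion described above can be sketched in plain C# without any TensorFlow dependency. The class and method names (BoxReshape, ToBoxes) are made up for illustration, and the sketch assumes each box is stored as four consecutive floats in the flat array.

```csharp
using System;

// Hypothetical helper; not part of TensorFlow.NET.
public static class BoxReshape
{
    // Splits a flat array holding four values per box into one float[4] per box.
    public static float[][] ToBoxes(float[] flat)
    {
        if (flat == null)
            throw new ArgumentNullException(nameof(flat));
        if (flat.Length % 4 != 0)
            throw new ArgumentException("Length must be a multiple of 4.", nameof(flat));

        var boxes = new float[flat.Length / 4][];
        for (int i = 0; i < boxes.Length; i++)
        {
            boxes[i] = new float[4];
            // Copy the four coordinates of box i out of the flat buffer.
            Array.Copy(flat, i * 4, boxes[i], 0, 4);
        }
        return boxes;
    }
}
```

Calling BoxReshape.ToBoxes(boxesP) would replace the hand-written loop, and the length check turns an unexpected output shape into an explicit error instead of an index-out-of-range exception.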

AsakusaRinne commented 1 year ago

What are tensors and boxesP expected to be in object detection?

arjOman commented 1 year ago

tensors holds the detection outputs from the model, and boxes are the bounding boxes from those detections.

arjOman commented 1 year ago

But the graph has no nodes; everything shows 0. Maybe the model isn't loaded correctly?

AsakusaRinne commented 1 year ago

@Oceania2018 Could you please look at this issue?