D435 box dimensions - Githubissues

AussieGrowls commented 6 years ago

Required Info
Camera Model	D435
Firmware Version	(05.09.02.00)
Operating System & Version	Win (10)
Kernel Version (Linux Only)
Platform	PC
SDK Version	2.16.1
Language	C#
Segment	Automation

Hi All, I am having trouble getting box dimensions from the the D435. The first issues is that the depth frame and color frames are not aligned because of the FOV. I can use the Align method, however it aligns every pixel and slows down the frame rate dramatically. Slows it down so much that the syncer throws an error after a minute or two: Frame didn't arrived within 5000

My objective is to measure a box from top down. Note the red circles is where I am trying to measure.

I can use the rs-measure example to accurately measure the box, however because the C# wrapper does not have a wrapper for rs2_deproject_pixel_to_point I had to write my own in c# (it is only Math processing). The results are very very wrong through. Is it because of the BrownConrady intrinsic model? the Assert fails in my C# version.

static float[] DeprojectPixelToPoint(Intrinsics intrin, PointF pixel, float depth) { //Debug.Assert(intrin.model != Distortion.BrownConrady); // Cannot deproject from a forward-distorted image Debug.Assert(intrin.model != Distortion.Ftheta); // Cannot deproject to an ftheta image var ret = new float[3]; float x = (pixel.X - intrin.ppx)/intrin.fx; float y = (pixel.Y - intrin.ppy)/intrin.fy; if (intrin.model == Distortion.BrownConrady) { float r2 = xx + yy; float f = 1 + intrin.coeffs[0]r2 + intrin.coeffs[1]r2r2 + intrin.coeffs[4]r2r2r2; float ux = xf + 2intrin.coeffs[2]xy + intrin.coeffs[3](r2 + 2xx); float uy = yf + 2intrin.coeffs[3]xy + intrin.coeffs[2](r2 + 2yy); x = ux; y = uy; } ret[0] = depthx; ret[1] = depthy; ret[2] = depth; return ret; }

Get distance 3d in c#:

public static float GetDistance_3d(this DepthFrame frame, PointF from, PointF to, Intrinsics intr) { // Query the frame for distance // Note: this can be optimized // It is not recommended to issue an API call for each pixel // (since the compiler can't inline these) // However, in this example it is not one of the bottlenecks

var vdist = frame.GetDistance((int)from.X, (int)from.Y); var udist = frame.GetDistance((int)to.X, (int)to.Y);

        // Deproject from pixel to point in 3D

        var upoint = DeprojectPixelToPoint(intr, from, udist);
        var vpoint = DeprojectPixelToPoint(intr, to, vdist);

        // Calculate euclidean distance between the two points
        return (float)Math.Sqrt(Math.Pow(upoint[0] - vpoint[0], 2) +
                    Math.Pow(upoint[1] - vpoint[1], 2) +
                    Math.Pow(upoint[2] - vpoint[2], 2));
    }

My main issue I think is converting points to pixel space [0,1]. If I run the measure example, and put a breakpoint in the code, and take the pixel x/y coordinates then put them into the c# app, the coordinates are not where they should be.

Any help would be much appreciated. Thanks Craig

HippoEug commented 6 years ago

I am trying to do this as well, but on C++ instead of C#.

@RealSense-Customer-Engineering Quick Question that is not really related: Why isn't the alignment of the depth and color frames done by default? Imho it would make more sense in either calculations or comparison etc. Is this because of performance? As Craig/AussieGrowls mentioned "Align method aligns every pixel and slows down the frame rate dramatically".

dorodnic commented 6 years ago

Hi @AussieGrowls First, what results do you observe in rs-measure and rs-align in terms of performance? Also, are you using the pre-compiled version of the SDK from the releases page? Generally, align on a regular PC today should be reasonably quick. D400 projection in use is distortion-less, so I don't think the model should cause any problems.

As for @HippoEug question - performance is one consideration, but there are more: a. If you want to enrich CV algorithm with depth (like getting depth to detected object) you will usually align Depth to Color, but when you want to enrich 3D algorithms with texture (like when 3D scanning) you'd align Color to Depth. It's up to the application to decide. b. Let's say you write an app where you control video conferencing with gestures. In this use case, you want the RGB as-is for the video, and Depth as-is for gestures. Sometimes you just don't need them together.

HippoEug commented 6 years ago

Ah that makes sense. A little confused as to why there would be a significant difference between aligning Color to Depth or Depth to Color. Do you mind elaborating abit more about that? Or is there a link to where I can find the relevant information to read that up on? Many many thanks! 😄

RealSense-Customer-Engineering commented 6 years ago

[Realsense Customer Engineering Team Comment] Hi @HippoEug,

Basically, the aligned base is different. Aligning color to depth is based on depth image. Aligning depth to color is based on color image. They have their own extrinsic parameters for transformation between point to point.

HippoEug commented 6 years ago

Sincere apologies to @AussieGrowls for hijacking the thread first of all.

@RealSense-Customer-Engineering thank you for your reply. A quick question:

For my use-case, I am trying to use the D435 to get the z-values/depth of a wound on the human body. The rough process is I provide the user a Color image/frame and get them to select the circumference of the wound. The program would then get these (x, y) coordinates from the user's input, and look up these (x, y) values from the Depth frame to get the z value.

In this particular use case, should I align color to depth, or depth to color? Danke!

RealSense-Customer-Engineering commented 6 years ago

[Realsense Customer Engineering Team Comment] Hi @HippoEug,

Since you provide a color image to interactive with users, you can try to align Color to Depth and get the correspondent depth from color pixel selected. You can also review #2523.

AussieGrowls commented 6 years ago

Thanks all, aligning to depth seems to have a small increase in performance. I still have the issue converting points to pixels though. The green circles in the above are the pixel coordinates I took from the rs_measure example, yet they are very different to the pixel coords in the c# app (the red circles). The SDK is built from source, not precompiled. This is the measurement I took in the rs_measure example: I put a breakpoint on this, and took the pixels (from and to) that the dist_3d method requires: ` auto from_pixel = s.ruler_start.get_pixel(depth);

auto to_pixel =   s.ruler_end.get_pixel(depth);

float air_dist = dist_3d(depth, from_pixel, to_pixel);`

The green circles in the first image are the pixel x/y from that breakpoint.

Edit: If I align depth to color, and run it at 640x360 res, the math calculation for the green point distance matches rs_measure: (0.24m or 24.4cm) However that is using pixel X/Y coords from rs_measure, which do not match the stream as shown in c# app.

AussieGrowls commented 6 years ago

Actually, I am having trouble aligning color frame to depth frame. I keep getting and AccessViolationException in Intel.Realsense.dll. The one frame I saw, before it threw the exception, was in fact what I was looking for regarding the location of the green circles (they aligned with the box corners as expected). However it keeps crashing. See code snipit below, taken from the cs-tutorial-3-processing example :

` public CaptureWindow() { try { pipeline = new Pipeline(); colorizer = new Colorizer(); Align align = new Align(Stream.Depth);

            var cfg = new Config();
            cfg.EnableStream(Stream.Depth, 1280, 720, Format.Z16);
            cfg.EnableStream(Stream.Color, 640, 360, Format.Bgr8);

            pipeline.Start(cfg);

            var token = tokenSource.Token;

            var t = Task.Factory.StartNew(() =>
            {
                while (!token.IsCancellationRequested)
                {
                    var frames = pipeline.WaitForFrames();
                    using (var releaser = new FramesReleaser())
                    {
                        var frames2 = align.Process(frames, releaser);
                        var colorized_depth_frame = colorizer.Colorize(frames2.DepthFrame);
                        var color_frame = frames2.ColorFrame;

                        UploadImage(imgDepth, colorized_depth_frame);
                        UploadImage(imgColor, color_frame);

                        // It is important to pre-emptively dispose of native resources
                        // to avoid creating bottleneck at finalization stage after GC
                        // (Also see FrameReleaser helper object in next tutorial)
                        frames2.Dispose();
                        frames.Dispose();
                        colorized_depth_frame.Dispose();
                        color_frame.Dispose();
                    }
                }
            }, token);
        }
        catch (Exception ex)
        {
            MessageBox.Show(ex.Message);
            Application.Current.Shutdown();
        }

        InitializeComponent();
    }

` aligned frame before crashing: looks like the color frame has an interlace issue?

AussieGrowls commented 6 years ago

Ok, I downloaded the latest 2.16.3, and it now works to align color to depth frame. using the cs-tutorial-2-capture: `using System; using System.Collections.Generic; using System.Linq; using System.Text; using System.Threading; using System.Threading.Tasks; using System.Windows; using System.Windows.Controls; using System.Windows.Data; using System.Windows.Documents; using System.Windows.Input; using System.Windows.Media; using System.Windows.Media.Imaging; using System.Windows.Navigation; using System.Windows.Shapes; using System.Windows.Threading;

namespace Intel.RealSense { ///

/// Interaction logic for Window.xaml ///

public partial class CaptureWindow : Window { private Pipeline pipeline; private Colorizer colorizer; private CancellationTokenSource tokenSource = new CancellationTokenSource();

    static Action<VideoFrame> UpdateImage(Image img)
    {
        var wbmp = img.Source as WriteableBitmap;
        return new Action<VideoFrame>(frame =>
        {
            using (frame)
            {
                var rect = new Int32Rect(0, 0, frame.Width, frame.Height);
                wbmp.WritePixels(rect, frame.Data, frame.Stride * frame.Height, frame.Stride);
            }
        });
    }

    public CaptureWindow()
    {
        InitializeComponent();

        try
        {
            Action<VideoFrame> updateDepth;
            Action<VideoFrame> updateColor;

            // The colorizer processing block will be used to visualize the depth frames.
            colorizer = new Colorizer();

            // Create and config the pipeline to strem color and depth frames.
            pipeline = new Pipeline();

            var cfg = new Config();
            cfg.EnableStream(Stream.Depth, 640, 480);
            cfg.EnableStream(Stream.Color, Format.Rgb8);

            var pp = pipeline.Start(cfg);

            SetupWindow(pp, out updateDepth, out updateColor);

            Task.Factory.StartNew(() =>
            {
                Align align = new Align(Stream.Depth);
                while (!tokenSource.Token.IsCancellationRequested)
                {
                    // We wait for the next available FrameSet and using it as a releaser object that would track
                    // all newly allocated .NET frames, and ensure deterministic finalization
                    // at the end of scope. 

                    using (var frames = pipeline.WaitForFrames())
                    {
                        var frames2 = align.Process(frames).DisposeWith(frames);
                        var colorFrame = frames2.ColorFrame.DisposeWith(frames2);
                        var depthFrame = frames2.DepthFrame.DisposeWith(frames2);

                        // We colorize the depth frame for visualization purposes, .
                        var colorizedDepth = colorizer.Process(depthFrame).DisposeWith(frames);

                        // Render the frames.
                        Dispatcher.Invoke(DispatcherPriority.Render, updateDepth, colorizedDepth);
                        Dispatcher.Invoke(DispatcherPriority.Render, updateColor, colorFrame);
                    }
                }
                align.Dispose();
            }, tokenSource.Token);
        }
        catch (Exception ex)
        {
            MessageBox.Show(ex.Message);
            Application.Current.Shutdown();
        }
    }

    private void control_Closing(object sender, System.ComponentModel.CancelEventArgs e)
    {
        tokenSource.Cancel();
    }

    private void SetupWindow(PipelineProfile pipelineProfile, out Action<VideoFrame> depth, out Action<VideoFrame> color)
    {
        using (var p = pipelineProfile.GetStream(Stream.Depth) as VideoStreamProfile)
            imgDepth.Source = new WriteableBitmap(p.Width, p.Height, 96d, 96d, PixelFormats.Rgb24, null);
        depth = UpdateImage(imgDepth);

        using (var p = pipelineProfile.GetStream(Stream.Color) as VideoStreamProfile)
            imgColor.Source = new WriteableBitmap(p.Width, p.Height, 96d, 96d, PixelFormats.Rgb24, null);
        color = UpdateImage(imgColor);
    }
}

} `

AussieGrowls commented 6 years ago

Ok now the issue is that the aligned color frame looks to be a blend or composite of color and depth frames, which makes it impossible to use open CV to get box dimensions. notice the noise on the color image?

joshberry commented 5 years ago

@AussieGrowls Did you ever figure this out?

I'm trying to perform measurements in C# as well and am getting inaccurate measurements (compared to rs_measure) which I believe is due to not correctly mapping the x/y coordinates. My image alignment seems to be working correctly.

I'd be very interested to know where you're at with it. Thanks!

AussieGrowls commented 5 years ago

I did. I am using Emgu, and realsense c# wrapper. Both the latest version. I had to download, compile and build the latest realsense 2.16.5 source, with the CSHARP option turned on in cmake. This is required as I think there was a bug in the version I had when starting this thread, and align just would not work. setting up, resolution 640 x 480 or 1280 x 720: ` pipeline = new Pipeline();

        var cfg = new Config();
        cfg.EnableStream(Stream.Depth, Convert.ToInt32(ConfigurationManager.AppSettings["ResolutionWidth"]),
            Convert.ToInt32(ConfigurationManager.AppSettings["ResolutionHeight"]), Format.Z16);
        cfg.EnableStream(Stream.Color, Convert.ToInt32(ConfigurationManager.AppSettings["ResolutionWidth"]),
            Convert.ToInt32(ConfigurationManager.AppSettings["ResolutionHeight"]), Format.Rgb8);

        try
        {
            pp = pipeline.Start(cfg);
        }
        catch (Exception ex)
        {
            Notify(
                "FATAL ERROR Starting process, cannot start video pipeline, ex: " +
                ex.Message, Color.Red);
            return;
        }
        depthintr = (pp.GetStream(Stream.Depth) as VideoStreamProfile).GetIntrinsics();`

for every frame, apply filters then convert to Iimage for emgu to use:

` using (var releaser = new FramesReleaser()) { using (var frames = pipeline.WaitForFrames().DisposeWith(releaser)) { //filters to be used, Hole filler seems to work best for color frame to then get canny edges from CV var processedFrames = frames //.ApplyFilter(decimate).DisposeWith(releaser) .ApplyFilter(spatial).DisposeWith(releaser) //.ApplyFilter(temp).DisposeWith(releaser) .ApplyFilter(holeFilter).DisposeWith(releaser) .ApplyFilter(align).DisposeWith(releaser) .ApplyFilter(colorizer).DisposeWith(releaser);

                var colorFrame = processedFrames.ColorFrame.DisposeWith(releaser);
                var colorizedDepth = processedFrames[Stream.Depth, Format.Rgb8].DisposeWith(releaser) as VideoFrame;
                var depthFrame = processedFrames.DepthFrame.DisposeWith(releaser);

                //convert to CV images
                Image<Bgr, Byte> ColorCV;
                Image<Gray, Byte> GrayCV;
                using (Bitmap image = new Bitmap(colorFrame.Width, colorFrame.Height, colorFrame.Stride,
                    System.Drawing.Imaging.PixelFormat.Format24bppRgb, colorFrame.Data))
                {
                    ColorCV = image.ToOpenCVImage<Bgr, byte>().DisposeWith(releaser);
                    GrayCV = image.ToOpenCVImage<Gray, byte>().DisposeWith(releaser);
                }
                Image<Bgr, Byte> DepthCV;
                using (Bitmap image = new Bitmap(depthFrame.Width, colorizedDepth.Height,
                    colorizedDepth.Stride,
                    System.Drawing.Imaging.PixelFormat.Format24bppRgb, colorizedDepth.Data))
                {
                    DepthCV = image.ToOpenCVImage<Bgr, byte>().DisposeWith(releaser);
                }

.........

I didnt use decimate filter, because it changes the resolution of the frame. I then just blur, dilate/erode, and canny edge detection in emgu. once I have the box(rotatedrect in emgu) I get the depth: var dX = (int) largestbox.Center.X; var dY = (int) largestbox.Center.Y; var depth = depthFrame.GetDistance(dX, dY);

//and measure it:

var verts = largestbox.GetVertices(); PointF left = verts[0]; PointF top = verts[1]; PointF right = verts[2]; PointF bottom = verts[3]; var wdist = depthFrame.GetDistance_3d(right, bottom, depthintr, depth); var hdist = depthFrame.GetDistance_3d(top, right, depthintr, depth); ` GetDistance_3d is one of my helper methods:

` public static float GetDistance_3d(this DepthFrame frame, PointF from, PointF to, Intrinsics intr, float singleDist = 0f) { // Query the frame for distance // Note: this can be optimized // It is not recommended to issue an API call for each pixel // (since the compiler can't inline these) // However, in this example it is not one of the bottlenecks

        float vdist;
        float udist;
        if (singleDist == 0f)
        {
            vdist = frame.GetDistance((int) @from.X, (int) @from.Y);
            udist = frame.GetDistance((int) to.X, (int) to.Y);
        }
        else vdist = udist = singleDist;

        // Deproject from pixel to point in 3D

        var upoint = DeprojectPixelToPoint(intr, from, udist);
        var vpoint = DeprojectPixelToPoint(intr, to, vdist);

        // Calculate euclidean distance between the two points
        return (float) Math.Sqrt(Math.Pow(upoint[0] - vpoint[0], 2) +
                                 Math.Pow(upoint[1] - vpoint[1], 2) +
                                 Math.Pow(upoint[2] - vpoint[2], 2));
    }

static float[] DeprojectPixelToPoint(Intrinsics intrin, PointF pixel, float depth) { //Debug.Assert(intrin.model != Distortion.BrownConrady); // Cannot deproject from a forward-distorted image Debug.Assert(intrin.model != Distortion.Ftheta); // Cannot deproject to an ftheta image var ret = new float[3]; float x = (pixel.X - intrin.ppx)/intrin.fx; float y = (pixel.Y - intrin.ppy)/intrin.fy; if (intrin.model == Distortion.BrownConrady) { float r2 = xx + yy; float f = 1 + intrin.coeffs[0]r2 + intrin.coeffs[1]r2r2 + intrin.coeffs[4]r2r2r2; float ux = xf + 2intrin.coeffs[2]xy + intrin.coeffs[3](r2 + 2xx); float uy = yf + 2intrin.coeffs[3]xy + intrin.coeffs[2](r2 + 2yy); x = ux; y = uy; } ret[0] = depthx; ret[1] = depthy; ret[2] = depth; return ret; } static float[] ProjectPointToPixel(Intrinsics intrin, float[] point) { float[] pixel = new float[2]; //assert(intrin->model != RS2_DISTORTION_INVERSE_BROWN_CONRADY); // Cannot project to an inverse-distorted image

        float x = point[0]/point[2], y = point[1]/point[2];

        if (intrin.model == Distortion.ModifiedBrownConrady)
        {

            float r2 = x*x + y*y;
            float f = 1 + intrin.coeffs[0]*r2 + intrin.coeffs[1]*r2*r2 + intrin.coeffs[4]*r2*r2*r2;
            x *= f;
            y *= f;
            float dx = x + 2*intrin.coeffs[2]*x*y + intrin.coeffs[3]*(r2 + 2*x*x);
            float dy = y + 2*intrin.coeffs[3]*x*y + intrin.coeffs[2]*(r2 + 2*y*y);
            x = dx;
            y = dy;
        }
        if (intrin.model == Distortion.Ftheta)
        {
            float r = (float) Math.Sqrt(x*x + y*y);
            float rd = (float) (1.0f/intrin.coeffs[0]*Math.Atan((2*r*Math.Tan(intrin.coeffs[0]/2.0f))));
            x *= rd/r;
            y *= rd/r;
        }

        pixel[0] = x*intrin.fx + intrin.ppx;
        pixel[1] = y*intrin.fy + intrin.ppy;
        return pixel;
    }

public static Image<TColor, TDepth> ToOpenCVImage<TColor, TDepth>(this Bitmap bitmap) where TColor : struct, IColor where TDepth : new() { return new Image<TColor, TDepth>(bitmap); } `

joshberry commented 5 years ago

@AussieGrowls Thanks for sharing. While comparing my code to yours, I found that a typo in my deprojection function was to blame for my inaccurate measurements.

FYI - I noticed what may be a mistake in your code. It looks like the assignment of the vdist and udist variables in your GetDistance_3d method may be reversed. I think vdist should store the distance of the "to" point, while udist should have the distance of the "from" point. The formatting of your code didn't come across very well, so maybe I'm misreading it, but wanted to give you a heads up.

AussieGrowls commented 5 years ago

@joshberry thanks for the pickup of the swap. The reads were still showing as right so I missed it.

RealSense-Customer-Engineering commented 5 years ago

[Realsense Customer Engineering Team Comment] Is everything clarified? Still need any support for this topic?

RealSense-Customer-Engineering commented 5 years ago

[Realsense Customer Engineering Team Comment] I will close this

SchnoeggiSch commented 2 years ago

I would like to measure the object size using Intel RealSense. During the research I found this issue.

@AussieGrowls is there a repository for your project where I can also see how you solved the image acquisition and image renering in the GUI (drawing the lines, points, ....)?

IntelRealSense / librealsense

D435 box dimensions #2657