JdeRobot / base

GNU General Public License v3.0
94 stars 124 forks source link

Critical state of jderobot::Camera (a deep analysis) #248

Closed varhub closed 3 years ago

varhub commented 8 years ago

I will try to be politically correct, by the way I'll try to expose all odds and found leaks; so I will start with a simple phrase:

Is hard to understand how is possible that camera, one of more important sensors, is almost not implemented and basic functionality is not provided.

Let's start with how many times is defined CameraI (the implementation of jderobot::Camera Interface): grep -lr "public jderobot::Camera" | wc -l == 18 (at least)

Here you are some of them, trying to order implementations for recent to ancient just to see evolution and deprecation:

The best example of copy-paste is camera_dump.cc, the best candidate to converge in a single .so shared by all robots (please, do a diff before think that i'm being too rude).

About bugs, here we go: https://github.com/RoboticsURJC/JdeRobot/blob/5f5da3/src/stable/components/gazeboserver/plugins/quadrotor/quadrotorplugin.cc#L568 Here there is a condition race that implies that last "preprocessed" request will override reply format. So any queued request "processed" afterward will get a wrong format. It has been fixed at https://github.com/RoboticsURJC/JdeRobot/blob/5f5da3/src/stable/libs/jderobotutil/CameraTask.cpp#L30-L33, but this error is at now presented at least at: quadrotor, flyingKinect, kinect, RosLib. The good news: since no one uses other format that RGB8 and server side do not support it, this bug will never happen.

List continue, but even I forgot some saw problems. By the way I suspect that it is simply better a ToDo list.

Problems

Finally, an unimportant question: who can explain why this ugly call? colorspaces::ImageRGB8::FORMAT_RGB8.get()->name Is there a double indirection? (**colorspaces::ImageRGB8::FORMAT_RGB8).name

varhub commented 8 years ago

[*starvation]: Current queued responses could end in an starvation case The good part: Since request are queued and other thread threat it, any client can "connect" to an uninitialized camera. The bad part: If dedicated thread fail or hang silently, queue will grow forever, and implies a blocking call until request timeout rescues client.

See: AMD is transparent to the client, that is, there is no way for a client to distinguish a request that, in the server, is processed synchronously from a request that is processed asynchronously. So client call will be always blocking. From https://doc.zeroc.com/display/Ice36/Asynchronous+Method+Dispatch+%28AMD%29+in+Java

Due most components does all sensor fetching (an even control) in same thread loop, hang at image acquisition implies lost sensor feedback and even lost control.

It is only a hypothesis and has not been verified

varhub commented 8 years ago

Thanks @chanfr, this (https://github.com/RoboticsURJC/JdeRobot/blob/master/src/stable/libs/parallelIce/cameraClient.h) was I meant, but for server side...

I couldn't avoid look into it, and I saw a condition race here: https://github.com/RoboticsURJC/JdeRobot/blob/5f5da37edfe16a9a5b3d6eac080e8c694ba19ec0/src/stable/libs/parallelIce/cameraClient.cpp#L193-L195

        if (pauseStatus){
            IceUtil::Mutex::Lock sync(this->controlMutex);
            this->semWait.wait(sync);
        }

It seems to be a bad use of scoped RAII lock guard. It must be:

    {
        IceUtil::Mutex::Lock sync(this->controlMutex);
        if (pauseStatus){
            this->semWait.wait(sync);
        }
    }

So mainly, pause variable is not thread safe and may end in an starvation situation ;)

PS: I also prefer a two-semaphores "design" instead only one (because is easy to recognize who are consumers and who are producers). But since API is designed for a single consumer (and blocking), special case of a two consumer threads getting spawn from each other is a bad usage of API itself.

chanfr commented 8 years ago

Sure, you are right, please send a pull request with this change and I will include it.

Thanks!