roboticslab-uc3m / yarp-devices

A place for YARP devices
https://robots.uc3m.es/yarp-devices/
9 stars 7 forks source link

Transmit delays in the new ICanBus arquitecture #191

Closed PeterBowman closed 6 years ago

PeterBowman commented 6 years ago

As a result of #158 (which also affects #190), excessive delays have been observed during the initiallization of CanBusControlboard (e.g. via launchManipulation). We noticed this during usual operation (after init, before close), too, when fewer CAN nodes are opened, e.g. one iPOS and no CUIs attached to it. Regarding the recently reworked CanBusHico device, this instruction might block further operation up to a few seconds:

https://github.com/roboticslab-uc3m/yarp-devices/blob/2a42f3c9335f82dc225abfde64160a277db0d2ca/libraries/YarpPlugins/CanBusHico/ICanBusImpl.cpp#L275

What is happening behind the scenes: a background thread owned by CanBusControlboard is constantly performing calling canRead once and again. This is a high-rate operation with no delays involved. In the main thread, we (occasionally) want to call canWrite, which is prone to be blocked by canRead as a common semaphore object avoids simultaneous execution of both methods at the device level (::read(), ::write(), ::select()). However, I noticed that after canRead returns (and the semaphore is released), the reading thread enters it again and locks the semaphore even if there was a pending canWrite call already waiting at canBusReady.wait() (our semaphore, see previous snippet). Using a yarp::os::Mutex object renders the same result.

Ideas:

PeterBowman commented 6 years ago

Sample app resembling the behavior we want to achieve:

#include <iostream>

#include <yarp/os/Network.h>
#include <yarp/os/RateThread.h>
#include <yarp/os/Semaphore.h>
#include <yarp/os/Thread.h>
#include <yarp/os/Time.h>

struct MyClass
{
    void read() const
    {
        double start = yarp::os::Time::now();
        ioReady.wait();
        // perform some non-CPU-intensive operations
        int c = 0;
        for (int i = 0; i < 1000; i++) { c *= 2; }
        std::cout << "read-wait: " << (yarp::os::Time::now() - start) << std::endl;
        ioReady.post();
    }

    void write() const
    {
        double start = yarp::os::Time::now();
        std::cout << "write-wait" << std::endl;
        ioReady.wait();
        std::cout << "write-wait: " << (yarp::os::Time::now() - start) << std::endl;
        ioReady.post();
    }

    mutable yarp::os::Semaphore ioReady;
};

struct ReadThread : public yarp::os::Thread
{
    ReadThread(const MyClass & _mc) : mc(_mc) {}

    virtual bool start()
    {
        std::cout << "Starting read thread..." << std::endl;
        return yarp::os::Thread::start();
    }

    virtual bool stop()
    {
        std::cout << "Stopping read thread..." << std::endl;
        return yarp::os::Thread::stop();
    }

    virtual void run()
    {
        while (!isStopping())
        {
            mc.read();
        }
    }

    const MyClass & mc;
};

struct WriteThread : public yarp::os::RateThread
{
    WriteThread(const MyClass & _mc) : yarp::os::RateThread(10), mc(_mc) {}

    virtual bool start()
    {
        std::cout << "Starting write thread..." << std::endl;
        return yarp::os::RateThread::start();
    }

    virtual void stop()
    {
        std::cout << "Stopping write thread..." << std::endl;
        yarp::os::RateThread::stop();
    }

    virtual void run()
    {
        mc.write();
    }

    const MyClass & mc;
};

int main()
{
    yarp::os::Network yarp;

    MyClass mc;
    ReadThread rt(mc);
    WriteThread wt(mc);

    std::cout << "Initializing threads..." << std::endl;

    if (!rt.start())
    {
        std::cout << "Unable to start read thread." << std::endl;
        return 1;
    }

    if (!wt.start())
    {
        std::cout << "Unable to start write thread." << std::endl;
        return 1;
    }

    std::cout << "Setting thread priority..." << std::endl;

    // play with these
    if (rt.setPriority(0, 0) == -1 || wt.setPriority(99, 1) == -1)
    {
        std::cout << "The given priority cannot be set." << std::endl;
    }
    else
    {
        std::cout << "Waiting..." << std::endl;
        yarp::os::Time::delay(5);
    }

    std::cout << "Stopping threads..." << std::endl;

    rt.stop();
    wt.stop();

    std::cout << "Exiting..." << std::endl;

    return 0;
}
cmake_minimum_required(VERSION 3.5 FATAL_ERROR)
project(yarp-threads LANGUAGES CXX)
find_package(YARP REQUIRED)
add_executable(main main.cpp)
target_link_libraries(main YARP::YARP_OS YARP::YARP_init)
PeterBowman commented 6 years ago

I'm unable to boost the write thread's priority so that the wait lock is minimized. Info:

It looks like the priority for SCHED_NORMAL (SCHED_OTHER?) cannot be other than 0. Also, changing to SCHED_FIFO requires sudo.

jgvictores commented 6 years ago

changing to SCHED_FIFO requires sudo

I guess we could play around with groups/permissions but yes, I ran into that sort of issue some years ago.

PeterBowman commented 6 years ago

Alternatively, use yarp::os::Time::yield() (same as yarp::os::Thread::yield()):

I guess we could try to put a yield() call in the body of the Thread::run() method. Moreover, this is what RateThread::run() (aka PeriodicThread) does internally according to the docs:

Note: after each run is completed, the thread will call a yield() in order to facilitate other threads to run.

See implementation.

PeterBowman commented 6 years ago

Since yarp::os::RateThread calls yield internally, this snippet should resemble more closely what CanBusControlboard is actually doing:

#include <atomic>
#include <iostream>
#include <limits>

#include <yarp/os/Mutex.h>
#include <yarp/os/Network.h>
#include <yarp/os/Thread.h>
#include <yarp/os/Time.h>

void doWork()
{
    int c = 0;

    for (int i = 0; i < 1000; i++)
    {
        c *= 2;
    }
}

struct MyClass
{
    MyClass() : concurrentReads(0) {}

    void read() const
    {
        double start = yarp::os::Time::now();
        ioReady.lock();
        concurrentReads++;
        doWork();
        std::cout << "read-wait: " << (yarp::os::Time::now() - start) << std::endl;
        ioReady.unlock();
    }

    void write() const
    {
        double start = yarp::os::Time::now();
        std::cout << "write-wait" << std::endl;
        concurrentReads = 0;
        ioReady.lock();
        std::cout << "write-wait: " << (yarp::os::Time::now() - start) << " (" << concurrentReads << ")" << std::endl;
        ioReady.unlock();
    }

    mutable yarp::os::Mutex ioReady;
    mutable std::atomic<int> concurrentReads;
};

struct ReadThread : public yarp::os::Thread
{
    ReadThread(const MyClass & _mc) : mc(_mc) {}

    virtual bool start()
    {
        std::cout << "Starting read thread..." << std::endl;
        return yarp::os::Thread::start();
    }

    virtual bool stop()
    {
        std::cout << "Stopping read thread..." << std::endl;
        return yarp::os::Thread::stop();
    }

    virtual void run()
    {
        while (!isStopping())
        {
            mc.read();
            yarp::os::Thread::yield();
            //double start = yarp::os::Time::now();
            //yarp::os::Time::delay(std::numeric_limits<double>::min());
            //std::cout << "delay-wait: " << (yarp::os::Time::now() - start) << std::endl;
        }
    }

    const MyClass & mc;
};

int main()
{
    std::cout << std::numeric_limits<double>::min() << std::endl;
    yarp::os::Network yarp;

    MyClass mc;
    ReadThread rt(mc);

    std::cout << "Initializing thread..." << std::endl;

    if (!rt.start())
    {
        std::cout << "Unable to start read thread." << std::endl;
        return 1;
    }

    for (int i = 0; i < 20; i++)
    {
        mc.write();
        yarp::os::Time::delay(0.25);
    }

    std::cout << "Stopping thread..." << std::endl;

    rt.stop();

    std::cout << "Exiting..." << std::endl;

    return 0;
}

I introduced an atomic variable (requires C++11) just to count how many times the read thread predates execution of the already waiting write thread. This should be zero, but in fact reaches even a few thousands (!). However, if I introduce a minimal delay, it behaves as expected.

To sum up, I guess everything boils down to the read thread consuming too many resources on its own. Running the previous snippet leads to a 80+% CPU usage on my PC, but with said delay it drops down to 20%. Of course, more code that we want to execute inside the fast read loop (even the stdout operations can pose a notable difference) translates to a bigger chance for the processor to "lend" the slower write loop some time slices and do its work. I'd like to compare current performance with that prior to #158, then introduce that small delay.

Note that even the tiniest delay must take into account system calls. For instance, it takes around 0.05 milliseconds to process the delay from the previous snippet on my PC. See https://stackoverflow.com/q/18071664.

Also, an interesting read about condition_variable: https://stackoverflow.com/a/10975146.

PeterBowman commented 6 years ago

Another note to self: check whether I can increment the read timeout, and how it affects CPU usage (is it equivalent to a sleep?).

PeterBowman commented 4 years ago

In case I ever come across this again: yarp::os::Time::delay(0); does not solve anything, it is best to add a minimum delay (std::numeric_limits<double>::min()) instead, as stated in the last example.

PeterBowman commented 4 years ago

Also, YARP's PeriodicThread class (former RateThread) steals 100% CPU time on a single read thread with tiny periods (anything more precise than 1 microsecond is not feasible): https://github.com/roboticslab-uc3m/yarp-devices/commit/d2d1da505a6d388d6699e6edbdeb55c299785e39.

Edit: see https://github.com/roboticslab-uc3m/yarp-devices/issues/231#issuecomment-575696298.