bitcraze / crazyflie-link-cpp

MIT License

Bootloader very slow when using cpp link #10

Open ataffanel opened 3 years ago

ataffanel commented 3 years ago

When using the cpp link, bootloading a Crazyflie is very slow:

$ time make cload
Python:
real    0m31.702s
user    0m3.389s
sys     0m1.176s
CPP:
real    4m42.991s
user    0m7.673s
sys     0m18.203s

The behavior seems to come from sendPacket, which can either take 1 second or run in quick bursts of ~30 packets. This looks like mutex contention, so I tried commenting out this mutex lock: https://github.com/bitcraze/crazyflie-link-cpp/blob/e6043eb5025325b720b410b5f50e4901c86e2860/src/CrazyradioThread.cpp#L227. Doing so solves the problem: I can then bootload as fast as with the Python link.

Why does this mutex exist? I thought there was only one thread accessing each radio, which would mean radio access does not need to be protected by a lock. I will continue investigating; tell me if you have any thoughts @whoenig.

jonasdn commented 3 years ago

If we try to avoid holding the mutex across radio.sendPacket:

From aa226b89b569567ab465324d3ae2f2c408b18958 Mon Sep 17 00:00:00 2001
From: Jonas Danielsson <jonas@bitcraze.io>
Date: Tue, 11 May 2021 09:57:27 +0200
Subject: [PATCH] wip-mutex-foo
---
 src/CrazyradioThread.cpp | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)
diff --git a/src/CrazyradioThread.cpp b/src/CrazyradioThread.cpp
index 9a8cad2..d8f840b 100644
--- a/src/CrazyradioThread.cpp
+++ b/src/CrazyradioThread.cpp
@@ -224,21 +224,31 @@ void CrazyradioThread::run()
                 }
             } else {
                 // no safelink
-                const std::lock_guard<std::mutex> lock(con->queue_send_mutex_);
                 if (!con->queue_send_.empty())
                 {
-                    const auto p = con->queue_send_.top();
-                    ack = radio.sendPacket(p.raw(), p.size());
-                    ++con->statistics_.sent_count;
-                    if (ack)
+                    Packet p;
                     {
+                        const std::lock_guard<std::mutex> lock(con->queue_send_mutex_);
+                        p = con->queue_send_.top();
                         con->queue_send_.pop();
+                        ++con->statistics_.sent_count;
+                    }
+
+                    ack = radio.sendPacket(p.raw(), p.size());
+
+                    if (!ack)
+                    {
+                        const std::lock_guard<std::mutex> lock(con->queue_send_mutex_);
+                        con->queue_send_.push(p);
                     }
                 }
                 else if (con->useAutoPing_)
                 {
                     ack = radio.sendPacket(ping, sizeof(ping));
-                    ++con->statistics_.sent_count;
+                    {
+                        const std::lock_guard<std::mutex> lock(con->queue_send_mutex_);
+                        ++con->statistics_.sent_count;
+                    }
                 }
             }

-- 
2.30.2    

Then the difference seems to go away:

CPP:
real    0m28.404s
user    0m0.950s
sys     0m0.722s

Python:
real    0m31.527s
user    0m3.362s
sys     0m0.911s

Is this a safe change?

whoenig commented 3 years ago

I am still a bit surprised by the very large time difference. Are we sure that the Python bindings are actually built in release mode?

ataffanel commented 3 years ago

This is happening with the build from PyPI, which should be a release build. Furthermore, it does seem to be fixed by #11 in both debug and release mode. This really behaved like lock contention (i.e. random throughput) and not like a slow debug build.