Open felipetrentin opened 2 months ago
(but probably in not a very good way lol)
it turns out that ´rhsp_getInterfacePacketID´ does a transaction with the hub in the following order:
the code was failing in step 2 where it would check if there was an interface after querying.
more especifically, the code was failing in getInterfaceByName
where there is a string comparison. This could mean a couple of things:
getInterfaceByName
was wrong and was accessing non allocated memory;First of all, in getInterfaceByName
there is a comparison between int
and size_t
that could yield an infinite loop. Thus, i changed the function to be:
static const RhspModuleInterface* getInterfaceByName(const RhspModuleInterfaceListInternal* list,
const char* interfaceName)
{
for (size_t i = 0; i < list->numberOfInterfaces; i++)
{
if (strcmp(interfaceName, list->interfaces[i].name) == 0)
{
return &list->interfaces[i];
}
}
return NULL;
}
take note that if the value of list->numberOfInterfaces
is wrong, the loop will try accessing an illegal part of the memory.
as the bug was still not fixed, I focused in understanding what was happening inside rhsp_queryInterface
, since a wrong value inside this function could lead to an invalid data structure that would in turn fail when processed by getInterfacePacketIdInternal
my first suspicion was that some string could be missing a null terminator, causing string operators to fail. thus, i added this line after the allocation of memory to intf_.name
and memcpy
intf_.name[interfaceNameLength-1] = 0;
this also did not fix the issue, despite not breaking anything else.
Therefore, the only other suspicion was that some garbage value was being stored in list->numberOfInterfaces resulting in a out of bounds memory acces in getInterfaceByName
. The question was "where is this value set and changed, and what could make it out of bounds?"
then I turned my focus to rhsp_queryInterface
where this value probably was last changed before failing. Upon further inspection, I found that it called addInterface
as I had already confirmed that the values set before the calling of this function were right, it was the main culprit for the segmentation fault. It turns out, that if hub->interfaceList
is a null pointer, the function allocates its memory but never changes the values. The code probably worked using other compilers, but running Colcon and, in the background, GCC, makes such that malloc
only allocates memory and does not clean it.
so it was possible that numberOfInterfaces
was not zero after allocation.
so, to test if that was the case i added the following line after the allocation:
((RhspModuleInterfaceListInternal*) hub->interfaceList)->numberOfInterfaces = 0;
And it worked!! but the code is still unsafe. There are a lot of arguments that could result in big accidents like these. This is still not good enough for a competition-ready robot running ROS2. I hope this brought some light to the issue and maybe helped someone. I would be interested in contributing to this repository. I would also suggest to make librhsp separate from this repository, since it has very different goals in mind.
Wow! Thank you for reporting and investigating this bug. We're not actively working on this library at the moment, but we do plan to return to it, and I'm sure we'll want to dive into this then.
Moving librhsp
to its own repo is something we want to do at some point (there's some historical reasons why that's not currently the case).
My configuration:
What I'm trying to accomplish:
I'm trying to use a REV Expansion Hub to control a robot using ROS2 Humble by creating a custom ROS node in C++
I was able to connect to the hub and change the LED color, by using
rhsp_setModuleLedColor
by using the library tests as an example.The problem:
rhsp_setServoConfiguration
,rhsp_setMotorChannelMode
,rhsp_getEncoderPosition
and other commands give the same segmentation fault.it appears as any command containing
rhsp_getInterfacePacketID
creates the same behavior, apparently caused by thestrcmp
ingetInterfaceByName
debugging in GDB i get the following output:
Thanks in advance.