Silent but deadly bug in determine_input_device() function

GoogleCodeExporter commented 9 years ago

Hello,

Thanks for writing a great Linux keylogger. It has been very educational for me 
to read through the code, to understand how to read input from any keyboard 
attached to the system, regardless of which program has input focus. This is 
very useful for embedded systems, where you need a simple method to control a 
device, e.g. by plugging a USB keyboard into a hidden USB port.

In the logkeys-0.1.1a source code, I happened to notice what appears to be a 
deadly bug, just waiting to blow up.

On line 349 of logkeys.cc, you have the following code:

args.device = const_cast<char*>(input_dev_index.str().c_str());

If we analyze what is happening here, we see that:

1) input_dev_index is a std::stringstream object, created on the stack, and 
therefore will be destroyed when the function returns.
2) Calling input_dev_index.str() returns a possibly temporary std::string 
object.
3) Calling input_dev_index.str().c_str() returns a char pointer, pointing to 
the data in the contents of the temporary input_dev_index object (or a 
temporary std::string object).
4) When the input_dev_index variable destructs when it goes out of scope, any 
char pointers to the stringstream data will be invalid. Later, when new memory 
is allocated, the new memory blocks may occupy the bytes formerly occupied by 
the stringstream data. Accessing args.device will of course read from the 
undefined memory contents, and your program will have unpredictable behaviour.

Your program should already be producing unpredictable behaviour. The fact that 
it isn't is possibly because you are not allocating any new memory after 
calling the determine_input_device() function. However, if this changes in the 
future, you will see problems.

Suggested solutions:
1) Create a global std::string gInputDevice object. Copy the contents of 
input_dev_index into the string [gInputDevice = input_dev_index.str()] and then 
set args.device = gInputDevice.c_str(). This will work because the string will 
not be destroyed when determine_input_device() returns and the args.device 
pointer will remain valid.
2) Instead of using char pointers in the args structure, use char arrays. Copy 
the actual string values into the structure using strcpy(). This is the most 
robust solution, because then you can have multiple copies of the args 
structure, each with different values. In the existing code, if you have 
multiple copies, changing the contents of one of the char arrays pointed to 
will cause all copies of args to be affected.

Hope this helps. Keep up the great work.

Regards,
Markus.

Original issue reported on code.google.com by msvil...@gmail.com on 2 Aug 2010 at 3:35

GoogleCodeExporter commented 9 years ago

Hi,

Thank you very much.

The code was first crafted as a single main() function, and was only later 
broken into logical functions. /me n3wb. I guess it shows. :)
I think I was wondering myself how it works regardless of variable going out of 
scope, but it worked reliably. (If it wasn't this segment I had in mind, then 
there must be some other.) I think I understand the problem now, though.

I shall do as you suggested in 1), because I see no reason for "multiple copies 
of the args struct." It is only a single instance holding needed values 
globally available to program functions. Furthermore, should I do 2), then I 
would have to preassign enough space for these paths (e.g. char 
device[PATH_MAX];), which is IMHO a bit redundant.

Thanks again! :-)

Original comment by kernc...@gmail.com on 2 Aug 2010 at 4:05

Changed state: Accepted

GoogleCodeExporter commented 9 years ago

My pleasure :-)

Just a thought -- you could also change the char * fields in args to 
std::string. This way, you wouldn't have to preallocate large char arrays to 
ensure you can store anything the user might want to stuff in there.

Benefits of using std::string are two-fold:
- easy
- unlimited length of argument values (well, limited by available memory only)
- able to have multiple independent copies of args structure

I know you said you don't expect to have multiple copies of args, but in my 
experience it's always good to think ahead for these kinds of cases, especially 
if it takes a trivial amount of effort. The code can grow and change in ways 
you would never predict, and making it bullet proof early can save you lots of 
time down the road. Anyway, it's totally your call, just sharing my 2 cents :-)

Potential disadvantages:
- overhead of using std::string and dynamic allocations (these would be trivial 
in the case of logkeys, so not really a disadvantage)

Also, if you end up deciding to make logkeys support multiple keyboards, each 
thread monitoring a keyboard may need its own copy of args, due to unique log 
file name assigned to each keyboard, or other reasons. Having a truly copy-able 
args structure would help in this case.

Regards,
Markus.

Original comment by msvil...@gmail.com on 2 Aug 2010 at 4:15

GoogleCodeExporter commented 9 years ago

args' string members are now strings since r73.

thanks. ;)

Original comment by kernc...@gmail.com on 19 Aug 2010 at 4:46

Changed state: Fixed

JackieXie168 / logkeys

Silent but deadly bug in determine_input_device() function #37