Xilinx / qemu

Xilinx's fork of Quick EMUlator (QEMU) with improved support and modelling for the Xilinx platforms.
https://xilinx-wiki.atlassian.net/wiki/spaces/A/pages/821395464/QEMU+User+Documentation
Other
241 stars 154 forks source link

Memory corruption bug in fdt_generic_util.c #17

Closed Bill-Paul closed 8 years ago

Bill-Paul commented 8 years ago

Recently I decided to try building the Xilinx QEMU simulator so that I could experiment with the MPSoC virtual machine. Along the way I discovered a memory corruption bug in the code used to parse the dtb files. The reason I found this bug is because I tried to build the code on FreeBSD rather than Linux.

Getting the code to build on FreeBSD was a bit tricky in the first place because it seems it's only been developed/tested on Linux (it's not clear if you support Windows too), As a result, there are some Linux-isms and bitrot which have crept into the code. For example:

1) The build instructions on the Xilinx web site tell you to build only the aarch64 and microblaze simulators, Unfortunately the focus on just these two virtual machines has resulted in bitrot which prevents the other simulators for compiling. Try not restricting the built to those specific targets, and it will fall over in hw/acpi/pcihp.c due to a type mismatch.

2) The function memory_region_do_set_ram() in memory.c depends on Linux-specific APIs. In the official QEMU source tree this function is restricted to Linux using conditional compilation. I dealt with this temporarily by commenting out the call to qemu_ram_alloc_from_file() in this function. This routine doesn't seem to be necessary for the expected simulator use cases anyway.

3) qemu-nand-creator.c unconditionally uses lseek64(). This is a common cross-platform portability issue (I'm not sure if the fault lies with POSIX or Sun Microsystems).. But on FreeBSD, there is no lseek64() because lseek() already handles 64-bit offsets.

4) The added DTB parsing code has references to the function rawmemchr(). This function is a GNU libc extension that doesn't exist on other platforms.

I managed to overcome these issues and got the code to build on my FreeBSD/amd64 9.2-RELEASE system. Then I downloaded the zcu102-arm.dts file, compiled it with the dtc tool, and tried to run the aarch64 simulator using the command:

qemu-system-aarch64 -M arm-generic-fdt -hw-dtb zcu102-arm.dtb

Unfortunately the simulator crashed during initialization.

It turns out this is due to the following:

In hw/core/fdt_generic_util.c, you have the following code:

[...] typedef struct QEMUIRQSharedState { qemu_irq sink; int num; bool (_mergefn)(bool *, int); / FIXME: remove artificial limit */

define MAX_IRQ_SHARED_INPUTS 16

bool inputs[MAX_IRQ_SHARED_INPUTS];

} QEMUIRQSharedState; [...]

[...] static void qemu_irq_shared_handler(void opaque, int n, int level) { QEMUIRQSharedState s = opaque;

s->inputs[n] = level;
qemu_set_irq(s->sink, s->merge_fn(s->inputs, s->num));

} [...]

Notice that the inputs[] array is 16 elements in size. It turns out however that when using the zcu102-arm.dts machine description, the number of possible 'inputs' is much grater than 16. It's more like 48 or so. However if qemu_irq_shared_handler(() is value for n which is greater than 15, then this line of code causes heap corruption due to writing outside the array bounds:

s->inputs[n] = level;

It happens that this heap corruption does not crash the simulator on Linux, but this is just out of sheer luck. On FreeBSD, it causes some of the IRQState structures to become corrupted such that the merge_fn pointer is invalid, and the program eventually tries to branch to into the weeds.

I fixed this by changing

define MAX_IRQ_SHARED_INPUTS 16

to

define MAX_IRQ_SHARED_INPUTS 256

I don't know what the shared input limit is, but 256 slots seems to be ok for now.

Sorry I don't have any patches for you, but the fix for the heap corruption is a one-liner as you can see above.

alistair23 commented 8 years ago

Hello Bill-Paul,

First of all I'm glad you decided to use QEMU I'm sorry you had difficulties building and running it. Internally we build, test and develop on Linux machines. Which is why there is little to no consideration on running on BSD or Windows hosts. Recently we have started adding some support for Windows hosts, although this is limited at the moment.

We know the code base is starting to age on targets and devices that we don't use and we are working on fixing this. Hopefully we will have a newer QEMU release later in the year.

  1. I completely agree with you. I inherited this inability to build other platforms and just haven't had the time to spend on fixing it. We are working on improving the QEMU code though.
  2. I'm sorry about that. Hopefully when we update our tree based on a newer mainline version this will be fixed.
  3. Do you want to write a patch for this?
  4. I can look into this and see what I can do to fix the problem.

Wow! Good catch. I can't believe we haven't seen this before. Thanks for pointing this out. I will write-up a fix.

Thanks for the feedback, we are always happy to hear it. We are trying hard to improve QEMU and any input is appreciated.

Thanks, Alistair

Bill-Paul commented 8 years ago

Regarding the lseek64()/lseek() thing, sadly I don't really have a patch, because I'm not exactly certain what the correct fix would be. For now I changed the source to just use lseek() so that it would build on FreeBSD. I was told the correct way to address this is:

include

...

if (_POSIX_V6_ILP32_OFF32 > 0) || (_POSIX_V7_ILP32_OFF32 > 0)

...lseek64(...

else

...lseek(...

endif

But I'm not 100% sure of this.

It's possible that for the other machine types the number of inputs never exceeds 15, which means it could only be an issue with the MPSoC simulation. You can see it by adding a printf() in qemu_irq_shared_handler() to display the value of 'n' each time it's called. With the zcu102 machine description it definitely shows values larger than 15. I built the code on an Ubuntu machine too just to make sure.

Anyway, with the fixes in place, I was able to run the simulator on my FreeBSD host and boot the pre-compiled ZCU102 U-Boot and Linux images, so I'm good for now. I have been using QEMU to for VxWorks development and debugging for quite some time and it's always come in very handy, so I was very interested in using it for the Xilinx platforms too.

-Bill

alistair23 commented 8 years ago

Great! I'm glad it works for you after those issues. I'll path the memory issue early next week and then I'll try to look into the lseek problem as well. Thanks, Alistair