thesofproject / sof

Sound Open Firmware
Other
514 stars 303 forks source link

[HELP] Trying to setup qemu emulators for development environment #3007

Open crazoes opened 4 years ago

crazoes commented 4 years ago

Steps undertaken:

Build toolchain

Following instructions on https://thesofproject.github.io/latest/getting_started/build-guide/build-from-scratch.html#id9

Build firmware

./scripts/xtensa-build-all.sh byt

Build qemu

./configure --prefix=. --target-list=xtensa-softmmu,x86_64-softmmu --enable-gtk --enable-sdl --enable-spice --audio-drv-list=alsa --enable-libusb --enable-usb-redir --enable-coroutine-pool --disable-opengl --enable-fdt
make

Build kernel (sof's fork)

make defconfig
./kconfig/kconfig-sof-default.sh

# enable following VIRTIO config options
CONFIG_VIRTIO_BLK=y
CONFIG_VIRTIO_NET=y
CONFIG_VIRTIO_CONSOLE=y
CONFIG_VIRTIO_PCI=y

Build rootfs image

  1. Download base image from yocto
  2. Create a mount point
    mkdir host/
  3. Mount the image
    sudo mount core-image-sato-qemux86-64.ext4 host/
  4. Install the modules on rootfs
    sudo make modules_install INSTALL_MOD_PATH=host
  5. Copy firmware files to rootfs
    sudo cp sof/build_byt_gcc/sof-byt.ri host/lib/firmware/intel/sof/sof-byt.ri
  6. Unmount
    sudo umount host/

Booting xtensa-host

./xtensa-host.sh byt

Booting host

./x86-host.sh byt ../linux/arch/x86/boot/bzImage ../core-image-minimal-dev-qemux86-64.ext4

Error / Problem

As per alsa wiki successful boot should look something like this:

000502:904473: shim.io: write 0x60 =    0x00000001 (old 0x0)    (00000001)      |....|
000502:904477: irq: SHIM_IMRLPESC masking 1 mask 1 active 0
000502:904493: trace.io: timestamp 0x00000643 delta 0x00000095 000502:904494: ipc SNo
000502:904508: shim.io: read at 74 val 0x400000c2
000502:904511: shim.io: write 0x74 =    0x000000c2 (old 0x400000c2)     (00000194)      |....|
000502:904515: shim.io: read at 60 val 0x00000001

However, what I see on xtensa-host and x86-host are as follows:

xtensa-host

x86-host

On the host, I can see the following error messages when grepped on dmesg error

If anyone can suggest how to solve this or if you spot any errors in the process that'd be great.

paulstelian97 commented 4 years ago

@dianacretu Do you have any tips? I can't seem to find the proper documentation?

dbaluta commented 4 years ago

@dianacretu @paulstelian97 not sure we can help with that because we only looked at DSP emulation so far. I remember that some time ago when I looked at running the emulation on Intel platforms I got stuck at the same problem.

@ranj063 @lgirdwood should we try with other Intel platform? What is the most stable platform to experiment with qemu?

crazoes commented 4 years ago

Is there any alternate setup for development and testing? Do you have a different setup that doesn't require qemu @paulstelian97 ?

I've spent past week trying to complete setup and start with the my outreachy project.

I've seen @ktrzcinx working on improving debugging as well. Maybe he can tell us what setup he has for development?

paulstelian97 commented 4 years ago

@crazoes I generally test on actual boards, and we have at NXP an internal setup that allows remote access to time shared boards on which I can run an arbitrary kernel + dtb (ARM) + firmware combinations for testing. I'm not sure that you have this option though, that's why qemu is recommended. And I tagged Diana because she is adding support for the NXP specific hardware in qemu.

ktrzcinx commented 4 years ago

Well, as usual I'm also using company internal setup for build and testing. Sometimes for tests on full stack I use real hardware. So I'm not a QEMU user actually. It's rather question for folks who maintain CI.

lgirdwood commented 4 years ago

@crazoes your flow looks corrct, but sometime we have seen issues with stale posix SHM and MQs in the past. Can you try and delete /dev/shm/qemu- and /dev/mqueue/qemu-

You should be able to confirm that the correct IPC registers and memories are correct via checking the SHM e.g.

hexdump -c /dev/shm/qemu-<name of memory> | less

Interrupts (and reset control) are sent via POSIX message queue, so the FW should send a "FW ready" message IRQ to the host as a MQ (I can see this in the log), and it looks like the host does get the message.

It does look like the ACPI/PCI IRQ number being used by the virtualised host driver is wrong. I have seen this before with different qemu releases. dmesg may show "irq: nobody cared" message as the code will emit an IRQ or the IRQ will be handled by another driver (with nothing in dmesg).

cat /proc/interrupts

Will show the current IRQs for each device, I would assume the count for one driver will increment by one when the DSP has booted. This will need to be mapped to the DSP by changing the value in qemu ACPI table code.

dbaluta commented 4 years ago

@crazoes I have tried your instructions on my machine and it seems that the firmware can boot correctly.

Here is the log for sof that I get:

[    0.176321] Performance Events: unsupported p6 CPU model 15 no PMU driver, software events only.
[    2.016177] sof-audio-acpi 80860F28:00: ACPI DSP detected
[    2.016903] sof-audio-acpi 80860F28:00: IMR not set by BIOS. Ignoring
[    2.018056] sof-audio-acpi 80860F28:00: Force to use nocodec mode
[    2.019292] sof-audio-acpi 80860F28:00: Using nocodec machine driver
[    2.021236] sof-audio-acpi 80860F28:00: loading firmware
[    2.022901] sof-audio-acpi 80860F28:00: request_firmware intel/sof/sof-byt.ri successful
[    2.024413] sof-audio-acpi 80860F28:00: firmware doesn't contain extended manifest
[    2.025863] sof-audio-acpi 80860F28:00: header size=0x14fcc modules=0x1 abi=0x1 size=16
[    2.030774] sof-audio-acpi 80860F28:00: new module size 0x14fc0 blocks 0xf type 0x0
[    2.031429] sof-audio-acpi 80860F28:00: block 0 type 0x1 size 0x16c ==>  offset 0xc0000
[    2.032043] sof-audio-acpi 80860F28:00: block 1 type 0x1 size 0x16c ==>  offset 0xc0400
[    2.032628] sof-audio-acpi 80860F28:00: block 2 type 0x1 size 0x8 ==>  offset 0xc05bc
[    2.033168] sof-audio-acpi 80860F28:00: block 3 type 0x1 size 0x8 ==>  offset 0xc05dc
[    2.033710] sof-audio-acpi 80860F28:00: block 4 type 0x1 size 0x4 ==>  offset 0xc05fc
[    2.034248] sof-audio-acpi 80860F28:00: block 5 type 0x1 size 0x4 ==>  offset 0xc061c
[    2.034827] sof-audio-acpi 80860F28:00: block 6 type 0x1 size 0x8 ==>  offset 0xc063c
[    2.035369] sof-audio-acpi 80860F28:00: block 7 type 0x1 size 0x4 ==>  offset 0xc0658
[    2.035909] sof-audio-acpi 80860F28:00: block 8 type 0x1 size 0x18 ==>  offset 0xc065c
[    2.036509] sof-audio-acpi 80860F28:00: block 9 type 0x1 size 0x8 ==>  offset 0xc067c
[    2.037117] sof-audio-acpi 80860F28:00: block 10 type 0x1 size 0x11810 ==>  offset 0xc06c0
[    2.038194] sof-audio-acpi 80860F28:00: block 11 type 0x2 size 0x21c8 ==>  offset 0x100008
[    2.038770] sof-audio-acpi 80860F28:00: block 12 type 0x2 size 0x1c ==>  offset 0x1021d0
[    2.039341] sof-audio-acpi 80860F28:00: block 13 type 0x2 size 0x1068 ==>  offset 0x1021f0
[    2.039895] sof-audio-acpi 80860F28:00: block 14 type 0x2 size 0x194 ==>  offset 0x1070b8
[    2.040479] xxx: snd_sof_run_firmware ..
[    2.049362] sof-audio-acpi 80860F28:00: booting DSP firmware
[    2.076794] sof-audio-acpi 80860F28:00: ipc: DSP is ready 0x70000000 offset 0x144000
[    2.077429] sof-audio-acpi 80860F28:00: Firmware info: version 1:5:0-e9cb8
[    2.077888] sof-audio-acpi 80860F28:00: Firmware: ABI 3:16:0 Kernel ABI 3:16:0
[    2.078524] sof-audio-acpi 80860F28:00: found ext header type 1 size 0xa0
[    2.078982] sof-audio-acpi 80860F28:00: found ext header type 2 size 0x5c
[    2.079389] sof-audio-acpi 80860F28:00: Firmware info: used compiler XCC 12:0:8 xtensa-byt-elf used optimization flags Os
[    2.080038] sof-audio-acpi 80860F28:00: found ext header type 3 size 0x1c
[    2.080444] sof-audio-acpi 80860F28:00: warning: unknown ext header type 3 size 0x1c
[    2.081009] sof-audio-acpi 80860F28:00: found ext header type 4 size 0x10
[    2.081415] sof-audio-acpi 80860F28:00: warning: unknown ext header type 4 size 0x10
[    2.082017] sof-audio-acpi 80860F28:00:  mailbox upstream 0x144000 - size 0x400
[    2.082554] sof-audio-acpi 80860F28:00:  mailbox downstream 0x144400 - size 0x400
[    2.083100] sof-audio-acpi 80860F28:00:  stream region 0x144a00 - size 0x200
[    2.083565] sof-audio-acpi 80860F28:00: firmware boot complete
[    2.588429] sof-audio-acpi 80860F28:00: error: ipc timed out for 0x90030000 size 80
[    2.590084] sof-audio-acpi 80860F28:00: info: preventing DSP entering D3 state to preserve context
[    2.592416] sof-audio-acpi 80860F28:00: error: unexpected fault 0x70028800 trace 0x00000000
[    2.594100] sof-audio-acpi 80860F28:00: error: ipc host -> DSP: pending no complete no raw 0x4000000000000000
[    2.595890] sof-audio-acpi 80860F28:00: error: mask host: pending no complete no raw 0x0
[    2.597527] sof-audio-acpi 80860F28:00: error: ipc DSP -> host: pending no complete yes raw 0x70028800
[    2.598855] sof-audio-acpi 80860F28:00: error: mask DSP: pending no complete no raw 0x0
[    2.599401] sof-audio-acpi 80860F28:00: error: can't set params for DMA for trace -110
[    2.599944] sof-audio-acpi 80860F28:00: warning: failed to initialize trace -110
[    2.600535] sof-nocodec sof-nocodec: info: override BE DAI link NoCodec-0
[    2.600947] sof-nocodec sof-nocodec: info: override BE DAI link NoCodec-1
[    2.601354] sof-nocodec sof-nocodec: info: override BE DAI link NoCodec-2
[    2.602190] sof-audio-acpi 80860F28:00: Topology: ABI 3:16:0 Kernel ABI 3:16:0
[    2.602784] sof-audio-acpi 80860F28:00: warning: widget type 1 name codec_out0 not handled
[    3.108431] sof-audio-acpi 80860F28:00: error: ipc timed out for 0x30010000 size 76
[    3.110091] sof-audio-acpi 80860F28:00: info: preventing DSP entering D3 state to preserve context
[    3.112384] sof-audio-acpi 80860F28:00: error: unexpected fault 0x70028800 trace 0x00000000
[    3.114018] sof-audio-acpi 80860F28:00: error: ipc host -> DSP: pending no complete no raw 0x4000000000000000
[    3.115814] sof-audio-acpi 80860F28:00: error: mask host: pending no complete no raw 0x0
[    3.117465] sof-audio-acpi 80860F28:00: error: ipc DSP -> host: pending no complete yes raw 0x70028800
[    3.118723] sof-audio-acpi 80860F28:00: error: mask DSP: pending no complete no raw 0x0
[    3.119264] sof-audio-acpi 80860F28:00: error: DSP failed to add widget id 0 type 23 name : PCM0P stream Low Latency Playback 0 reply 0
[    3.119919] sof-audio-acpi 80860F28:00: ASoC: failed to load widget PCM0P
[    3.120322] sof-audio-acpi 80860F28:00: error: tplg component load failed -110
[    3.120847] sof-audio-acpi 80860F28:00: error: failed to load DSP topology -22
[    3.121370] sof-audio-acpi 80860F28:00: ASoC: failed to probe component -22
[    3.121788] sof-nocodec sof-nocodec: ASoC: failed to instantiate card -22
[    3.122225] sof-nocodec: probe of sof-nocodec failed with error -22

I do get an error when trying to load the topology, which I'm now investigating. Not sure what could be the problem on your side. Do you have the kvm module installed?

dbaluta commented 4 years ago

@crazoes OK, so i did more research. It seems that I can reproduce your exact trace when CONFIG_TRACEM is selected when compiling the firmware.

So, we have now two ways of continuing:

1) With CONFIG_TRACEM selected I think the DSP crashes somewhere. Maybe some memory allocation is not enough. 1) Without CONFIG_TRACEM selected there is a problem with loading the topology. But what I think it happens is that SOF firmware (sof-byt.ri) crashes somewhere. :)

I think this is a good start for investigation.

crazoes commented 4 years ago

@dbaluta strange but I don't have CONFIG_TRACEM set. I am still not able to load the firmware ( not sure why it is loading for you ). How are you compiling your firmware? Using script or manually?

dbaluta commented 4 years ago

@crazoes first time i compile the .ri using the script.

Then to modify CONFIG_TRACEM in sof repo I use make menuconfig, and from now on I only run 'make -j8' to compile the firmware.

So, can you try both with CONFIG_TRACEM set and unset to see what's the result? Also, make sure you have kvm enabled.

crazoes commented 4 years ago

@dbaluta I do have kvm enabled as per this line https://github.com/thesofproject/qemu/blob/sof-v4.2/x86-host.sh#L79

Also, I tried to set and unset CONFIG_TRACEM but got the same result both the time. No change in the output. I can still see firmware boot failure.

I do the same process as you did to compile the firmware ( first using script then make menuconfig ) If I am not wrong, CONFIG_TRACEM is the Trace mailbox.

dbaluta commented 4 years ago

@crazoes Ok, thanks. I couldn't figure what's the issue. So lets take it step by step. Can you first try to see if SOF firmware boots with the fuzzer? I have forwarded you an email on how to start the fuzzer.

lgirdwood commented 4 years ago

@crazoes please try and run the xtensa-host.sh with -t to debug boot issues. You can also use the tracing script too, please apply this update

diff --git a/xtensa-trace.sh b/xtensa-trace.sh
index 78fcc0fcf5..c1df905044 100755
--- a/xtensa-trace.sh
+++ b/xtensa-trace.sh
@@ -9,12 +9,38 @@
 #
 # 2) Now run this script.
 #
+# ./xtensa-trace.sh zephyr.lst [bootloader.lst] [rom.lst]
+#

 # Find all xtensa function "entry" instructions.
-grep entry dump.txt > entry.txt
+grep 'entry\|retw' dump.txt > entry.txt
+cat $1 $2 $3 > dis.txt
+
+let indent=1

 # Now find the function name for each "entry" instruction address.
 while read input
 do
-       grep `echo $input | cut -f1 -d: | cut -c 3-10` /home/lrg/work/zephyrproject/zephyr/build/zephyr/zephyr.lst | grep -v call | grep -v entry -m 1
+       #echo -n "searching for function at "
+       #echo $input | cut -f1 -d: | cut -c 3-10
+
+       for (( c=1; c<=indent; c++ ))
+       do
+               echo -n " "
+       done
+
+       #echo "found"
+       #grep `echo $input | cut -f1 -d: | cut -c 3-10` dis.txt | grep -v call | grep -v entry -m 1
+       grep `echo $input | cut -f1 -d: | cut -c 3-10` dis.txt | grep '>:'
+       if [ $? -eq 0 ];
+       then
+               let indent++
+       fi
+
+       grep  `echo $input | cut -f1 -d: | cut -c 3-10` dis.txt | grep 'retw'
+       if [ $? -eq 0 ];
+       then
+               let indent--
+       fi
+
 done < entry.txt
crazoes commented 4 years ago

@lgirdwood There are couple of things that we tried to solve the boot problem. I would like to list them here for your information

  1. Increasing the boot timeout value under sound/soc/sof/loader.c sdev->boot_timeout = 10000; We thought maybe dsp doesn't get enough of booting time so we tried to increase this value but nothing changed with it.

  2. Printing the IRQ number ( byt_acpi_probe function inside sound/soc/sof/intel/byt.c file) dev_info(sdev->dev, "using IRQ %d %d\n", sdev->ipc_irq, desc->irqindex_host_ipc); Following is the image of dmesg output for it. image

Also, please see the below image for cat /proc/interrupts output.

image

We don't even get any irq: nobody cared message as well.

  1. Initially IRQ 11 which has AudioDSP was being shared with usb so we changed the x86-host.sh script to remove usb parameter but that also changed nothing.

@dbaluta, Please add if I am forgetting something to mention here.

I'll send you the output that I get using -t option with xtensa-host.sh soon.

dbaluta commented 4 years ago

@lgirdwood @crazoes I don't think there is any problem with Qemu implementation. On my side, I successfully booted the DSP but then got a timeout when loading the topology.

Lets try the debug script @lgirdwood mentioned.

lgirdwood commented 4 years ago

@crazoes I would expect to see the IRQ count increment, the mapping looks correct, but it may be that qemu APIs have been updated wrt IRQs on x86, maybe we are missing something. Can you check the other PCI qemu drivers for IRQ logic. @dbaluta the SHM memory dumps will also help here, you could divert all trace to the "trace error" region (no DMA) and then use sof-logger to read the SHM file, but I agree emulator is far slower so may timeout.

crazoes commented 4 years ago

@dbaluta I tried the script given by Liam.

dump.txt :- https://gist.github.com/crazoes/d6115a0a8f655d0f2f64826f3b4138a1 entry.txt :- https://gist.github.com/crazoes/ada9c73237058cf9e97e380acfc7436d

I don't really understand much so will need your help with it.

dbaluta commented 4 years ago

@crazoes dump.txt looks OK. It shows the firmware idle waiting for an interrupt. Lets debug this together in our weekly meeting.

ymdatta commented 3 years ago

Hi @dbaluta and @crazoes : Was this issue solved and the process documented somewhere? I had the same problem a few months ago, but didn't went into further details as my project didn't need this.

lgirdwood commented 3 years ago

Fwiw @lyakh will be updating the qemu docs here as he is debugging some Zephyr bringup.

crazoes commented 3 years ago

@ymdatta we were not really able to solve this problem. We suspected some of the qemu APIs might have changed over these years and hence communication was not happening correctly.