snabblab / snabblab-nixos

NixOS configuration for the Snabb Lab

Transient failures in dpdk benchmark #52

Open domenkozar opened 8 years ago

domenkozar commented 8 years ago

As observed in matrix-packet-ABC, the dpdk/l2fwd benchmark sometimes times out (traffic flows but then suddenly stops).

Using the branch that makes qemu log all dmesg and process output, I've managed to reproduce it.

No idea what goes wrong so far. cc @eugeneia

domenkozar commented 8 years ago

The main difference I see is in the EAL startup output:

successful run

[   12.410155] dpdk-start[586]: EAL: Detected 1 lcore(s)
[   12.411343] dpdk-start[586]: EAL: VFIO modules not all loaded, skip VFIO support...
[   12.411958] dpdk-start[586]: EAL: Setting up physically contiguous memory...
...
[   12.949353] dpdk-start[586]: EAL: Master lcore 0 is ready (tid=36b568c0;cpuset=[0])
[   12.949954] l2fwd[694]: EAL: TSC frequency is ~3499994 KHz

failed run

[   11.416015] dpdk-start[586]: EAL:   unsupported IOMMU type!
[   11.416565] dpdk-start[586]: EAL: VFIO support could not be initialized
[   11.417133] dpdk-start[586]: EAL: Setting up memory...
...
[   11.953631] dpdk-start[586]: EAL: Master core 0 is ready (tid=28d70840)
[   11.954151] dpdk-start[586]: EAL: Error - exiting with code: 1
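
To compare many runs without reading each log by hand, the two excerpts above can be told apart mechanically. The following is a minimal sketch, not part of the benchmark tooling: the function names are hypothetical, and treating these three messages as failure signatures is an assumption drawn only from the failed-run excerpt. Note that the successful run also skips VFIO ("VFIO modules not all loaded"), so a missing VFIO by itself is not counted as a failure.

```python
import re

# Messages copied from the failed-run excerpt above. Using them as failure
# signatures is an assumption, not a confirmed diagnosis.
FAILURE_MARKERS = (
    "unsupported IOMMU type",
    "VFIO support could not be initialized",
    "Error - exiting with code",
)

# Matches the "[   11.416015] unit[pid]: message" prefix seen in both excerpts.
LINE_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<unit>[^\[\s]+)\[(?P<pid>\d+)\]:\s*(?P<msg>.*)$"
)


def find_failure_lines(log_text):
    """Return (timestamp, message) pairs for lines containing a failure marker."""
    hits = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip elided ("...") or otherwise unstructured lines
        message = match.group("msg").strip()
        if any(marker in message for marker in FAILURE_MARKERS):
            hits.append((float(match.group("ts")), message))
    return hits


def classify_run(log_text):
    """Label a captured log as "failed" or "ok" based on the markers above."""
    return "failed" if find_failure_lines(log_text) else "ok"
```

Feeding each captured qemu log through `classify_run` and counting the "failed" results would give a rough estimate of how often this particular failure occurs.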

domenkozar commented 8 years ago

cc @lukego for possible ideas :)

eugeneia commented 8 years ago

IOMMU strikes again! Cc @vanfstd

domenkozar commented 8 years ago

This happens around 10-20% of the time when the benchmark fails: https://github.com/snabbco/snabb/issues/960#issuecomment-232027967