medvedv / purifier

BSD 3-Clause "New" or "Revised" License

Kernel errors after run #11

Open debian89 opened 7 years ago

debian89 commented 7 years ago

Hello,

When I try to run the purifier process, it crashes with the following kernel messages:

Message from syslogd@dx602-s16 at Apr 5 05:12:29 ... D: bnxt_rte_pmd_init() called for (null)

Message from syslogd@dx602-s16 at Apr 5 05:12:29 ... L: PCI device 0000:01:00.0 on NUMA socket -1

Message from syslogd@dx602-s16 at Apr 5 05:12:29 ... L: probe driver: 8086:1572 rte_i40e_pmd

Message from syslogd@dx602-s16 at Apr 5 05:12:29 ... D: eth_i40e_dev_init(): FW 4.33 API 1.2 NVM 04.04.01 eetrack 80001869

Message from syslogd@dx602-s16 at Apr 5 05:12:29 ... L: PCI device 0000:01:00.1 on NUMA socket -1

Message from syslogd@dx602-s16 at Apr 5 05:12:29 ... L: probe driver: 8086:1572 rte_i40e_pmd

Message from syslogd@dx602-s16 at Apr 5 05:12:29 ... L: PCI device 0000:01:00.2 on NUMA socket -1

Message from syslogd@dx602-s16 at Apr 5 05:12:29 ... L: probe driver: 8086:1572 rte_i40e_pmd

Message from syslogd@dx602-s16 at Apr 5 05:12:29 ... D: eth_i40e_dev_init(): FW 4.33 API 1.2 NVM 04.04.01 eetrack 80001869

Message from syslogd@dx602-s16 at Apr 5 05:12:29 ... L: PCI device 0000:01:00.3 on NUMA socket -1

Message from syslogd@dx602-s16 at Apr 5 05:12:29 ... L: probe driver: 8086:1572 rte_i40e_pmd

Kernel version is: 4.4.55

medvedv commented 7 years ago

Hi,

Please show the contents of /var/log/purifier.log.

debian89 commented 7 years ago

Hello,

Here is the output:

EAL: Error - exiting with code: 1 Cause: TCP_hash_create on core 2 failed

EAL: Error - exiting with code: 1 Cause: TCP_hash_create on core 2 failed

EAL: Error - exiting with code: 1 Cause: TCP_hash_create on core 2 failed

EAL: Error - exiting with code: 1 Cause: TCP_hash_create on core 2 failed

EAL: Error - exiting with code: 1 Cause: TCP_hash_create on core 2 failed

EAL: Error - exiting with code: 1 Cause: TCP_hash_create on core 2 failed

EAL: Error - exiting with code: 1 Cause: Invalid lcores count

EAL: Error - exiting with code: 1 Cause: Invalid lcores count

EAL: Error - exiting with code: 1 Cause: TCP_hash_create on core 2 failed

medvedv commented 7 years ago

Hi,

If the cause is "TCP_hash_create on core 2 failed", you do not have enough memory. For every worker lcore (total cores - 2), at least 1.5 GB of memory is recommended. Alternatively, you can decrease the size of the hash table for states:

#define PRF_TCP_CONN_HASH_SIZE (1 << 22) <-- defined in prf_stateful.h

If the cause is "Invalid lcores count", you do not have enough lcores. Purifier needs at least 3 cores (2 + 1 worker).
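To make the two rules above concrete, here is a minimal sketch in C. It assumes the lcores are selected with a hexadecimal core mask, as with the DPDK EAL -c option (for example, 0x7 enables three lcores). The function names are hypothetical and are not part of purifier; the 1.5 GB figure is the recommendation given above, not a measured value.

```c
#include <stdint.h>

/* Count the lcores enabled by a hexadecimal core mask (one bit per lcore). */
static unsigned count_lcores(uint64_t coremask)
{
    unsigned n = 0;
    while (coremask) {
        coremask &= coremask - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/*
 * Worker lcores = total lcores - 2, following the rule stated in this thread.
 * Returns -1 when fewer than 3 lcores are enabled ("Invalid lcores count").
 */
static int worker_lcores(uint64_t coremask)
{
    unsigned total = count_lcores(coremask);
    return total < 3 ? -1 : (int)(total - 2);
}

/*
 * Recommended memory in MB, assuming 1.5 GB (1536 MB) per worker lcore.
 * Returns -1 when the core mask does not enable enough lcores.
 */
static long recommended_mb(uint64_t coremask)
{
    int workers = worker_lcores(coremask);
    return workers < 0 ? -1 : (long)workers * 1536;
}
```

With a mask of 0x7 this gives 3 lcores, 1 worker, and a recommendation of 1536 MB; a mask of 0x3 has only 2 lcores and is rejected.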

debian89 commented 7 years ago

Hello, but the server has 8 cores and 8 GB of RAM?

medvedv commented 7 years ago

How do you run purifier? Please also show the output of

cat /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages

on a Xeon E3, or

cat /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages

on an E5.

debian89 commented 7 years ago

Hello,

Sorry for the late reply. nr_hugepages is 1024. I start the process with these parameters: ./purifier -c 0x7 -n 4

medvedv commented 7 years ago

Hi,

That looks OK. For -c 0x7 you need about 1.5 to 2 GB of memory, but according to the log you don't have enough (Cause: TCP_hash_create on core 2 failed). Try this change in prf_stateful.h:

-#define PRF_TCP_CONN_HASH_SIZE (1 << 22)
+#define PRF_TCP_CONN_HASH_SIZE (1 << 19)
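For reference, the arithmetic behind the numbers in this thread can be sketched as below. This assumes the reported nr_hugepages value of 1024 refers to 2048 kB pages, which the thread does not state explicitly. The size of one hash entry is not given here either, so the sketch compares slot counts only. The function names are hypothetical.

```c
#include <stdint.h>

/* Total hugepage pool in MB, given the page count and the page size in kB. */
static uint64_t hugepage_pool_mb(uint64_t nr_hugepages, uint64_t page_kb)
{
    return nr_hugepages * page_kb / 1024;
}

/* Number of slots in a table sized as (1 << shift). */
static uint64_t hash_slots(unsigned shift)
{
    return (uint64_t)1 << shift;
}
```

Under that assumption, 1024 pages of 2048 kB form a 2048 MB pool, which is at the upper end of the 1.5 to 2 GB estimate. Going from (1 << 22) to (1 << 19) reduces the table from 4194304 slots to 524288, a factor of 8.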