intel / intel-ipsec-mb

Intel(R) Multi-Buffer Crypto for IPSec
BSD 3-Clause "New" or "Revised" License

AESNI MB PMD in DPDK 20.11 has an issue supporting multiple processes, leading to a crash in intel-ipsec-mb #99

Closed riveridea closed 2 years ago

riveridea commented 2 years ago

https://github.com/intel/intel-ipsec-mb/blob/a6008dbddf15a397ecc2cafb7f0a00d69ebf8037/lib/avx512/mb_mgr_avx512.c#L1870

I find my code always crashes when accessing sha1_one_block, whose value is assigned here. Since sha1_one_block_avx512 is the function name, I am wondering whether we should change this line to state->sha1_one_block = &sha1_one_block_avx512;

I am verifying whether this is an issue. What confuses me is that the testapp does not crash. I am not sure whether it is platform dependent.
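On the `&` question above: in C, a function name used as a value already converts to a pointer to that function, so assigning with or without `&` should store the same address. The sketch below illustrates this with hypothetical stand-in names (`demo_mgr`, `demo_sha1_one_block`, `demo_same_pointer`); these are not the library's real declarations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not the intel-ipsec-mb types. */
typedef void (*demo_one_block_fn)(const void *data, void *digest);

struct demo_mgr {
    demo_one_block_fn sha1_one_block;
};

/* Placeholder with the same shape as a one-block hash function. */
void demo_sha1_one_block(const void *data, void *digest)
{
    (void)data;
    (void)digest;
}

/* Returns 1 when both spellings of the assignment store the same pointer. */
int demo_same_pointer(void)
{
    struct demo_mgr a;
    struct demo_mgr b;

    a.sha1_one_block = demo_sha1_one_block;   /* function name decays to a pointer */
    b.sha1_one_block = &demo_sha1_one_block;  /* explicit address-of */

    assert(a.sha1_one_block != NULL);
    return a.sha1_one_block == b.sha1_one_block;
}
```

If that holds, adding `&` would not change behavior, which points toward the manager pointer itself being invalid rather than the assignment.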

riveridea commented 2 years ago

This could be a DPDK PMD driver issue. I am looking into it.

tkanteck commented 2 years ago

Perhaps there is some issue with initialization

riveridea commented 2 years ago

I will close this issue. It seems to be a device management issue in DPDK: the mb_mgr address is contaminated.

riveridea commented 2 years ago

close it.

riveridea commented 2 years ago

I think my issue is related to this thread: https://review.spdk.io/gerrit/c/spdk/dpdk/+/1056

riveridea commented 2 years ago

The AESNI MB PMD is created successfully in one of my processes, but I found that the mb_mgr address is surprisingly contaminated after vdev_action runs in that process.

riveridea commented 2 years ago

I am currently using DPDK 20.11, and I am wondering whether there is a bug in rte_aesni_mb_pmd.c. In cryptodev_aesni_mb_create(), when allocating the mb_mgr, it does not check whether this is the primary process or a secondary process. So if a secondary process launches the scan/probe and creates the PMD, internals->mb_mgr can be overwritten with a newly allocated mb_mgr. I changed the code to allocate the mb_mgr only in the primary process, and the crash is gone.

I believe this issue should have been resolved in later DPDK versions, as the mb_mgr is no longer in dev->data->dev_private.
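The workaround described above can be sketched roughly as follows. This is a self-contained model, not the actual DPDK patch: `demo_proc_type`, `demo_private`, `demo_mb_mgr`, and both `demo_create_*` functions are hypothetical names. In real DPDK code the process check would presumably use `rte_eal_process_type()` compared against `RTE_PROC_PRIMARY`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the EAL process type. */
enum demo_proc_type { DEMO_PROC_PRIMARY, DEMO_PROC_SECONDARY };

/* Hypothetical stand-in for the multi-buffer manager. */
struct demo_mb_mgr {
    int initialized;
};

/* Models dev->data->dev_private, which every process sees in shared memory. */
struct demo_private {
    struct demo_mb_mgr *mb_mgr;
};

/*
 * Unguarded create: whichever process probes the device overwrites the
 * shared pointer with an address that is only valid in its own heap.
 * (The previous manager is deliberately not freed here, to mirror the
 * overwrite described in the thread.)
 */
int demo_create_unguarded(struct demo_private *priv)
{
    struct demo_mb_mgr *mgr;

    assert(priv != NULL);
    mgr = calloc(1, sizeof(*mgr));
    if (mgr == NULL)
        return -1;
    mgr->initialized = 1;
    priv->mb_mgr = mgr;
    return 0;
}

/* Guarded create: only the primary process allocates and stores the manager. */
int demo_create_guarded(struct demo_private *priv, enum demo_proc_type type)
{
    assert(priv != NULL);
    if (type != DEMO_PROC_PRIMARY)
        return 0; /* secondary leaves the shared pointer untouched */
    return demo_create_unguarded(priv);
}
```

With the guard, a probe from a secondary process leaves `mb_mgr` as the primary set it; without the guard, the pointer is replaced. Note that this only protects the primary: a secondary that needs to run crypto itself would still need its own process-local manager.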

tkanteck commented 2 years ago

If I remember correctly, the primary/secondary DPDK model is intended for failover scenarios only. For multi-core processing, the crypto scheduler PMD should be used. @pablodelara can you point to some example code covering failover?

riveridea commented 2 years ago

No matter what the primary/secondary processes are used for, the point is that I only explicitly init the aesni_mb device in my primary process. The secondary process implicitly triggers the device probe through the scan/probe procedure, and the aesni_mb PMD in DPDK 20.11 has the bug I mentioned above. The primary process then crashes on a memory access as soon as packet processing tries to use that mb_mgr.

riveridea commented 2 years ago

Updating the title and closing this, as it is a DPDK issue.