polarfire-soc / polarfire-soc-documentation

PolarFire SoC Documentation

Expected read and write performance of NVMe? #76

Closed diarmuidcwc closed 2 years ago

diarmuidcwc commented 3 years ago

Hi

I just programmed the latest 2021.04 FPGA image and Yocto build. I am evaluating the PolarFire SoC for a product that writes Ethernet packets to an NVMe device.

My initial basic benchmarking attempts using dd (dd if=/dev/zero) are showing quite slow write performance, on the order of 23 MB/s.

Do you have any internal benchmarks for NVMe write performance? At the moment I'm not sure whether the limit is the CPU or the PCIe interface; I suspect the former, even though dd should be trivial. I will be investigating this in the coming days, but if you have any internal documentation or benchmarks, could you share them?
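(For anyone reproducing this: a minimal way to tell whether the CPU or the storage path is the limit is to run the dd write test in the background and watch CPU usage while it runs. The /mnt/nvme mount point below is an assumption; adjust it to wherever the NVMe filesystem is mounted.)

# Assumed mount point for the NVMe filesystem.
cd /mnt/nvme

# Sequential write test; conv=fdatasync forces the data to the drive
# before dd reports a throughput figure.
dd if=/dev/zero of=test_file bs=10M count=100 conv=fdatasync &

# Sample CPU and I/O-wait once per second while dd runs. A core pinned
# near 100% in "us"/"sy" with low "wa" points at the CPU; high "wa"
# points at the storage path.
vmstat 1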

Regards Diarmuid

hughbreslin commented 3 years ago

Hey @diarmuidcwc, we are investigating performance issues with PCIe at the moment, as there have been a few reports of throughput problems. Have you seen this issue? I'm trying to find out what information I can share with you in terms of benchmarks; I'll get back to you when I have something useful.

Cheers, Hugh

diarmuidcwc commented 3 years ago

Hugh

I did just some rudimentary tests using dd and fio. Mostly dd, as fio is very slow on the board.

root@icicle-kit-es:/mnt/nvme# dd if=/dev/zero of=test_file bs=10M count=100 conv=fdatasync
100+0 records in
100+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 35.5783 s, 29.5 MB/s

I am getting a lot of timeout errors like this:

[ 554.105809] nvme nvme0: I/O 228 QID 2 timeout, completion polled

My NVMe has some activity LEDs, and when doing these tests the LEDs only flash briefly, leading me to guess that the issue is not on the NVMe side. It suggests that accesses are brief with long delays between them, which would be consistent with the timeouts.
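(For reference, a minimal fio sequential-write job roughly equivalent to the dd run above might look like the following. The mount point and file name are assumptions, and the psync ioengine is used so the test does not depend on libaio being present in the Yocto image.)

# Sequential 1 MiB writes, 512 MiB total, direct I/O, single job.
fio --name=seqwrite \
    --filename=/mnt/nvme/fio_test \
    --rw=write \
    --bs=1M \
    --size=512M \
    --ioengine=psync \
    --direct=1 \
    --numjobs=1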

mcnamarad1971 commented 3 years ago

Yeah, the work we're doing in-house suggests we're CPU-bound, in general. We're digging into exactly why that is at the moment.

For example, dd only runs on one CPU, and each dd process maxes out, at the moment, around 29-30 MB/s while the PCIe/NVMe system is still mostly idle. To see what I mean, you should be able to log in 4 times (for example, 4 ssh sessions), run the dd command above in each session, and get around the same performance (say ~20 MB/s) on each job.
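(A single-shell equivalent of the four-session experiment described above, assuming the drive is mounted at /mnt/nvme:)

# Launch four dd writers in parallel; if the bottleneck is per-core CPU
# rather than the PCIe/NVMe path, the aggregate throughput should be
# several times that of a single dd run.
for i in 1 2 3 4; do
    dd if=/dev/zero of=/mnt/nvme/test_file_$i bs=10M count=50 conv=fdatasync &
done
wait    # each dd prints its own throughput as it finishes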

diarmuid commented 3 years ago

Thanks guys. Good news for me that others are seeing this!

hughbreslin commented 3 years ago

Hey @diarmuid, our latest reference design and Linux releases contain changes for the PCIe which have shown performance improvements for NVMe drives. Do you want to give it a try and see if it also improves things for you?

diarmuidcwc commented 3 years ago

Thanks. I'll check it out

diarmuidcwc commented 3 years ago

Not so sure about these improvements. Maybe it's my particular setup. I took the 2021.08 FPGA image + wic. I'll try with a different NVMe device.

hughbreslin commented 3 years ago

Hi @diarmuid any luck with the other NVMe?

KodrAus commented 2 years ago

Anecdotally, things seem more stable to me with the 2022.02 release. I used to see pretty frequent corruption when interacting with NVMe (I've got a WD Blue 250GB SSD attached) but haven't seen any lately.