Closed danmanners closed 2 years ago
ubuntu@tpiv2-node-1:~$ sudo lspci -vvvv
01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd Device a809 (prog-if 02 [NVM Express])
Subsystem: Samsung Electronics Co Ltd Device a801
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 42
Region 0: Memory at 600000000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable+ Count=1/32 Maskable- 64bit+
Address: 00000000fffffffc Data: 6540
Capabilities: [70] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s (downgraded), Width x1 (downgraded)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, NROPrPrP-, LTR+
10BitTagComp-, 10BitTagReq-, OBFF Not Supported, ExtFmt-, EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS-, TPHComp-, ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled
AtomicOpsCtl: ReqEn-
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [b0] MSI-X: Enable- Count=13 Masked+
Vector table: BAR=0 offset=00003000
PBA: BAR=0 offset=00002000
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
MultHdrRecCap+ MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
Capabilities: [148 v1] Device Serial Number 00-00-00-00-00-00-00-00
Capabilities: [158 v1] Power Budgeting <?>
Capabilities: [168 v1] Secondary PCI Express
LnkCtl3: LnkEquIntrruptEn-, PerformEqu-
LaneErrStat: 0
Capabilities: [188 v1] Latency Tolerance Reporting
Max snoop latency: 0ns
Max no snoop latency: 0ns
Capabilities: [190 v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
PortCommonModeRestoreTime=10us PortTPowerOnTime=10us
L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
T_CommonMode=0us LTR1.2_Threshold=0ns
L1SubCtl2: T_PwrOn=10us
Kernel driver in use: nvme
Kernel modules: nvme
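The line worth noticing in that dump is `LnkSta`: the drive advertises 8GT/s x4 in `LnkCap` but negotiated 5GT/s x1. As a rough sketch (not part of the original report), the two lines can be pulled out of saved `lspci -vv` text like this. The `link_summary` helper and the regex are hypothetical names made up for illustration, and the sample string is copied from the output above.

```python
import re

# Matches the speed/width pair on "LnkCap:" and "LnkSta:" lines of `lspci -vv` text.
# The colon right after the name keeps "LnkSta2:" and "LnkCtl2:" lines from matching.
LINK_RE = re.compile(r"(LnkCap|LnkSta):.*?Speed ([\d.]+)GT/s.*?Width x(\d+)")


def link_summary(lspci_text):
    """Return {'LnkCap': (GT/s, lanes), 'LnkSta': (GT/s, lanes)} for the first device found."""
    found = {}
    for line in lspci_text.splitlines():
        m = LINK_RE.search(line)
        if m and m.group(1) not in found:
            found[m.group(1)] = (float(m.group(2)), int(m.group(3)))
    return found


# Two lines copied from the lspci output above.
sample = """\
LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
LnkSta: Speed 5GT/s (downgraded), Width x1 (downgraded)
"""

summary = link_summary(sample)
cap, sta = summary["LnkCap"], summary["LnkSta"]
print(f"capable: {cap[0]} GT/s x{cap[1]}, negotiated: {sta[0]} GT/s x{sta[1]}")
```

The downgrade is expected on this hardware, since the Compute Module 4 exposes a single PCIe Gen 2 lane.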
Running Jeff Geerling's Disk Benchmark:
ubuntu@tpiv2-node-1:~$ wget -q https://raw.githubusercontent.com/geerlingguy/raspberry-pi-dramble/master/setup/benchmarks/disk-benchmark.sh
ubuntu@tpiv2-node-1:~$ chmod +x disk-benchmark.sh
ubuntu@tpiv2-node-1:~$ sudo DEVICE_UNDER_TEST=/dev/nvme0n1p1 DEVICE_MOUNT_PATH=/mnt/nvme ./disk-benchmark.sh
Results:
| Benchmark | Result |
| --- | --- |
| fio 1M sequential read | 416 MB/s |
| iozone 1M random read | 210.97 MB/s |
| iozone 1M random write | 188.70 MB/s |
| iozone 4K random read | 14.77 MB/s |
| iozone 4K random write | 25.38 MB/s |
```bash
Running fio sequential read test...
fio-rand-read-sequential: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=64
...
fio-3.16
Starting 4 processes
Jobs: 4 (f=4): [R(4)][36.4%][r=395MiB/s][r=394 IOPS][eta 00m:07s]
Jobs: 4 (f=4): [R(4)][50.0%][r=395MiB/s][r=395 IOPS][eta 00m:05s]
Jobs: 4 (f=4): [R(4)][63.6%][r=402MiB/s][r=401 IOPS][eta 00m:04s]
Jobs: 4 (f=4): [R(4)][80.0%][r=396MiB/s][r=396 IOPS][eta 00m:02s]
Jobs: 4 (f=4): [R(4)][100.0%][r=399MiB/s][r=399 IOPS][eta 00m:00s]
fio-rand-read-sequential: (groupid=0, jobs=4): err= 0: pid=8843: Sat Jan 8 17:06:40 2022
  read: IOPS=397, BW=397MiB/s (416MB/s)(4039MiB/10172msec)
    slat (usec): min=148, max=50238, avg=9887.74, stdev=11251.53
    clat (msec): min=150, max=831, avg=621.12, stdev=74.51
     lat (msec): min=170, max=843, avg=631.01, stdev=74.80
    clat percentiles (msec):
     | 1.00th=[ 262], 5.00th=[ 542], 10.00th=[ 567], 20.00th=[ 592],
     | 30.00th=[ 600], 40.00th=[ 617], 50.00th=[ 625], 60.00th=[ 642],
     | 70.00th=[ 651], 80.00th=[ 667], 90.00th=[ 693], 95.00th=[ 709],
     | 99.00th=[ 743], 99.50th=[ 760], 99.90th=[ 810], 99.95th=[ 835],
     | 99.99th=[ 835]
   bw ( KiB/s): min=286720, max=462619, per=98.94%, avg=402280.67, stdev=11273.37, samples=77
   iops : min= 280, max= 451, avg=392.52, stdev=11.00, samples=77
  lat (msec) : 250=0.94%, 500=2.87%, 750=95.54%, 1000=0.64%
  cpu : usr=0.09%, sys=3.88%, ctx=4939, majf=0, minf=65623
  IO depths : 1=0.1%, 2=0.2%, 4=0.4%, 8=0.8%, 16=1.6%, 32=3.2%, >=64=93.8%
     submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=4039,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=397MiB/s (416MB/s), 397MiB/s-397MiB/s (416MB/s-416MB/s), io=4039MiB (4235MB), run=10172-10172msec

Disk stats (read/write):
  nvme0n1: ios=16044/4, merge=0/0, ticks=2527898/31, in_queue=2495852, util=99.01%

Running iozone 1024K random read and write tests...
	Iozone: Performance Test of File I/O
	        Version $Revision: 3.492 $
		Compiled for 64 bit mode.
		Build: linux-arm

	Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
	             Al Slater, Scott Rhine, Mike Wisner, Ken Goss
	             Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
	             Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
	             Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
	             Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
	             Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer,
	             Vangel Bojaxhi, Ben England, Vikentsi Lapa,
	             Alexey Skidanov, Sudhir Kumar.

	Run began: Sat Jan 8 17:06:40 2022

	Include fsync in write timing
	O_DIRECT feature enabled
	Auto Mode
	File size set to 102400 kB
	Record Size 1024 kB
	Command line used: ./iozone -e -I -a -s 100M -r 1024k -i 0 -i 2 -f /mnt/nvme/iozone
	Output is in kBytes/sec
	Time Resolution = 0.000001 seconds.
	Processor cache size set to 1024 kBytes.
	Processor cache line size set to 32 bytes.
	File stride size set to 17 * record size.
                                                    random    random      bkwd    record    stride
        kB  reclen     write   rewrite      read    reread      read     write      read   rewrite      read    fwrite  frewrite     fread   freread
    102400    1024    217107    219792                        216043    193229

iozone test complete.

Running iozone 4K random read and write tests...
	Iozone: Performance Test of File I/O
	        Version $Revision: 3.492 $
		Compiled for 64 bit mode.
		Build: linux-arm

	Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
	             Al Slater, Scott Rhine, Mike Wisner, Ken Goss
	             Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
	             Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
	             Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
	             Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
	             Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer,
	             Vangel Bojaxhi, Ben England, Vikentsi Lapa,
	             Alexey Skidanov, Sudhir Kumar.

	Run began: Sat Jan 8 17:07:11 2022

	Include fsync in write timing
	O_DIRECT feature enabled
	Auto Mode
	File size set to 102400 kB
	Record Size 4 kB
	Command line used: ./iozone -e -I -a -s 100M -r 4k -i 0 -i 2 -f /mnt/nvme/iozone
	Output is in kBytes/sec
	Time Resolution = 0.000001 seconds.
	Processor cache size set to 1024 kBytes.
	Processor cache line size set to 32 bytes.
	File stride size set to 17 * record size.
                                                    random    random      bkwd    record    stride
        kB  reclen     write   rewrite      read    reread      read     write      read   rewrite      read    fwrite  frewrite     fread   freread
    102400       4     20223     29942                         15131     25991

iozone test complete.

Disk benchmark complete!
```
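For anyone cross-checking the results table against the raw log: iozone reports kBytes/sec, and with `-i 0 -i 2` the four numbers on each data row are write, rewrite, random read, and random write. Here is a small sketch of the conversion. Dividing by 1024 and truncating (rather than rounding) to two decimals reproduces the table exactly, but the truncation is an inference from the numbers, not something stated in the thread. The helper name is made up for illustration.

```python
def kb_to_mb_truncated(kb_per_s):
    """Convert iozone kBytes/sec to MB/s (divide by 1024), truncated to two decimals."""
    # Integer math first, so the truncation is exact before the final division.
    return (kb_per_s * 100 // 1024) / 100


# Random read / random write columns copied from the two iozone runs above.
iozone_kb = {
    "iozone 1M random read": 216043,
    "iozone 1M random write": 193229,
    "iozone 4K random read": 15131,
    "iozone 4K random write": 25991,
}

results = {name: kb_to_mb_truncated(kb) for name, kb in iozone_kb.items()}
for name, mb in results.items():
    print(f"{name}: {mb:.2f} MB/s")
```

The fio figure needs no conversion: 416 MB/s is taken directly from fio's own `BW=397MiB/s (416MB/s)` line.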
This was all validated on Ubuntu 20.04.3 LTS, running on a Compute Module 4 with 4GiB of memory.
ubuntu@tpiv2-node-1:~$ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.3 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.3 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
ubuntu@tpiv2-node-1:~$ uname -a
Linux tpiv2-node-1 5.4.0-1048-raspi #53-Ubuntu SMP PREEMPT Wed Dec 8 13:06:23 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux
Interesting seeing the 1M block size random IO being quite a bit slower than other high-end NVMe drives like Kioxia's XG6.
Also, thanks so much for submitting the info for this drive and card. I know a few people have asked about the Samsung 980 (I think I've only ever tried the 970), so it's good to know it works at least!
Closing as the pages are up in the database. Feel free to post any more info to the issue though!
> Interesting seeing the 1M block size random IO being quite a bit slower than other high-end NVMe drives like Kioxia's XG6.
Definitely curious about what's going on there. I may try the same thing with 32-bit and 64-bit RasPi OS and see if it's any different/better.
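One piece of context for these numbers is the link ceiling. Below is a back-of-the-envelope sketch, assuming the standard line encodings (8b/10b for 2.5 and 5GT/s, 128b/130b for 8GT/s) and ignoring packet and protocol overhead. It only bounds the sequential figure; it does not explain why one drive is slower than another at 1M random IO on the same link. The function name is hypothetical.

```python
# Encoding efficiency per line rate in GT/s: 8b/10b for Gen 1/2, 128b/130b for Gen 3.
ENCODING = {2.5: 8 / 10, 5.0: 8 / 10, 8.0: 128 / 130}


def link_ceiling_mb_s(gt_per_s, lanes):
    """Raw link bandwidth in MB/s (10^6 bytes/s), before packet/protocol overhead."""
    bits_per_s = gt_per_s * 1e9 * ENCODING[gt_per_s] * lanes
    return bits_per_s / 8 / 1e6


negotiated = link_ceiling_mb_s(5.0, 1)  # LnkSta from the lspci output: 5GT/s x1
capable = link_ceiling_mb_s(8.0, 4)     # LnkCap from the lspci output: 8GT/s x4
measured = 416.0                        # fio 1M sequential read, MB/s

print(f"negotiated link ceiling: {negotiated:.0f} MB/s")
print(f"drive-capable ceiling:   {capable:.0f} MB/s")
print(f"fio sequential read uses {measured / negotiated:.0%} of the negotiated link")
```

By this estimate the sequential read is already fairly close to what a Gen 2 x1 link can carry, so any difference between the OS builds is more likely to show up in the random IO results.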
I'm verifying the functionality of the Turing Pi v2 mPCIe slots for nodes 1 and 2 on the pre-production unit. To connect an NVMe drive, I needed an mPCIe to M.2 adapter board.
Samsung 980 M.2 NVMe SSD
Amazon Link: Purchase here
Sintech mPCIe to M.2 Adapter (with 20cm adapter)
Amazon Link: Purchase here