Open bguijt opened 10 months ago
This is weird. There is definitely NVMe support and it's working for me:
NODE DEV MODEL SERIAL TYPE UUID WWID MODALIAS NAME SIZE BUS_PATH SUBSYSTEM READ_ONLY SYSTEM_DISK
192.168.147.14 /dev/mmcblk0 - cxxxxxxxxx SD - - - BJTD4R 31 GB /platform/fe2e0000.mmc/mmc_host/mmc0/mmc0:0001/ /sys/class/block
192.168.147.14 /dev/mmcblk0boot0 - - SD - - - - 4.2 MB /platform/fe2e0000.mmc/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0boot0 /sys/class/block *
192.168.147.14 /dev/mmcblk0boot1 - - SD - - - - 4.2 MB /platform/fe2e0000.mmc/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0boot1 /sys/class/block *
192.168.147.14 /dev/nvme0n1 WD_BLACK SN770 500GB xxxxxxxxxxxx NVME e8238fa6-bf53-0001-001b-448b4e6e021d eui.e8238fa6bf530001001b448b4e6e021d - - 500 GB /platform/a40000000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/nvme/nvme0/nvme0n1 /sys/class/block
can you give me the output of:
talosctl -n 192.168.1.111 read /proc/modules
talosctl -n 192.168.1.111 dmesg
(please upload this as a file)
Also, what kind of NVMe disk is this?
If this doesn't tell me anything I may ask you to downgrade a node to v1.6.0
$ talosctl -n 192.168.1.111 read /proc/modules
rockchip_cpufreq 16384 - - Live 0xffffb7d55a5ba000
rk808_regulator 49152 - - Live 0xffffb7d55a5a3000
rk8xx_spi 12288 - - Live 0xffffb7d55a5ca000
rk8xx_core 16384 - - Live 0xffffb7d55a5bf000
rk_crypto2 24576 - - Live 0xffffb7d55a5b1000
sm3_generic 12288 - - Live 0xffffb7d55a5a7000
rockchip_rng 12288 - - Live 0xffffb7d55a59d000
According to fdisk -l, the disk is a:
Disk /dev/nvme0n1: 931.51 GiB, 1000204886016 bytes, 244190646 sectors
Disk model: WD Blue SN570 1TB
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
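As a quick sanity check on the fdisk figures above, the reported capacity is simply the sector count multiplied by the logical sector size. A minimal Python sketch follows; the helper name `capacity` is only illustrative and not part of any tool used in this thread.

```python
def capacity(sectors: int, sector_size: int) -> tuple:
    """Return (total bytes, size in GiB) for a disk with the given geometry."""
    total_bytes = sectors * sector_size
    return total_bytes, total_bytes / 1024**3


# Values copied from the fdisk output above (WD Blue SN570 1TB, 4096-byte sectors).
total_bytes, gib = capacity(244190646, 4096)
print(total_bytes, round(gib, 2))
```

The byte count and GiB figure should agree with the "Disk /dev/nvme0n1" line, which confirms that the drive reports a 4096-byte logical sector size.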
Output of dmesg: dmesg.log
Ok, the rockchip phy comes up, but acts like there is nothing in the slot.
Two things you can try: adding a kernel parameter, or using Ubuntu with the 6.7 kernel. The first is a hunch; the second will tell us something.
1) After some searching, I have seen reports that the SN570 has problems when the kernel is not given the parameter pcie_aspm=off.
In order to use this parameter in Talos, you need to define it in the machine config AND run talosctl upgrade.
machine:
  install:
    extraKernelArgs:
      - pcie_aspm=off
2) The normal Ubuntu image is built with kernel 5.10; the differences are too big to compare, as it is a Rockchip kernel. The mainline kernel image differs less from this Talos kernel. The latest build is here: https://github.com/Joshua-Riek/ubuntu-rockchip/actions/runs/7700353474/artifacts/1203492516
Thanks @nberlee, I will check out and try your suggestions.
Hi @nberlee,
I tried adding extraKernelArgs to the machine config, but after both upgrading and re-flashing the image the kernel arg was not picked up. This is where I would expect it (using the 1.6.4 release):
Kernel command line: BOOT_IMAGE=/A/vmlinuz talos.platform=metal console=tty0 console=ttyS9,115200 console=ttyS2,115200 talos.board=turing_rk1 sysctl.kernel.kexec_load_disabled=1 talos.dashboard.disabled=1 cma=128MB init_on_alloc=1 slab_nomerge pti=on consoleblank=0 nvme_core.io_timeout=4294967295 printk.devkmsg=on ima_template=ima-ng ima_appraise=fix ima_hash=sha512
Unknown kernel command line parameters "BOOT_IMAGE=/A/vmlinuz pti=on", will be passed to user space.
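To check quickly whether an extra kernel argument made it onto the command line, the logged string can be tokenised and searched. This is a minimal Python sketch, not a Talos feature: the function name `has_kernel_arg` is made up for illustration, and `CMDLINE` is a shortened copy of the line logged above. On a live node the string could presumably be obtained with `talosctl -n <nodeip> read /proc/cmdline`.

```python
def has_kernel_arg(cmdline: str, name: str) -> bool:
    """True if `name` appears as a bare flag or as `name=value` in the command line."""
    for token in cmdline.split():
        key = token.split("=", 1)[0]
        if key == name:
            return True
    return False


# Shortened copy of the command line logged above.
CMDLINE = (
    "BOOT_IMAGE=/A/vmlinuz talos.platform=metal console=tty0 "
    "talos.board=turing_rk1 cma=128MB init_on_alloc=1 slab_nomerge pti=on "
    "nvme_core.io_timeout=4294967295"
)

print(has_kernel_arg(CMDLINE, "pcie_aspm"))             # the argument that was requested
print(has_kernel_arg(CMDLINE, "nvme_core.io_timeout"))  # a default that is present
```

Matching on the key before the first `=` avoids false positives from substrings inside other arguments.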
The Ubuntu image you linked to gives me a working NVMe drive. These are the modules loaded according to /proc/modules:
$ cat /proc/modules
binfmt_misc 24576 1 - Live 0x0000000000000000
nls_utf8 12288 1 - Live 0x0000000000000000
nls_cp936 139264 1 - Live 0x0000000000000000
crct10dif_ce 12288 1 - Live 0x0000000000000000
pwm_fan 20480 0 - Live 0x0000000000000000
hantro_vpu 262144 0 - Live 0x0000000000000000
rk_crypto2 32768 0 - Live 0x0000000000000000
sm3_generic 12288 1 rk_crypto2, Live 0x0000000000000000
v4l2_vp9 24576 1 hantro_vpu, Live 0x0000000000000000
crypto_engine 24576 1 rk_crypto2, Live 0x0000000000000000
nvmem_rockchip_otp 12288 0 - Live 0x0000000000000000
v4l2_h264 16384 1 hantro_vpu, Live 0x0000000000000000
sm3 20480 1 sm3_generic, Live 0x0000000000000000
v4l2_mem2mem 45056 1 hantro_vpu, Live 0x0000000000000000
videobuf2_dma_contig 24576 1 hantro_vpu, Live 0x0000000000000000
videobuf2_memops 16384 1 videobuf2_dma_contig, Live 0x0000000000000000
videobuf2_v4l2 32768 2 hantro_vpu,v4l2_mem2mem, Live 0x0000000000000000
videodev 311296 3 hantro_vpu,v4l2_mem2mem,videobuf2_v4l2, Live 0x0000000000000000
videobuf2_common 69632 5 hantro_vpu,v4l2_mem2mem,videobuf2_dma_contig,videobuf2_memops,videobuf2_v4l2, Live 0x0000000000000000
mc 81920 5 hantro_vpu,v4l2_mem2mem,videobuf2_v4l2,videodev,videobuf2_common, Live 0x0000000000000000
dm_multipath 40960 0 - Live 0x0000000000000000
dm_mod 159744 3 dm_multipath, Live 0x0000000000000000
scsi_dh_rdac 12288 0 - Live 0x0000000000000000
scsi_dh_emc 12288 0 - Live 0x0000000000000000
scsi_dh_alua 24576 0 - Live 0x0000000000000000
sch_fq_codel 16384 3 - Live 0x0000000000000000
ip_tables 32768 0 - Live 0x0000000000000000
x_tables 61440 1 ip_tables, Live 0x0000000000000000
autofs4 49152 2 - Live 0x0000000000000000
raid10 65536 0 - Live 0x0000000000000000
raid456 176128 0 - Live 0x0000000000000000
async_raid6_recov 20480 1 raid456, Live 0x0000000000000000
async_memcpy 16384 2 raid456,async_raid6_recov, Live 0x0000000000000000
async_pq 16384 2 raid456,async_raid6_recov, Live 0x0000000000000000
async_xor 16384 3 raid456,async_raid6_recov,async_pq, Live 0x0000000000000000
async_tx 16384 5 raid456,async_raid6_recov,async_memcpy,async_pq,async_xor, Live 0x0000000000000000
raid1 49152 0 - Live 0x0000000000000000
raid0 24576 0 - Live 0x0000000000000000
multipath 20480 0 - Live 0x0000000000000000
linear 16384 0 - Live 0x0000000000000000
rk808_regulator 53248 5 - Live 0x0000000000000000
phy_rockchip_snps_pcie3 16384 1 - Live 0x0000000000000000
rk8xx_spi 12288 0 - Live 0x0000000000000000
rk8xx_core 24576 1 rk8xx_spi, Live 0x0000000000000000
rockchip_rng 16384 0 - Live 0x0000000000000000
rockchipdrm 208896 0 - Live 0x0000000000000000
drm_dma_helper 24576 1 rockchipdrm, Live 0x0000000000000000
dw_hdmi 77824 1 rockchipdrm, Live 0x0000000000000000
dw_hdmi_qp 49152 1 rockchipdrm, Live 0x0000000000000000
dw_mipi_dsi 20480 1 rockchipdrm, Live 0x0000000000000000
analogix_dp 45056 1 rockchipdrm, Live 0x0000000000000000
drm_display_helper 188416 4 rockchipdrm,dw_hdmi,dw_hdmi_qp,analogix_dp, Live 0x0000000000000000
drm_kms_helper 229376 7 rockchipdrm,drm_dma_helper,dw_hdmi,dw_hdmi_qp,dw_mipi_dsi,analogix_dp,drm_display_helper, Live 0x0000000000000000
cec 77824 3 dw_hdmi,dw_hdmi_qp,drm_display_helper, Live 0x0000000000000000
drm 692224 8 rockchipdrm,drm_dma_helper,dw_hdmi,dw_hdmi_qp,dw_mipi_dsi,analogix_dp,drm_display_helper,drm_kms_helper, Live 0x0000000000000000
I attached:
output from dmesg: dmesg-ubuntu-k6.70.log
output from lspci: lspci.log
output from lsblk -OJ: lsblk.json
I don't know where to go from here!
Doing some research, I found out Ubuntu is loading the module phy_rockchip_snps_pcie3, produced by kernel config item PHY_ROCKCHIP_SNPS_PCIE3, which is not loaded by Talos because it is not selected: https://github.com/nberlee/pkgs/blob/3226d24e42cef17ec33c7d33bec553adc685e414/kernel/build/config-arm64#L7587
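The comparison of the two /proc/modules listings can be done mechanically. The sketch below is a rough Python illustration using short excerpts of the listings posted in this thread; the names `module_names`, `TALOS` and `UBUNTU` are hypothetical. Note that a driver compiled into the kernel (`=y`) never shows up in /proc/modules, so a name missing from the Talos side is a hint, not proof that the driver is absent.

```python
def module_names(proc_modules_text: str) -> set:
    """Extract module names (first column) from /proc/modules-style text."""
    return {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}


# Short excerpts of the two listings posted in this thread.
TALOS = """\
rockchip_cpufreq 16384 - - Live 0xffffb7d55a5ba000
rk808_regulator 49152 - - Live 0xffffb7d55a5a3000
rk8xx_spi 12288 - - Live 0xffffb7d55a5ca000
rk8xx_core 16384 - - Live 0xffffb7d55a5bf000
"""
UBUNTU = """\
rk808_regulator 53248 5 - Live 0x0000000000000000
phy_rockchip_snps_pcie3 16384 1 - Live 0x0000000000000000
rk8xx_spi 12288 0 - Live 0x0000000000000000
rk8xx_core 24576 1 rk8xx_spi, Live 0x0000000000000000
"""

only_on_ubuntu = sorted(module_names(UBUNTU) - module_names(TALOS))
print(only_on_ubuntu)
```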
This is the kernel config for 1.6.4: https://github.com/nberlee/pkgs/blob/release-1.6.4-turingrk1/kernel/build/config-arm64#L7931
Your link is from a commit 3 months old. (kernel 6.6.3 was never in a released version)
Thank you for the logs, I was still looking at it.
When you add extraKernelArgs you HAVE to use talosctl upgrade to upgrade Talos, as only this action changes the boot.
~But before you try disabling aspm, i've created a version which does not reset the pci device, and uses a different patch.
please upgrade using talosctl upgrade -i ghcr.io/nberlee/installer:v1.6.4-1-g0d07734c0-rk3588 -n <nodeip>
or use this image: https://github.com/nberlee/talos/actions/runs/7797613128/artifacts/1223121551~
EDIT: looked closely at your dmesg, and the problem is:
pci_bus 0000:01: busn_res: can not insert [bus 01-ff] under [bus 00-0f] (conflicts with (null) [bus 00-0f])
This bus is where your NVMe is; I can see that from the Ubuntu dmesg. I still don't have a solution for this. I am trying to find something.
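For anyone checking their own logs for the same symptom, the conflict line quoted above can be picked out with a regular expression. This is a minimal Python sketch; the pattern is written against the single message shown here, so other kernel versions may word it differently, and the names `BUS_CONFLICT` and `find_bus_conflicts` are illustrative only.

```python
import re

# Matches the kernel's bus-number conflict message quoted above.
BUS_CONFLICT = re.compile(
    r"pci_bus (?P<bus>[0-9a-f]{4}:[0-9a-f]{2}): busn_res: can not insert "
    r"\[bus (?P<child>[0-9a-f-]+)\] under \[bus (?P<parent>[0-9a-f-]+)\]"
)


def find_bus_conflicts(dmesg_text: str) -> list:
    """Return (bus, child_range, parent_range) tuples for each conflict line."""
    return [
        (m["bus"], m["child"], m["parent"])
        for m in BUS_CONFLICT.finditer(dmesg_text)
    ]


SAMPLE = (
    "pci_bus 0000:01: busn_res: can not insert [bus 01-ff] under [bus 00-0f] "
    "(conflicts with (null) [bus 00-0f])\n"
)
print(find_bus_conflicts(SAMPLE))
```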
Can you try with irqchip.gicv3_pseudo_nmi=0 as extraKernelArgs and do a talosctl upgrade?
Thanks Nico!
I just updated my RK1 boards - same result. I verified the kernel is loaded with that parameter, but the same bus conflict appeared (and ls -l /dev/ did not show any nvme devices).
Ok, thx for testing Bart,
This means there is no fast fix. I will create a test build with kernel 6.7.4. This is only for testing, to see whether I have to backport some more patches to LTS (6.6 LTS is the kernel Talos 1.7 is using).
I still have no clue why this doesn't work for you, but does for me and others.
I created a testbuild:
image: https://github.com/nberlee/talos/actions/runs/7846554045/artifacts/1233812827
installer: v1.6.4-2-g9ce9b30bd-rk3588 (talosctl upgrade -i ghcr.io/nberlee/installer:v1.6.4-2-g9ce9b30bd-rk3588)
I tested the build and upgraded my RK1's - still same result, unfortunately, with and without irqchip.gicv3_pseudo_nmi=0 :(
Well, this is good news: it means it has to be the kernel config (or the BL31 u-boot binary blob Joshua uses). I will have another look.
Can you try this to see what u-boot thinks of your nvme drive?
=> pci enum
pcie_dw_rockchip pcie@fe180000: PCIe-0 Link Fail
=> nvme scan
=> nvme info
You can get into the u-boot prompt by following "Using picocom" and pressing a key when u-boot asks for it.
Here it is:
U-Boot 2024.01 (Jun 01 2019 - 21:34:52 +0000)
Model: Turing Machines RK1
DRAM: 16 GiB (effective 15.7 GiB)
Core: 317 devices, 28 uclasses, devicetree: separate
MMC: mmc@fe2e0000: 0
Loading Environment from nowhere... OK
In: serial@febc0000
Out: serial@febc0000
Err: serial@febc0000
Model: Turing Machines RK1
Net: eth0: ethernet@fe1c0000
Hit any key to stop autoboot: 0
=> pci enum
pcie_dw_rockchip pcie@fe180000: PCIe-0 Link Fail
=> nvme scan
=> nvme info
Device 0: Vendor: 0x15b7 Rev: 234110WD Prod: 23252X801097
Type: Hard Disk
Capacity: 953869.7MB = 931.5 GB (244190646 x 4096)
=> nvme details
Blk device 0: Optional Admin Command Support:
Namespace Management/Attachment: no
Firmware Commit/Image download: yes
Formt NVM: yes
Security Send/Receive: yes
Blk device 0: Optional NVM Command Support:
Reservation: yes
Save/Select field in te Set/Get features: yes
Write Zeroes: yes
Dataset Management: yes
Write Uncorrectable: yes
Blk device 0: Format NVM Attriutes:
Support Cryptographic Erase: No
Support erase a particular namespace: Yes
Support format a particular namespace: Yes
Blk device 0: LBA Format Support:
LBA Foramt 0 Support:
Metadata Size: 0
LBA Data Size: 512
Relative Performance: Good
Blk device 0: End-to-End DataProtect Capabilities:
As last eight bytes: No
As first eight bytes: No
Support Type3: N
Support Type2: No
Support Type1: No
Blk device 0: Metadata capabilities:
As part of a separate buffer: No
As part of a extended data LBA: No
=>
Hi Bart,
I did some comparing in the kernel config and enabled some stuff:
CONFIG_PCIE_DW_EP=y
CONFIG_PCIE_DW_PLAT=y
CONFIG_PCIE_DW_PLAT_HOST=y
CONFIG_PCIE_DW_PLAT_EP=y
CONFIG_EFI_DISABLE_PCI_DMA=y
CONFIG_DW_DMAC_CORE=y
CONFIG_DW_DMAC_PCI=y
CONFIG_VFIO=y
CONFIG_VFIO_IOMMU_TYPE1=y
CONFIG_VFIO_PCI_CORE=y
CONFIG_VFIO_PCI=y
CONFIG_NVMEM_U_BOOT_ENV=y
These are PCI-related options that were enabled in Joshua's Ubuntu kernel, and I think CONFIG_EFI_DISABLE_PCI_DMA is the one we actually need. The thing is, your NVMe is already initialized by u-boot, and the memory address assignment is not cleared by the kernel. This causes the conflict. The kernel option clears the memory assignment first.
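Comparing two kernel configs for options that are enabled in one but not the other can be scripted. The following is a rough Python sketch under stated assumptions: the two config fragments are hypothetical stand-ins rather than the real Ubuntu and Talos files, and the helper names `enabled_options` and `missing_from` are invented for illustration.

```python
def enabled_options(config_text: str) -> dict:
    """Map CONFIG_* names to values, skipping comments like '# CONFIG_X is not set'."""
    options = {}
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def missing_from(reference: str, candidate: str) -> list:
    """Options set in `reference` that are absent (or 'not set') in `candidate`."""
    ref, cand = enabled_options(reference), enabled_options(candidate)
    return sorted(name for name in ref if name not in cand)


# Hypothetical fragments for illustration, not the real config files.
UBUNTU_CFG = "CONFIG_PCIE_DW_PLAT_HOST=y\nCONFIG_EFI_DISABLE_PCI_DMA=y\nCONFIG_NVME_CORE=y\n"
TALOS_CFG = "CONFIG_NVME_CORE=y\n# CONFIG_EFI_DISABLE_PCI_DMA is not set\n"

print(missing_from(UBUNTU_CFG, TALOS_CFG))
```

Filtering the result for names containing `PCI` or `NVME` would narrow the list to candidates like the ones enabled above.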
It would be great if you can test it today. As I think Talos 1.6.5 is going to be released today... If you can test, I can include CONFIG_EFI_DISABLE_PCI_DMA with my Talos 1.6.5 build.
Can you try either: installer (talosctl upgrade): ghcr.io/nberlee/installer:v1.6.4-2-ge0d096861-rk3588 or disk image: https://github.com/nberlee/talos/actions/runs/7929904512/artifacts/1250986261
Hi Nico,
I tested your specified Talos build by upgrading: talosctl -n 192.168.1.111 upgrade -i ghcr.io/nberlee/installer:v1.6.4-2-ge0d096861-rk3588
The upgrade succeeded fine, but the result is unfortunately still the same :-( See dmesg-16-02-2024-k6.6.16.log. I tested with and without kernel arg irqchip.gicv3_pseudo_nmi=0.
Also tried talosctl -n 192.168.1.111 ls -l /dev/ to check for any nvme drives (only nvme-fabrics showed up).
Thank you for the quick response. I have to think about what the next step is, as I am running out of ideas
Thank you for your time, Nico, in trying to solve this anyway!
I still have no clue why this doesn't work for you, but does for me and others.
Do you have a list of NVMe units which are confirmed to be working?
Until yesterday I only had your SN570 which was not working. Yesterday it was reported on the Turing Pi Discord that the SN350 suffers from the same behavior.
Looking at various communications, I have confirmed that the following seem to work:
WD Black SN770 500GB (mine (4) and a friend of mine (4))
ADATA SX8200PNP 2TB
Transcend TS512GMTE400S 512GB
INTEL SSDPEKKW128G7 128GB
I will try to add to this list. But when it works, people do not say what type of nvme they have... :)
I will :)
/dev/nvme0n1 Lexar SSD NM710 2TB
Both of my Crucial CT2000P3SSD8 2TB SSDs are working with Longhorn 👍
Had no issues with nvme. Started with 1.6.3 and upgraded to 1.6.5
❯ talosctl disks --talosconfig=./clusterconfig/talosconfig | grep nvme
192.168.254.42 /dev/nvme0n1 MSI M371 1TB 511230922208000019 NVME - eui.6479a7830ac01022 - - 1.0 TB /platform/a40000000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/nvme/nvme0/nvme0n1 /sys/class/block
192.168.254.43 /dev/nvme0n1 MSI M371 1TB 511230828064002710 NVME - eui.6479a781aac00736 - - 1.0 TB /platform/a40000000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/nvme/nvme0/nvme0n1 /sys/class/block
192.168.254.45 /dev/nvme0n1 MSI M371 1TB 511230828064002706 NVME - eui.6479a781aac00737 - - 1.0 TB /platform/a40000000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/nvme/nvme0/nvme0n1 /sys/class/block
192.168.254.44 /dev/nvme0n1 MSI M371 1TB 511230828064002712 NVME - eui.6479a781aac00865 - - 1.0 TB /platform/a40000000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/nvme/nvme0/nvme0n1 /sys/class/block
Had no issues with nvme. Started with 1.6.3 and upgraded to 1.6.5
@bbck What NVMe units do you have, exactly?
They are MSI SPATIUM M371 1TB
@bguijt Could you try v1.6.5-1-gf5b2ba84c-rk3588? This has an extra patch which actually returns errors when the probe fails for the PCI controller (rockchip-dw).
The patch was submitted yesterday to kernel maintainers: https://github.com/nberlee/pkgs/blob/turingrk1/kernel/build/patches/arm64/0013-pci-dw-rockchip-Add-error-messages-in-.probe-s-error-paths.patch
Drivers that silently fail to probe provide a bad user experience and
make it unnecessarily hard to debug such a failure. Fix it by using
dev_err_probe() instead of a plain return.
Upgraded image:
$ talosctl upgrade -i ghcr.io/nberlee/installer:v1.6.5-1-g5fbd1aa43-rk3588 -n 192.168.1.111 --force
Result from talosctl -n 192.168.1.111 dmesg > dmesg-v1.6.5-1-gf5b2ba84c-rk3588.log: dmesg-v1.6.5-1-gf5b2ba84c-rk3588.log
Rescan PCI bus:
$ kubectl debug node/talos-tp1-n1 --image=alpine -n kube-system -it
Creating debugging pod node-debugger-talos-tp1-n1-vslx7 with container debugger on node talos-tp1-n1.
If you don't see a command prompt, try pressing enter.
/ # cd /host/sys/bus/pci
/host/sys/bus/pci # echo 1 > rescan
# wait for 1 minute....
Result dmesg:
192.168.1.111: kern: info: [2024-02-29T10:20:27.269855571Z]: pcieport 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
SUBSYSTEM=pci
DEVICE=+pci:0000:00:00.0
192.168.1.111: kern: info: [2024-02-29T10:20:27.279628571Z]: pci 0000:01:00.0: [15b7:501a] type 00 class 0x010802
SUBSYSTEM=pci
DEVICE=+pci:0000:01:00.0
192.168.1.111: kern: info: [2024-02-29T10:20:27.286616571Z]: pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x00003fff 64bit]
SUBSYSTEM=pci
DEVICE=+pci:0000:01:00.0
192.168.1.111: kern: info: [2024-02-29T10:20:27.294431571Z]: pci 0000:01:00.0: reg 0x20: [mem 0x00000000-0x000000ff 64bit]
SUBSYSTEM=pci
DEVICE=+pci:0000:01:00.0
192.168.1.111: kern: info: [2024-02-29T10:20:27.302905571Z]: pci 0000:01:00.0: 8.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s PCIe x4 link at 0000:00:00.0 (capable of 31.504 Gb/s with 8.0 GT/s PCIe x4 link)
SUBSYSTEM=pci
DEVICE=+pci:0000:01:00.0
192.168.1.111: kern: info: [2024-02-29T10:20:27.320220571Z]: pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
SUBSYSTEM=pci_bus
DEVICE=+pci_bus:0000:01
192.168.1.111: kern: info: [2024-02-29T10:20:27.327714571Z]: pcieport 0000:00:00.0: BAR 14: assigned [mem 0xf0300000-0xf03fffff]
SUBSYSTEM=pci
DEVICE=+pci:0000:00:00.0
192.168.1.111: kern: info: [2024-02-29T10:20:27.335946571Z]: pci 0000:01:00.0: BAR 0: assigned [mem 0xf0300000-0xf0303fff 64bit]
SUBSYSTEM=pci
DEVICE=+pci:0000:01:00.0
192.168.1.111: kern: info: [2024-02-29T10:20:27.344192571Z]: pci 0000:01:00.0: BAR 4: assigned [mem 0xf0304000-0xf03040ff 64bit]
SUBSYSTEM=pci
DEVICE=+pci:0000:01:00.0
192.168.1.111: kern: info: [2024-02-29T10:20:27.352961571Z]: nvme nvme0: pci function 0000:01:00.0
SUBSYSTEM=nvme
DEVICE=c236:0
192.168.1.111: kern: info: [2024-02-29T10:20:27.358279571Z]: nvme 0000:01:00.0: enabling device (0000 -> 0002)
SUBSYSTEM=pci
DEVICE=+pci:0000:01:00.0
# 1 minute waiting....
192.168.1.111: kern: warning: [2024-02-29T10:21:29.296505571Z]: nvme nvme0: I/O 16 QID 0 timeout, disable controller
SUBSYSTEM=nvme
DEVICE=c236:0
192.168.1.111: kern: err: [2024-02-29T10:21:29.320969571Z]: nvme nvme0: Identify Controller failed (-4)
SUBSYSTEM=nvme
DEVICE=c236:0
192.168.1.111: kern: warning: [2024-02-29T10:21:29.337498571Z]: nvme: probe of 0000:01:00.0 failed with error -5
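The "limited by 2.5 GT/s PCIe x4 link" line in the log above indicates the link came up at Gen1 speed rather than the drive's 8.0 GT/s. The two bandwidth figures in that line can be reproduced from the transfer rate, the lane count and the line-encoding overhead. The Python sketch below is only an illustration: the function name `link_mbps` is made up, and the integer division is an assumption chosen so the results match the logged values.

```python
def link_mbps(gt_per_s: float, lanes: int) -> int:
    """Usable PCIe link bandwidth in Mb/s after line-encoding overhead."""
    mega_transfers = int(gt_per_s * 1000)
    if gt_per_s <= 5.0:
        per_lane = mega_transfers * 8 // 10      # Gen1/Gen2 use 8b/10b encoding
    else:
        per_lane = mega_transfers * 128 // 130   # Gen3 and later use 128b/130b encoding
    return per_lane * lanes


print(link_mbps(2.5, 4))  # the link as negotiated in the log
print(link_mbps(8.0, 4))  # what the drive is capable of
```

Dividing the results by 1000 gives the Gb/s values printed by the kernel.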
I tried with the following extraKernelArgs:
All tries yielded the same results, no difference in PCI / NVMe probing!
Ahh, the last part (after 1 minute waiting) is an open issue in Talos: https://github.com/siderolabs/talos/issues/5914
WD Black SN 770 1TB is in: Works straight away! This isolates the issue to the SN570 units.
Here is the dmesg log with the SN770, image tag v1.6.5-1-g5fbd1aa43-rk3588, extraKernelArg irqchip.gicv3_pseudo_nmi=0: dmesg-v1.6.5-1-gf5b2ba84c-rk3588-irqchip.gicv3_pseudo_nmi-kernel-arg-sn770.log
Hi @nberlee, I decided to sell my SN570 units, and buy the SN770's instead. Many thanks for your support!
I have the same issue with my WD SN350. Might be able to help out with troubleshooting. I have no previous experience of messing around with kernels but willing to give it a try
@degisftw could you try image : https://github.com/nberlee/talos/actions/runs/8559139058/artifacts/1385853957 (unzip first) It has a new u-boot version + latest 6.6 kernel, but more importantly a patch for pcie 3 to not timeout after 0.5 seconds of initiation.
Sorry for a late reply.
I have tested the new image without any of the extra kernel arguments in this thread, but so far no dice :(
I will try to provide as much info as possible based on the previous discussion. Hope it helps
First some outputs from when running the Ubuntu image from the Turing Pi docs (ubuntu-22.04.3-preinstalled-server-arm64-turing-rk1_v1.33).
fdisk -l
Disk /dev/nvme0n1: 931.51 GiB, 1000204886016 bytes, 1953525168 sectors
Disk model: WD Green SN350 1TB
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
sudo nvme smart-log /dev/nvme0
Smart Log for NVME device:nvme0 namespace-id:ffffffff
critical_warning : 0
temperature : 37 C (310 Kelvin)
available_spare : 100%
available_spare_threshold : 10%
percentage_used : 0%
endurance group critical warning summary: 0
data_units_read : 40
data_units_written : 0
host_read_commands : 780
host_write_commands : 0
controller_busy_time : 0
power_cycles : 16
power_on_hours : 157
unsafe_shutdowns : 12
media_errors : 0
num_err_log_entries : 0
Warning Temperature Time : 0
Critical Composite Temperature Time : 0
Temperature Sensor 1 : 50 C (323 Kelvin)
Temperature Sensor 2 : 37 C (310 Kelvin)
Thermal Management T1 Trans Count : 0
Thermal Management T2 Trans Count : 0
Thermal Management T1 Total Time : 0
Thermal Management T2 Total Time : 0
With your new image
dmesg.log file
talosctl -n 192.168.50.245 ls -l /dev/
NODE MODE UID GID SIZE(B) LASTMOD NAME
192.168.50.245 drwxr-xr-x 0 0 3360 Jan 1 1970 01:00 .
192.168.50.245 Dcrw-r--r-- 0 0 0 Jan 1 1970 01:00 autofs
192.168.50.245 drwxr-xr-x 0 0 700 Jan 1 1970 01:00 block
192.168.50.245 drwxr-xr-x 0 0 60 Jan 1 1970 01:00 bus
192.168.50.245 drwxr-xr-x 0 0 2500 Jan 1 1970 01:00 char
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 console
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 cpu_dma_latency
192.168.50.245 drwxr-xr-x 0 0 180 Jan 1 1970 01:00 disk
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 efi_capsule_loader
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 efi_test
192.168.50.245 Lrwxrwxrwx 0 0 13 Jan 1 1970 01:00 fd -> /proc/self/fd
192.168.50.245 Dcrw-rw-rw- 0 0 0 Jan 1 1970 01:00 full
192.168.50.245 Dcrw-rw-rw- 0 0 0 Jan 1 1970 01:00 fuse
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 gpiochip0
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 gpiochip1
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 gpiochip2
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 gpiochip3
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 gpiochip4
192.168.50.245 drwxr-xr-x 0 0 0 Jan 1 1970 01:00 hugepages
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 hwrng
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 i2c-0
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 i2c-1
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 i2c-6
192.168.50.245 drwxr-xr-x 0 0 60 Jan 1 1970 01:00 input
192.168.50.245 Dcrw-r--r-- 0 0 0 Jan 1 1970 01:00 kmsg
192.168.50.245 Dcrw-rw-rw- 0 0 0 Jan 1 1970 01:00 kvm
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 loop-control
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 loop0
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 loop1
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 loop2
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 loop3
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 loop4
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 loop5
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 loop6
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 loop7
192.168.50.245 drwxr-xr-x 0 0 60 Jan 1 1970 01:00 mapper
192.168.50.245 Drw------- 0 0 0 Apr 10 14:41:23 mmcblk0
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 mmcblk0boot0
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 mmcblk0boot1
192.168.50.245 Drw------- 0 0 0 Apr 10 14:41:24 mmcblk0p1
192.168.50.245 Drw------- 0 0 0 Apr 10 14:41:24 mmcblk0p2
192.168.50.245 Drw------- 0 0 0 Apr 10 14:41:24 mmcblk0p3
192.168.50.245 Drw------- 0 0 0 Apr 10 14:41:24 mmcblk0p4
192.168.50.245 Drw------- 0 0 0 Apr 10 14:41:24 mmcblk0p5
192.168.50.245 Drw------- 0 0 0 Apr 10 14:41:25 mmcblk0p6
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 mmcblk0rpmb
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 mpt2ctl
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 mpt3ctl
192.168.50.245 drwxr-xr-x 0 0 60 Jan 1 1970 01:00 net
192.168.50.245 Dcrw-rw-rw- 0 0 0 Jan 1 1970 01:00 null
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 nvme-fabrics
192.168.50.245 Dcrw-r----- 0 0 0 Jan 1 1970 01:00 port
192.168.50.245 Dcrw-rw-rw- 0 0 0 Jan 1 1970 01:00 ptmx
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 ptp0
192.168.50.245 drwxr-xr-x 0 0 0 Jan 1 1970 01:00 pts
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 ram0
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 ram1
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 ram10
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 ram11
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 ram12
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 ram13
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 ram14
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 ram15
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 ram2
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 ram3
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 ram4
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 ram5
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 ram6
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 ram7
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 ram8
192.168.50.245 Drw------- 0 0 0 Jan 1 1970 01:00 ram9
192.168.50.245 Dcrw-rw-rw- 0 0 0 Jan 1 1970 01:00 random
192.168.50.245 Lrwxrwxrwx 0 0 4 Jan 1 1970 01:00 rtc -> rtc0
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 rtc0
192.168.50.245 dtrwxrwxrwx 0 0 40 Jan 1 1970 01:00 shm
192.168.50.245 Lrwxrwxrwx 0 0 15 Jan 1 1970 01:00 stderr -> /proc/self/fd/2
192.168.50.245 Lrwxrwxrwx 0 0 15 Jan 1 1970 01:00 stdin -> /proc/self/fd/0
192.168.50.245 Lrwxrwxrwx 0 0 15 Jan 1 1970 01:00 stdout -> /proc/self/fd/1
192.168.50.245 Dcrw-rw-rw- 0 0 0 Jan 1 1970 01:00 tty
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty0
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty1
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty10
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty11
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty12
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty13
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty14
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty15
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty16
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty17
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty18
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty19
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty2
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty20
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty21
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty22
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty23
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty24
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty25
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty26
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty27
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty28
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty29
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty3
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty30
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty31
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty32
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty33
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty34
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty35
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty36
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty37
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty38
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty39
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty4
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty40
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty41
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty42
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty43
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty44
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty45
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty46
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty47
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty48
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty49
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty5
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty50
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty51
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty52
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty53
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty54
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty55
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty56
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty57
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty58
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty59
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty6
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty60
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty61
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty62
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty63
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty7
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty8
192.168.50.245 Dcrw--w---- 0 0 0 Jan 1 1970 01:00 tty9
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 ttyS0
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 ttyS1
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 ttyS2
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 ttyS3
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 ttyS4
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 ttyS5
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 ttyS6
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 ttyS7
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 ttyS8
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 ttyS9
192.168.50.245 Dcrw-rw-rw- 0 0 0 Jan 1 1970 01:00 urandom
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 vcs
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 vcs1
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 vcsa
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 vcsa1
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 vcsu
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 vcsu1
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 vga_arbiter
192.168.50.245 Dcrw-rw-rw- 0 0 0 Jan 1 1970 01:00 vhost-net
192.168.50.245 Dcrw-rw-rw- 0 0 0 Jan 1 1970 01:00 vhost-vsock
192.168.50.245 Dcrw-rw-rw- 0 0 0 Jan 1 1970 01:00 vsock
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 watchdog
192.168.50.245 Dcrw------- 0 0 0 Jan 1 1970 01:00 watchdog0
192.168.50.245 Dcrw-rw-rw- 0 0 0 Jan 1 1970 01:00 zero
talosctl -n 192.168.50.245 disks
NODE DEV MODEL SERIAL TYPE UUID WWID MODALIAS NAME SIZE BUS_PATH SUBSYSTEM READ_ONLY SYSTEM_DISK
192.168.50.245 /dev/mmcblk0 - 0x5ed6e896 SD - - - BJTD4R 31 GB /platform/fe2e0000.mmc/mmc_host/mmc0/mmc0:0001/ /sys/class/block *
192.168.50.245 /dev/mmcblk0boot0 - - SD - - - - 4.2 MB /platform/fe2e0000.mmc/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0boot0 /sys/class/block *
192.168.50.245 /dev/mmcblk0boot1 - - SD - - - - 4.2 MB /platform/fe2e0000.mmc/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0boot1 /sys/class/block *
talosctl -n 192.168.50.245 ls -l /sys/block/
NODE MODE UID GID SIZE(B) LASTMOD NAME
192.168.50.245 drwxr-xr-x 0 0 0 Jan 1 1970 01:00 .
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 loop0 -> ../devices/virtual/block/loop0
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 loop1 -> ../devices/virtual/block/loop1
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 loop2 -> ../devices/virtual/block/loop2
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 loop3 -> ../devices/virtual/block/loop3
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 loop4 -> ../devices/virtual/block/loop4
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 loop5 -> ../devices/virtual/block/loop5
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 loop6 -> ../devices/virtual/block/loop6
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 loop7 -> ../devices/virtual/block/loop7
192.168.50.245 Lrwxrwxrwx 0 0 0 Jan 1 1970 01:00 mmcblk0 -> ../devices/platform/fe2e0000.mmc/mmc_host/mmc0/mmc0:0001/block/mmcblk0
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 mmcblk0boot0 -> ../devices/platform/fe2e0000.mmc/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0boot0
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 mmcblk0boot1 -> ../devices/platform/fe2e0000.mmc/mmc_host/mmc0/mmc0:0001/block/mmcblk0/mmcblk0boot1
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 ram0 -> ../devices/virtual/block/ram0
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 ram1 -> ../devices/virtual/block/ram1
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 ram10 -> ../devices/virtual/block/ram10
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 ram11 -> ../devices/virtual/block/ram11
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 ram12 -> ../devices/virtual/block/ram12
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 ram13 -> ../devices/virtual/block/ram13
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 ram14 -> ../devices/virtual/block/ram14
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 ram15 -> ../devices/virtual/block/ram15
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 ram2 -> ../devices/virtual/block/ram2
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 ram3 -> ../devices/virtual/block/ram3
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 ram4 -> ../devices/virtual/block/ram4
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 ram5 -> ../devices/virtual/block/ram5
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 ram6 -> ../devices/virtual/block/ram6
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 ram7 -> ../devices/virtual/block/ram7
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 ram8 -> ../devices/virtual/block/ram8
192.168.50.245 Lrwxrwxrwx 0 0 0 Apr 10 14:42:06 ram9 -> ../devices/virtual/block/ram9
Bug Report
I installed Talos like you demonstrated in the ascii video, on the /dev/mmcblk0 block device. Now I want to use the attached NVMe drive for PVC resources, but I can't find the device in the list:
Description
Instead of the /dev/mmcblk0 device (in the controlplane.yaml file, path /machine/install/disk) I tried several NVMe names, but they were not recognized:
I tested the NVMe drive with the Ubuntu distro on the same RK1 nodes, and they are working fine:
So the question is: How can I get the /dev/nvme0* devices to appear?
Logs
Environment