Open lschapker opened 1 year ago
As a "shot in the dark", I reinstalled the kernel "pve-kernel-6.1.10-1-pve" (while still booting "6.2.16-14-pve") and the "/dev/apex_?" devices showed up again. It appears that there are issues when changing kernels (and cleaning up the old ones)...
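One way to test the kernel-change theory is to check whether an "apex" kernel module file exists for the kernel that is currently running. The sketch below is only a rough check and rests on assumptions not stated in this thread: that the driver is installed as a module file named apex.ko (possibly compressed) somewhere under /lib/modules/<running kernel release>, and the helper name find_module is hypothetical.

```python
import platform
from pathlib import Path


def find_module(name="apex", release=None, modules_root="/lib/modules"):
    """Return paths of module files called <name>.ko* for one kernel release.

    Assumption: the driver ships as a file named apex.ko (optionally
    compressed, e.g. apex.ko.xz) under /lib/modules/<release>/.
    """
    # Default to the release of the running kernel, e.g. "6.2.16-14-pve".
    release = release or platform.release()
    root = Path(modules_root) / release
    if not root.is_dir():
        return []
    # Search the whole tree for this release; an empty list means no match.
    return sorted(str(p) for p in root.rglob(f"{name}.ko*"))


if __name__ == "__main__":
    hits = find_module()
    print(platform.release(), hits if hits else "no apex module found")
```

An empty result for the running kernel, alongside a hit for an older release passed via the release argument, would be consistent with the module having been built only for the older kernel.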
I would use the following scripts to help keep your Proxmox functioning well: https://tteck.github.io/Proxmox/
Hello, I am new to the TPU "stuff." Sorry if this is a "newbie" issue.
Current PVE kernel: 6.2.16-14. (I believe it was working under the previous kernel, 6.2.16-10, but I suspect that I still had 6.1.10 "installed" at that point. I believe 6.1.10 was "autoremoved" as part of the "standard" upgrade process.)
I followed the instructions for adding the repository and the "keys", then reinstalled the "pve headers" and the 2 other packages (i.e. apt-get reinstall ...; I've also done an "apt-get purge ..." and then another "apt-get install ...", with no success). No errors were reported. Prior to the reboot, the "lspci/grep" check can see the 2 TPU cores, but after rebooting, "ls /dev/ap*" shows nothing.
I see the following in the syslog:
Oct 01 17:51:49 proxmox-pr pvestatd[2645]: vm 110 - unable to parse config: lxc.mount.entry /dev/apex_0 dev/apex_0 none bind,optional,create=file 0, 0
Oct 01 17:51:49 proxmox-pr pvestatd[2645]: vm 110 - unable to parse config: lxc.mount.entry /dev/apex_1 dev/apex_1 none bind,optional,create=file 0, 0
I suspect this is because there are no "/dev/apex*" devices.
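To confirm that suspicion on the host, a small check along these lines could list the device nodes and report whether the driver modules are loaded. This is a sketch under assumptions: the module names "apex" and "gasket" are assumed rather than taken from this thread, and the function name apex_status and its parameters are hypothetical.

```python
import glob
from pathlib import Path


def apex_status(dev_glob="/dev/apex_*", proc_modules="/proc/modules"):
    """Report which apex device nodes exist and which driver modules are loaded.

    Assumption: the driver modules are named "apex" and "gasket".
    """
    # Device nodes such as /dev/apex_0 and /dev/apex_1, if present.
    devices = sorted(glob.glob(dev_glob))

    loaded = set()
    path = Path(proc_modules)
    if path.exists():
        # Each line of /proc/modules starts with the module name.
        for line in path.read_text().splitlines():
            name = line.split(" ", 1)[0]
            if name in ("apex", "gasket"):
                loaded.add(name)
    return {"devices": devices, "modules": sorted(loaded)}


if __name__ == "__main__":
    print(apex_status())
```

If "devices" comes back empty and "modules" is empty too, the driver is probably not loaded for the running kernel; if the modules are loaded but no device nodes exist, the problem lies elsewhere.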
This was working at one time, but now is broken.
Not sure what to do now. Any suggestions would be appreciated!