007revad / Synology_HDD_db

Add your HDD, SSD and NVMe drives to your Synology's compatible drive database and a lot more
MIT License
2.4k stars 164 forks

Storage (nvme) SSD no longer appears via synology after shutdown/reboot #237

Closed · ianhundere closed this issue 1 month ago

ianhundere commented 7 months ago

i accidentally disconnected my NAS from power, and when turning it back on the ssd storage is no longer recognized, even though it appears when rerunning the script.

(screenshot attached)

i went ahead and ran restore then the script again:

root@istorage:/volume1/media/incomplete# sudo -s ./syno_hdd_db.sh --restore --showedits
Synology_HDD_db v3.4.84
DS723+ DSM 7.2.1-69057-3 
StorageManager 1.0.0-0017

Using options: --restore --showedits
Running from: /volume1/media/incomplete/syno_hdd_db.sh

Restored support_memory_compatibility = yes
No backup of storage_panel.js found.

Restored ds723+_host_v7.db
Restored dx1211_v7.db
Restored dx1215ii_v7.db
Restored dx1215_v7.db
Restored dx1222_v7.db
Restored dx213_v7.db
Restored dx510_v7.db
Restored dx513_v7.db
Restored dx517_v7.db
Restored dx5_v7.db
Restored fax224_v7.db
Restored fx2421_v7.db
Restored rx1211rp_v7.db
Restored rx1211_v7.db
Restored rx1213sas_v7.db
Restored rx1214rp_v7.db
Restored rx1214_v7.db
Restored rx1216sas_v7.db
Restored rx1217rp_v7.db
Restored rx1217sas_v7.db
Restored rx1217_v7.db
Restored rx1222sas_v7.db
Restored rx1223rp_v7.db
Restored rx1224rp_v7.db
Restored rx2417sas_v7.db
Restored rx410_v7.db
Restored rx415_v7.db
Restored rx418_v7.db
Restored rx4_v7.db
Restored rx6022sas_v7.db
Restored rxd1215sas_v7.db
Restored rxd1219sas_v7.db

Restore successful.
root@istorage:/volume1/media/incomplete# sudo -s ./syno_hdd_db.sh -nr --showedits
Synology_HDD_db v3.4.84
DS723+ DSM 7.2.1-69057-3 
StorageManager 1.0.0-0017

Using options: -nr --showedits
Running from: /volume1/media/incomplete/syno_hdd_db.sh

HDD/SSD models found: 1
ST8000VN004-3CP101,SC60

M.2 drive models found: 1
WD Red SN700 500GB,111150WD

No M.2 PCIe cards found

No Expansion Units found

ST8000VN004-3CP101 already exists in ds723+_host_v7.db
Edited unverified drives in ds723+_host_v7.db
Added WD Red SN700 500GB to ds723+_host_v7.db

Support disk compatibility already enabled.

Disabled support memory compatibility.

Max memory is set to 32 GB.

NVMe support already enabled.

M.2 volume support already enabled.

Disabled drive db auto updates.

    "ST8000VN004-3CP101": {
      "SC60": {
        "compatibility_interval": [
          {
            "compatibility": "support",
            "not_yet_rolling_status": "support",
            "fw_dsm_update_status_notify": false,
            "barebone_installable": true,
            "smart_test_ignore": false,
            "smart_attr_ignore": false
          }
        ]

    "WD Red SN700 500GB": {
      "111150WD": {
        "compatibility_interval": [
          {
            "compatibility": "support",
            "not_yet_rolling_status": "support",
            "fw_dsm_update_status_notify": false,
            "barebone_installable": true,
            "smart_test_ignore": false,
            "smart_attr_ignore": false
          }
        ]

DSM successfully checked disk compatibility.

You may need to reboot the Synology to see the changes.

any suggestions?

ianhundere commented 7 months ago

ended up following the steps here:

ianhundere commented 6 months ago

hmm, it seems like this is the case even if i shut down/restart the nas. any suggestions to avoid this behavior?

007revad commented 6 months ago

Is this still an issue?

ianhundere commented 6 months ago

yeah, lemme change the title of the issue.

007revad commented 6 months ago

Do you have the script scheduled to run as root at boot-up? https://github.com/007revad/Synology_HDD_db/blob/main/how_to_schedule.md
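For reference, a boot-up triggered task in Task Scheduler (set to run as root) normally just calls the script from its user-defined script box, something like the line below. The path and options are only an example; substitute your own:

    bash /volume1/scripts/syno_hdd_db.sh -nr --autoupdate=3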

ianhundere commented 6 months ago

i do, and this is the latest output:

Synology_HDD_db v3.4.84
DS723+ DSM 7.2.1-69057-3 
StorageManager 1.0.0-0017

Using options: -nr --autoupdate=3
Running from: /volume1/scripts/syno_hdd_db.sh

HDD/SSD models found: 1
ST8000VN004-3CP101,SC60

M.2 drive models found: 2
WD Red SN700 500GB,111150WD
WD_BLACK SN850X 4000GB,624311WD

No M.2 PCIe cards found

No Expansion Units found

ST8000VN004-3CP101 already exists in ds723+_host_v7.db
WD Red SN700 500GB already exists in ds723+_host_v7.db
WD_BLACK SN850X 4000GB already exists in ds723+_host_v7.db

Support disk compatibility already enabled.

Support memory compatibility already disabled.

Max memory is set to 32 GB.

NVMe support already enabled.

M.2 volume support already enabled.

Drive db auto updates already disabled.

DSM successfully checked disk compatibility.

You may need to reboot the Synology to see the changes.
007revad commented 6 months ago

Are both WD Red SN700 500GB and WD_BLACK SN850X 4000GB missing after a reboot?

What happens if you disable the syno_hdd_db schedule and then reboot?

BTW if you change the schedule to use -nre --autoupdate=3 that will get rid of the �[0 in the output.
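In other words, the task's user-defined script line would become something like this (same path as in your output above, only the options change):

    bash /volume1/scripts/syno_hdd_db.sh -nre --autoupdate=3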

ianhundere commented 6 months ago

Are both WD Red SN700 500GB and WD_BLACK SN850X 4000GB missing after a reboot?

ah, good question. just WD_BLACK SN850X 4000GB is missing.

What happens if you disable the syno_hdd_db schedule and then reboot?

i have not tried this yet.

BTW if you change the schedule to use -nre --autoupdate=3 that will get rid of the �[0 in the output.

cheers / thanks for the heads up

ianhundere commented 6 months ago

What happens if you disable the syno_hdd_db schedule and then reboot?

just tried this and booting up had no issue. šŸ¤”

ianhundere commented 6 months ago

any news on this, or should i simply disable the syno_hdd_db schedule for now?

007revad commented 6 months ago

Disable the syno_hdd_db schedule for now.

I'll think up some things for you to test with and without the script so we can see what the difference is.

007revad commented 6 months ago

Can you download the following test script: https://github.com/007revad/Synology_HDD_db/blob/test/nvme_check.sh

Then run nvme_check.sh and report back the output.

Then enable the syno_hdd_db schedule, reboot, run nvme_check.sh and report back the output.
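If it helps, downloading and running the test script over SSH would be roughly this (the raw.githubusercontent.com URL is just the raw form of the blob link above, and /volume1/scripts is only an example folder):

    cd /volume1/scripts
    curl -L -O https://raw.githubusercontent.com/007revad/Synology_HDD_db/test/nvme_check.sh
    chmod +x nvme_check.sh
    sudo -s ./nvme_check.sh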

ianhundere commented 6 months ago

okay, i just tested this.

this was the 'before' output:

DS723+
DSM 7.2.1-69057 Update 3
2024-03-05 10:36:44

 Checking support_m2_pool setting
/etc.defaults/synoinfo.conf: yes
/etc/synoinfo.conf:          yes

 Checking supportnvme setting
/etc.defaults/synoinfo.conf: yes
/etc/synoinfo.conf:          yes

 Checking synodisk --enum -t cache
************ Disk Info ***************
>> Disk id: 1
>> Disk path: /dev/nvme0n1
>> Disk model: WD Red SN700 500GB                      
>> Total capacity: 465.76 GB
>> Tempeture: 39 C
************ Disk Info ***************
>> Disk id: 2
>> Disk path: /dev/nvme1n1
>> Disk model: WD_BLACK SN850X 4000GB                  
>> Total capacity: 3726.02 GB
>> Tempeture: 40 C

 Checking syno_slot_mapping
----------------------------------------
System Disk
Internal Disk
01: /dev/sata1
02: /dev/sata2

Esata port count: 1
Esata port 1
01:

USB Device
01:

Internal SSD Cache:
01: /dev/nvme0n1
02: /dev/nvme1n1

----------------------------------------

 Checking udevadm nvme paths
nvme0: /devices/pci0000:00/0000:00:01.2/0000:01:00.0/nvme/nvme0
nvme1: /devices/pci0000:00/0000:00:01.3/0000:02:00.0/nvme/nvme1

 Checking if nvme drives are detected with synonvme
nvme0: It is a NVMe SSD
nvme0: Vendor name: Sandisk
nvme0: Model name: WD Red SN700 500GB             
nvme1: It is a NVMe SSD
nvme1: Vendor name: Sandisk
nvme1: Model name: WD_BLACK SN850X 4000GB         

 Checking nvme drives in /run/synostorage/disks
nvme0n1
nvme1n1

 Checking nvme block devices in /sys/block
nvme0n1
nvme1n1

 Checking logs
----------------------------------------
Current date/time:   2024-03-05 10:36:44
Last boot date/time: 2024-02-25 20:11:45
----------------------------------------

this issue didn't happen on a reboot/restart, so i tried a full shutdown and it did occur. this was the 'after' output:

DS723+
DSM 7.2.1-69057 Update 3
2024-03-05 10:53:12

 Checking support_m2_pool setting
/etc.defaults/synoinfo.conf: yes
/etc/synoinfo.conf:          yes

 Checking supportnvme setting
/etc.defaults/synoinfo.conf: yes
/etc/synoinfo.conf:          yes

 Checking synodisk --enum -t cache
************ Disk Info ***************
>> Disk id: 1
>> Disk path: /dev/nvme0n1
>> Disk model: WD Red SN700 500GB                      
>> Total capacity: 465.76 GB
>> Tempeture: 41 C

 Checking syno_slot_mapping
----------------------------------------
System Disk
Internal Disk
01: /dev/sata1
02: /dev/sata2

Esata port count: 1
Esata port 1
01:

USB Device
01:

Internal SSD Cache:
01: /dev/nvme0n1
02:

----------------------------------------

 Checking udevadm nvme paths
nvme0: /devices/pci0000:00/0000:00:01.2/0000:01:00.0/nvme/nvme0

 Checking if nvme drives are detected with synonvme
nvme0: It is a NVMe SSD
nvme0: Vendor name: Sandisk
nvme0: Model name: WD Red SN700 500GB             

 Checking nvme drives in /run/synostorage/disks
nvme0n1

 Checking nvme block devices in /sys/block
nvme0n1

 Checking logs
----------------------------------------
Current date/time:   2024-03-05 10:53:13
Last boot date/time: 2024-03-05 10:47:42
----------------------------------------
ianhundere commented 6 months ago

and now when i do the following, the error still isn't clearing and the WD_BLACK ssd is still not available: "You could try shutting down the NAS, remove the NVMe drive, bootup, shut down, insert NVMe drive and boot up to see if it clears the error." 😬

edit: any suggestions on clearing the error / getting my nvme back? šŸ™šŸ¼

edit2: tried the things i had done before, following the steps above, like running --restore and then applying the script again. i've tried the steps above about a dozen times. i also tried switching the ssd slot the WD_BLACK is in. no change so far. 🤪

edit3: i did notice i was using an older version of the script, so i updated to the latest, 3.4.86, but i'm still getting the same issue where the WD_BLACK ssd is not recognized / seen.

edit4: also threw it into a pc / not seeing any issues w/ it. (screenshot attached)

007revad commented 6 months ago

It looks like the WD Black NVMe drive is faulty.

ianhundere commented 6 months ago

hmm, it's showing up no problem on the pc. šŸ¤”

oh well, will just move it to gaming pc. thanks for all your help/support!

ianhundere commented 6 months ago

It looks like the WD Black NVMe drive is faulty.

just a heads up that i don't believe this is the case. i'm using it w/o issue on a pc currently.

007revad commented 6 months ago

EDIT: You can skip this comment and the next 6 comments and jump straight to https://github.com/007revad/Synology_HDD_db/issues/237#issuecomment-1982148975


I think I know what's going on.

In DSM 7.2.1 Update 2 or Update 3 Synology added a power limit for NVMe drives. It's actually a pair of values: a maximum power limit and a minimum. Different Synology NAS models have different power limits.

On a real Synology you can check the power limit with:

cat /sys/firmware/devicetree/base/power_limit && echo

Power limits I've seen in DSM 7.2.1 are:

power_limit = "14.85,14.85";    E10M20-T1 and M2D20
power_limit = "9.9,9.9";        M2D18

power_limit = "14.85,11.55";    DS420+
power_limit = "14.85,9.9";      DS923+, DS723+ and DS423+
power_limit = "14.85,9.075";    DS1821+ and DS1520+
power_limit = "14.85,7.425";    DS920+
power_limit = "14.85,7.26";     DS1621+

power_limit = "11.55,9.075";    DS1823xs+
power_limit = "11.55,5.775";    DS720+

power_limit = "14.85,14.85";    other models

It could be that the 4TB WD BLACK SN850X needs too much power when it's hot from having been powered on for a while, so the Synology refuses to mount it after a reboot. But if you boot when the WD BLACK is cooler, DSM mounts it okay and everything is fine until you reboot.

007revad commented 6 months ago

It would be interesting to see what the following commands return:

/usr/syno/lib/systemd/scripts/nvme_power_state.sh --list -d nvme0
/usr/syno/lib/systemd/scripts/nvme_power_state.sh --list -d nvme1
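A rough sketch that wraps both checks together, the NAS power_limit plus each NVMe device's power states, just loops over the commands above (nothing beyond those commands is assumed here):

    #!/bin/bash
    # Show the NAS power limit, then list power states for every NVMe device present
    echo "power_limit: $(cat /sys/firmware/devicetree/base/power_limit)"
    for dev in /sys/class/nvme/nvme*; do
        [ -e "$dev" ] || continue
        /usr/syno/lib/systemd/scripts/nvme_power_state.sh --list -d "$(basename "$dev")"
    done
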
007revad commented 6 months ago

Synology does have a script for setting NVMe power states, but I've not tried it because I haven't needed to until now:

/usr/syno/lib/systemd/scripts/nvme_power_state.sh --help
Description: support tool for set nvme power state

Options:
        -a | --auto-watt ${watt}                : set all NVMe device to the power state which max power is less or equal to ${watt}W
                                                  if device do not have the power state we will not set it, please notice output log
                                                  cannot be used with -d -s and -p
        -d | --device ${device}                 : not set all NVMe device but just ${device}, need to follow any one of the following parameter
                                                  cannot be used with --auto
        -p | --power-watt ${watt}               : set the NVMe device to the power state which max power is less or equal to ${watt}W
                                                  if device do not have the power state we will not set it, please notice output log
                                                  cannot be used with --auto
        -s | --power-state ${power_state}       : set the NVMe device to the power state ${power_state}
                                                  if the power state is non-operational we will not set it, please notice output log
                                                  cannot be used with --auto
        -l | --list                             : list all power state of NVMe devices
        -t | --add-task                         : add task schedule named SYNOLOGY_${device}_power_customization
        --delete-task                           : delete all tasks called SYNOLOGY_${all_nvme_device}_power_customization

Usage:
        ./nvme_power_state.sh -a 6 -l -t
                set all devices to power state which max power is less or equal to 6W
                list all power states of all devices and set the successful result to task schedule
        ./nvme_power_state.sh -d nvme1 -p 6 -l
                set nvme1 to power state which max power is less or equal to 6W and list all power states of nvme1
        ./nvme_power_state.sh -d nvme1 -s 1 -t
                set nvme1 to power state 1 and set the successful result to task schedule
        ./nvme_power_state.sh --delete-task
                delete all tasks called SYNOLOGY_${all_nvme_device}_power_customization
007revad commented 6 months ago

There are 5 power states that can be set and scheduled... but I assume those are set by DSM automatically from the power_limit.

/usr/syno/lib/systemd/scripts/nvme_power_state.sh --list -d nvme0

========== list all power states of nvme0 ==========
ps 0:   max_power 4.70W operational enlat:0 exlat:0 rrt:0 rrl:0 rwt:0 rwl:0 idle_power:0.3000W active_power:4.02 W      operational     rrt 0   rrl 0   rwt 0  rwl 0
ps 1:   max_power 3.00W operational enlat:0 exlat:0 rrt:0 rrl:0 rwt:0 rwl:0 idle_power:0.3000W active_power:3.02 W      operational     rrt 0   rrl 0   rwt 0  rwl 0
ps 2:   max_power 2.20W operational enlat:0 exlat:0 rrt:0 rrl:0 rwt:0 rwl:0 idle_power:0.3000W active_power:2.02 W      operational     rrt 0   rrl 0   rwt 0  rwl 0
ps 3:   max_power 0.0150W non-operational enlat:1500 exlat:2500 rrt:3 rrl:3 rwt:3 rwl:3 idle_power:0.0150 W     non-operational rrt 3   rrl 3   rwt 3   rwl 3
ps 4:   max_power 0.0050W non-operational enlat:10000 exlat:6000 rrt:4 rrl:4 rwt:4 rwl:4 idle_power:0.0050 W    non-operational rrt 4   rrl 4   rwt 4   rwl 4
ps 5:   max_power 0.0033W non-operational enlat:176000 exlat:25000 rrt:5 rrl:5 rwt:5 rwl:5 idle_power:0.0033 W  non-operational rrt 5   rrl 5   rwt 5   rwl 5

========== nvme0 result ==========
ps 0:   max_power 4.70W operational enlat:0 exlat:0 rrt:0 rrl:0 rwt:0 rwl:0 idle_power:0.3000W active_power:4.02 W      operational     rrt 0   rrl 0   rwt 0  rwl 0

add to task schedule? false
007revad commented 6 months ago

I can easily change the power limits for any Synology model that has a model.dtb file, like the DS723+... but we would only want to increase it just enough to prevent the issue you're seeing.
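For anyone curious how that kind of edit is normally done, the general device-tree workflow is to decompile the blob, edit the power_limit string, and recompile. This is a rough sketch only: it assumes the blob lives at /etc.defaults/model.dtb and that the dtc tool is available, and you would back up the original first.

    # Decompile the device tree blob into editable source
    dtc -I dtb -O dts -o model.dts /etc.defaults/model.dtb
    # Edit the power_limit = "..." line in model.dts, then recompile it
    dtc -I dts -O dtb -o model.dtb.new model.dts
    # Keep a backup of the original before swapping in the new blob
    cp -p /etc.defaults/model.dtb /etc.defaults/model.dtb.bak
    cp model.dtb.new /etc.defaults/model.dtb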

007revad commented 6 months ago

You could also try disabling game mode on the WD Black NVMe drive. You'd need to do that while it's in a PC.

https://9to5toys.com/2022/09/27/wd_black-sn850x-review/

I probably should have mentioned this first.

007revad commented 6 months ago

I was thinking about this and remembered the WD Black NVMe drive only goes missing when you have syno_hdd_db scheduled to run at boot. So clearly something the script does is triggering the issue.

I'll create a debug version of the script that pauses at the places I suspect could cause this behaviour, so while the script is paused you can close and re-open Storage Manager and check whether the WD Black is still showing.
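(A pause point like that doesn't need anything fancy; purely as an illustration, it could be a small helper along these lines:)

    # Hypothetical pause helper - wait for a key press before continuing
    pause_check() {
        read -rs -n 1 -p "Paused - check Storage Manager, then press any key to continue..."
        echo
    }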

007revad commented 6 months ago

Try this debug version: syno_hdd_debug.zip

You'll need to run it via SSH.

  1. Disable the syno_hdd_db schedule.
  2. Reboot and check that the WD Black NVMe drive shows up in Storage Manager.
  3. Run syno_hdd_debug.sh via SSH (the commands are sketched just after this list).
  4. When it pauses, close and re-open Storage Manager then check that the WD Black NVMe drive shows up in Storage Manager.
  5. If the WD Black was showing in Storage Manager go back to the shell window and press any key to continue the script.
  6. When it pauses at the point where the WD Black NVMe drive vanishes, note the line number showing in the shell.
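Roughly, via SSH that looks like this (assuming you uploaded syno_hdd_debug.zip to /volume1/scripts; use whichever folder you actually put it in, and 7z or unzip, whichever is available on your system):

    cd /volume1/scripts
    7z x syno_hdd_debug.zip
    chmod +x syno_hdd_debug.sh
    sudo -s ./syno_hdd_debug.sh
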
ianhundere commented 6 months ago

šŸ‘‹šŸ¼ sorry for the delay and thanks for looking into this, but unfortunately i've already moved the ssd to another computer.

i did look into the power consumption of each slot and even switched the slots the ssds were in, without any change. your script above does look promising though; i wish i could test it for ya!

thanks again šŸ™‡šŸ¼

BobbyPancakes commented 5 months ago

I have this issue. Same drive. I can get it working by pulling the drive, booting, shutting down, replacing it, and booting again, but that only works half the time so I have to keep repeating it. I don't believe it has anything to do with heat. I seem to have better luck if I remove the drive, boot, let the nas sit on without the drive for a few minutes, then re-install the drive. But even that isn't consistent, which is frustrating and time consuming. I'm going to try some of the steps listed here. I could really use a fix because I can't continue to run this system like this.

007revad commented 5 months ago

@BobbyPancakes

Have you tried running syno_hdd_db with the -n option?

If that doesn't solve it, can you disable your syno_hdd_db schedule (if you have it scheduled), then reboot and see if the NVMe drive shows up in Storage Manager.

If it does show up in Storage Manager, then run this debug version: syno_hdd_debug.zip

You'll need to run it via SSH.

  1. Run syno_hdd_debug.sh via SSH.
  2. When it pauses, close and re-open Storage Manager then check that the WD Black NVMe drive shows up in Storage Manager.
  3. If the WD Black was showing in Storage Manager go back to the shell window and press any key to continue the script.
  4. When it pauses at the point where the WD Black NVMe drive vanishes, note the line number showing in the shell.
morphias2004 commented 4 months ago

I am experiencing the same issue with an SK Hynix BC501 256GB drive.

I have a scheduled task set to run syno_hdd_db.sh -nr --autoupdate=3 at startup.

Disabling the task stops the issue from happening.

I'll give the debug script a go later this week to help with coming up with a permanent solution.

007revad commented 2 months ago

@morphias2004

I'll give the debug script a go later this week to help with coming up with a permanent solution.

Did you manage to find a solution?