Device: /dev/twa0 [3ware_disk_13], 2 Currently unreadable (pending) sectors

mitar commented 6 years ago

For some time now server3 has had 2 unreadable (pending) sectors. This is good, because it means the problem is not growing and we can keep those disks. But this should still be fixed.

This message was generated by the smartd daemon running on:

   host name:  server3
   DNS domain: cloyne.org

The following warning/error was logged by the smartd daemon:

Device: /dev/twa0 [3ware_disk_13], 2 Currently unreadable (pending) sectors

Device info:
ST3750640NS, S/N:3QD0B5XJ, FW:3.AEE, 750 GB

So the issue here is that the disk has detected two bad sectors, which is why they are pending. You should resolve those pending sectors: get the disk to take them out of use and replace them with spare sectors. The process is a bit involved, but it is fun:

- first, remove the offending disk from the RAID (leave the other mirror disk in the RAID, so there is no service disruption)
- once it is out of the RAID, overwrite the whole disk with zeros (dd tool), using a write block size equal to the disk's block size
- once an offending sector is overwritten with zeros, the disk knows the right content of that sector and swaps a spare sector in, lowering this count (and probably increasing the "uncorrectable" count)
- then you have to reformat the disk for md RAID (fdisk tool)
- and re-add it to the RAID so that the mirror is restored and the original data is copied back

I suggest you go through this, figure out all the commands needed, and then log them here.

mitar commented 6 years ago

And use smartctl tool to see the status before and after dd so that you know if you really fixed the problem.
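
A minimal sketch of that before/after check, assuming the attribute table layout that `smartctl -A` prints (attribute name in the second column, raw value starting in the tenth). `smart_raw` is a hypothetical helper name, not part of smartmontools.

```shell
#!/bin/sh
# Hypothetical helper: print the RAW_VALUE of one SMART attribute.
# Reads `smartctl -A` style output on stdin; assumes the attribute name is
# in column 2 and the raw value starts in column 10.
smart_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Example use on the real device (device path and 3ware port as in this issue):
#   smartctl -A -d 3ware,13 /dev/twa0 | smart_raw Current_Pending_Sector
#   smartctl -A -d 3ware,13 /dev/twa0 | smart_raw Reallocated_Sector_Ct
```

Recording both numbers before and after the `dd` run makes it easy to see whether a pending sector was actually cleared or reallocated.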

clonm commented 6 years ago

Are you saying to physically remove the disk and do this from a different machine, or do you mean something from a shell?

mitar commented 6 years ago

Everything from the shell. You just remove it from the md (software) RAID using the mdadm command. Run cat /proc/mdstat to see how the RAID devices are made up and which drives are in them. Read up on md RAID.

I suggest that before you do anything, you write a proposal here for how to do it. Oh, and there are probably commands for this in the bash history, from when I was doing this last time.

ecawthon commented 6 years ago

The tail end of something with /dev/twa0 is at the very beginning of the .bash_history archive I made. Based on that + man pages, I ran the following read-only commands:

[x] sudo lshw | less, relevant part:

       *-storage
            description: RAID bus controller
            product: 9650SE SATA-II RAID PCIe
            vendor: 3ware Inc
            physical id: 0
            bus info: pci@0000:01:00.0
            logical name: scsi2
            version: 01
            width: 64 bits
            clock: 33MHz
            capabilities: storage pm msi pciexpress bus_master cap_list rom emulated
            configuration: driver=3w-9xxx latency=0
            resources: irq:16 memory:ec000000-edffffff memory:ea100000-ea100fff ioport:4000(size=256) memory:ea120000-ea13ffff
            *-disk:13
               description: SCSI Disk
               product: 9650SE-16M DISK
               vendor: AMCC
               physical id: 0.6.0
               bus info: scsi@2:0.6.0
               logical name: /dev/sdh
               version: 4.10
               serial: N057468257C49C009001
               size: 2793GiB (2999GB)
               capabilities: gpt-1.00 partitioned partitioned:gpt
               configuration: ansiversion=5 guid=0d8a65b1-a403-4a9d-bf34-eaab03adbee5 logicalsectorsize=512 sectorsize=512
             *-volume
                  description: RAID partition
                  vendor: Linux
                  physical id: 1
                  bus info: scsi@2:0.6.0,1
                  logical name: /dev/sdh1
                  serial: ffb77c6e-f323-471d-8796-eb5037f27b31
                  capacity: 2793GiB
                  capabilities: multi

I think this is the relevant part because it's attached to the only device that mentions "3ware" and it says "disk:13". But, the disk size and volume capacity are much larger than 750 GB so I'm not sure of this. It would make more sense if it were /dev/sdg, /dev/sdj, /dev/sdk, /dev/sdm, /dev/sdn, /dev/sdo, or /dev/sdp (corresponding to md7, md6, or md5). Hopefully smartctl will tell me which it is. But I'll continue as if it's /dev/sdh.

[x] cat /proc/mdstat output:


Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] 
md7 : active raid1 sdo1[0] sdp1[1]
  732277568 blocks super 1.2 [2/2] [UU]

md6 : active raid1 sdm1[0] sdn1[1]
  732277568 blocks super 1.2 [2/2] [UU]

md5 : active raid1 sdk1[3] sdj1[2]
  732277568 blocks super 1.2 [2/2] [UU]

md3 : active raid1 sdl1[1] sdi1[0]
  2929542976 blocks super 1.2 [2/2] [UU]

md2 : active raid1 sdh1[1] sdf1[2]
  2929542976 blocks super 1.2 [2/2] [UU]

md0 : active raid1 sdb1[3] sdc1[2]
  2929542976 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sde1[2] sdd1[3]
  2929542976 blocks super 1.2 [2/2] [UU]

unused devices:



This tells me `/dev/sdh1` is on `/dev/md2`, and so I assume `/dev/sdf1` is its mirror.
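
The same lookup can be sketched as a small script instead of reading the output by eye. This assumes the usual `/proc/mdstat` layout, where members appear as `name[role]` on the array's status line; `md_array_of` and `md_mirrors_of` are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: given mdstat text on stdin, print the md array whose
# member list contains the given partition (for example sdh1).
md_array_of() {
    awk -v part="$1" '
        $2 == ":" {
            for (i = 3; i <= NF; i++) {
                if ($i !~ /^[a-z]+[0-9]*\[[0-9]+\]/) continue  # not a member
                member = $i
                sub(/\[.*$/, "", member)   # strip the "[1]" role suffix
                if (member == part) print $1
            }
        }'
}

# Hypothetical helper: print the other members of that same array.
md_mirrors_of() {
    awk -v part="$1" '
        $2 == ":" {
            found = 0; others = ""
            for (i = 3; i <= NF; i++) {
                if ($i !~ /^[a-z]+[0-9]*\[[0-9]+\]/) continue
                member = $i
                sub(/\[.*$/, "", member)
                if (member == part) found = 1
                else others = others (others == "" ? "" : " ") member
            }
            if (found) print others
        }'
}

# Example use:
#   md_array_of sdh1 < /proc/mdstat
#   md_mirrors_of sdh1 < /proc/mdstat
```

This only confirms which array a partition belongs to. It does not settle whether `/dev/sdh` is really the physical disk on 3ware port 13, which is the open question raised above.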

Proposal for how to proceed:

1. Record the output of `sudo smartctl -a -d 3ware,13 /dev/twa0`, which should give the current status
2. `sudo su`
3. `smartctl -a -d 3ware,13 /dev/twa0 | grep Current_Pending_Sector` (in case the output from 1 was long)
4. Should I run `smartctl -t long -d 3ware,13 /dev/twa0`? that is in your bash history, but since we already know the error message I'm not sure this is needed.
5. Mark the offending disk as failed in the array (this does not by itself resolve the pending sectors, it only makes md stop using the disk): `mdadm --manage /dev/md2 --fail /dev/sdh1`
6. Remove the offending disk: `mdadm --manage /dev/md2 --remove /dev/sdh1`
7. Take another snapshot of `cat /proc/mdstat`: This time, I expect it to list `/dev/sdf1` as the only disk on `/dev/md2`, and to list `/dev/sdh` under "unused devices".
8. Overwrite the disk with zeros: `dd if=/dev/zero of=/dev/sdh bs=1M oflag=direct,sync`. This block size should work if it's the same kind of disk as when you did this before, but I should probably look at `parted /dev/sdh` to be sure (I've never used `parted` but I'm familiar with `gparted`). `lshw` did say sector size of 512 and I'm not sure if that's the same thing.
9. On your third bullet point, where is the spare sector being swapped in from? The mirror disk? In any case, I should run `smartctl -a -d 3ware,13 /dev/twa0` again here and hope the output makes sense.
10. Reformat the disk back: `fdisk /dev/sdh` -> `n` for new partition, `p` for primary, defaults for first and last cylinder, and `w` to write and exit
11. Re-attach it back to RAID: `mdadm --manage /dev/md2 --add /dev/sdh1`
12. Repeat `cat /proc/mdstat` to ensure that everything makes sense.

Does that seem reasonable?
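
On the sector-size question in step 8, one way to check what the kernel sees is to read the block device's queue attributes in sysfs. This is a sketch under assumptions: `sector_sizes` is a hypothetical helper, and behind a RAID controller these values describe the unit the controller exports, which may differ from what the physical disk reports through `smartctl`.

```shell
#!/bin/sh
# Hypothetical helper: print the logical and physical block sizes the kernel
# reports for a block device. The second argument overrides the sysfs root
# (defaults to /sys/block) so the function can be tried on a scratch tree.
sector_sizes() {
    dev=$1
    sysroot=${2:-/sys/block}
    logical=$(cat "$sysroot/$dev/queue/logical_block_size")
    physical=$(cat "$sysroot/$dev/queue/physical_block_size")
    echo "$dev logical=$logical physical=$physical"
}

# Example use:
#   sector_sizes sdh
```

`blockdev --getss /dev/sdh` and `blockdev --getpbsz /dev/sdh` should report the same two numbers.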

clonm commented 6 years ago

Bump? I think it got worse...

clonm commented 5 years ago

Before:

root@server3:/home/cloyne# sudo smartctl -a -d 3ware,13 /dev/twa0
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-130-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda ES
Device Model:     ST3750640NS
Serial Number:    3QD0B5XJ
Firmware Version: 3.AEE
User Capacity:    750,156,374,016 bytes [750 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA/ATAPI-7 (minor revision not indicated)
Local Time is:    Sat Oct 20 11:09:59 2018 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (  430) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 202) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   105   088   006    Pre-fail  Always       -       65372609
  3 Spin_Up_Time            0x0003   087   085   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       135
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       2
  7 Seek_Error_Rate         0x000f   087   060   030    Pre-fail  Always       -       499263873
  9 Power_On_Hours          0x0032   055   055   000    Old_age   Always       -       39562
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       137
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   061   044   045    Old_age   Always   In_the_past 39 (Min/Max 38/43)
194 Temperature_Celsius     0x0022   039   056   000    Old_age   Always       -       39 (0 13 0 0 0)
195 Hardware_ECC_Recovered  0x001a   062   047   000    Old_age   Always       -       228011716
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       2
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 Data_Address_Mark_Errs  0x0032   100   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     16244         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

clonm commented 5 years ago

`dd if=/dev/zero of=/dev/sdh bs=1M oflag=direct,sync status=progress` terminated with `dd: error writing '/dev/sdh': No space left on device`, as expected. `smartctl -a` after:

Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   105   088   006    Pre-fail  Always       -       65372609
  3 Spin_Up_Time            0x0003   087   085   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       135
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       2
  7 Seek_Error_Rate         0x000f   087   060   030    Pre-fail  Always       -       501992622
  9 Power_On_Hours          0x0032   055   055   000    Old_age   Always       -       39586
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       137
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   062   044   045    Old_age   Always   In_the_past 38 (Min/Max 38/43)
194 Temperature_Celsius     0x0022   038   056   000    Old_age   Always       -       38 (0 13 0 0 0)
195 Hardware_ECC_Recovered  0x001a   068   047   000    Old_age   Always       -       204145678
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       2
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 Data_Address_Mark_Errs  0x0032   100   253   000    Old_age   Always       -       0

Since it still shows Current_Pending_Sector as 2, I tried `smartctl -t long -d 3ware,13 /dev/twa0`. The result: `# 1 Extended offline Completed: read failure 90% 39593 1465138515`

So, I tried dd'ing just that sector, with `dd if=/dev/zero of=/dev/ada4 bs=512 count=1 seek=1465138515 conv=noerror,sync`. No errors, but the error counts are unchanged. Now running the long test again.

mitar commented 5 years ago

One important aspect is that the write size has to match the sector size. I think your sector size is 4k, not 512B?

mitar commented 5 years ago

> Bump? I think it got worse...

And yes, now you also have Offline_Uncorrectable, but the number is still low. If you see it increasing, then the hard drive has really gone bad. 2 is still an OK number, but yes, ideally it would all be 0.

clonm commented 5 years ago

I was going by the top of the smartctl output, which says `Sector Size: 512 bytes logical/physical`. I did also try with bs=4096 and bs=1M, and they all finished without errors. But the second long test still failed on the same sector:

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%     39845         1465138515
# 2  Extended offline    Completed: read failure       90%     39593         1465138515
# 3  Extended offline    Completed without error       00%     16244         -

clonm commented 5 years ago

root@server3:/proc/sys# dd if=/dev/zero of=/dev/ada4 bs=4096 count=1 seek=1465138515 conv=noerror,sync
1+0 records in
1+0 records out
4096 bytes (4.1 kB, 4.0 KiB) copied, 0.000267878 s, 15.3 MB/s
root@server3:/proc/sys# dd if=/dev/zero of=/dev/ada4 bs=1M count=1 seek=1465138515 conv=noerror,sync
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.00212421 s, 494 MB/s
root@server3:/proc/sys# dd if=/dev/zero of=/dev/ada4 bs=4k count=1 seek=1465138515 conv=noerror,sync
1+0 records in
1+0 records out
4096 bytes (4.1 kB, 4.0 KiB) copied, 0.00023148 s, 17.7 MB/s
root@server3:/proc/sys# dd if=/dev/zero of=/dev/ada4 bs=512 count=1 seek=1465138515 conv=noerror,sync
1+0 records in
1+0 records out
512 bytes copied, 0.000233307 s, 2.2 MB/s

clonm commented 5 years ago

wait, sorry, clearly ada4 was a typo. Will retry with correct dev name.

mitar commented 5 years ago

Hm, also I am not sure if LBA address directly translates to seek argument of the dd command?
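
On that question: `dd` counts `seek=` in output blocks of size `bs`, not in sectors, so the LBA only equals the seek value when `bs` matches the unit the LBA is counted in. A rough sketch of the conversion, assuming the LBA from the self-test log is in 512-byte logical sectors (as the smartctl header suggests) and that the target device maps 1:1 onto the physical disk, which is not guaranteed behind the 3ware controller. `lba_to_seek` is a hypothetical helper.

```shell
#!/bin/sh
# Convert an LBA (counted in logical sectors) into the dd block index to pass
# as seek= for a given bs, plus the sector offset of that LBA inside the block.
lba_to_seek() {
    lba=$1
    bs=$2
    sector=${3:-512}          # assumed logical sector size in bytes
    per_block=$(( bs / sector ))
    echo "$(( lba / per_block )) $(( lba % per_block ))"
}

lba_to_seek 1465138515 512     # prints: 1465138515 0
lba_to_seek 1465138515 4096    # prints: 183142314 3
```

So with `bs=4096` the matching value would be `seek=183142314`, and reusing `seek=1465138515` with `bs=4096` or `bs=1M` addresses an offset 8 or 2048 times further into the device than the failing sector.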

mitar commented 5 years ago

Oh, one thing. dd fixes Current_Pending_Sector to become Reallocated_Sector_Ct.

And I think the self-test fails on Offline_Uncorrectable sectors. Please check on Google. So I am not sure whether writing to Offline_Uncorrectable locations helps at all. But it feels a bit scary to me that the self-test still touches Offline_Uncorrectable sectors: does this mean a regular file system might also try to write there? I thought Offline_Uncorrectable sectors were ones the disk will not use at all and will not expose to the system.

Maybe it is time to simply remove this disk if self-test is failing.

clonm commented 5 years ago

I retried all the above steps with the correct drive name (/dev/sdh instead of /dev/ada4), and played around with the block size/seek numbers, but still haven't gotten it to work. I tried re-running dd with bs=512 since that seems to be how the underlying disk self-identifies, but days later it hasn't finished yet and is crawling along at 21.3 kB/s... is that transfer speed in and of itself enough to pronounce it dead?
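
For a sense of scale, a back-of-the-envelope estimate of how long a full pass would take at that rate. It assumes the rate stays constant, that dd's kB means 1000 bytes, and that the target is the 750 GB disk from the smartctl output; the result scales linearly if the device being written is larger. `days_to_write` is a hypothetical helper.

```shell
#!/bin/sh
# Whole days needed to write `bytes` at `rate` bytes per second
# (integer division, so the result is rounded down).
days_to_write() {
    echo $(( $1 / $2 / 86400 ))
}

# 750,156,374,016 bytes (User Capacity from smartctl) at 21.3 kB/s.
days_to_write 750156374016 21300    # prints: 407
```

At more than a year for a single pass, the write will not finish in any useful time frame.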

mitar commented 5 years ago

Lol this is slow. I would suggest we remove it, yes.

cloyne / network

Device: /dev/twa0 [3ware_disk_13], 2 Currently unreadable (pending) sectors #114