uec / Issue.Tracker

Automatically exported from code.google.com/p/usc-epigenome-center

Replace bad drive in Solaris (gastorage2) Supermicro 36-drive chassis with LSI controller #732

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
eom

Original issue reported on code.google.com by zack...@gmail.com on 28 Apr 2014 at 8:24

GoogleCodeExporter commented 8 years ago
   raidz2-1                   DEGRADED     0     0     0
            c0t5000C500263C97E3d0    ONLINE       0     0     0
            spare-1                  DEGRADED     0     0     0
              c0t5000C500263CA4D3d0  FAULTED      0     4     0  too many errors
              c0t5000C500263D991Fd0  ONLINE       0     0     0  1.59T resilvered
            c0t5000C500263CAEFBd0    ONLINE       0     0     0
            c0t5000C500263CDCA3d0    ONLINE       0     0     0

zpool detach bigpool c0t5000C500263CA4D3d0

      raidz2-1                 ONLINE       0     0     0
            c0t5000C500263C97E3d0  ONLINE       0     0     0
            c0t5000C500263D991Fd0  ONLINE       0     0     0  1.59T resilvered
            c0t5000C500263CAEFBd0  ONLINE       0     0     0

Original comment by zack...@gmail.com on 5 May 2014 at 6:50

GoogleCodeExporter commented 8 years ago
bash-3.00# iostat -E c0t5000C500263CA4D3d0
sd12      Soft Errors: 0 Hard Errors: 54 Transport Errors: 109
Vendor: SEAGATE  Product: ST32000444SS     Revision: 0006 Serial No:
Size: 2000.40GB <2000398933504 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 15 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

Original comment by zack...@gmail.com on 5 May 2014 at 7:00

GoogleCodeExporter commented 8 years ago
disk, instance #12
    Driver properties:
        name='inquiry-serial-no' type=string items=1 dev=none
            value='9WM49WNC0000C1332PPD'
        name='pm-components' type=string items=3 dev=none
            value='NAME=spindle-motor' + '0=off' + '1=on'
        name='pm-hardware-state' type=string items=1 dev=none
            value='needs-suspend-resume'
        name='ddi-failfast-supported' type=boolean dev=none
        name='ddi-kernel-ioctl' type=boolean dev=none
        name='device-nblocks' type=int64 items=1 dev=none
            value=00000000e8e088af
    Hardware properties:
        name='devid' type=string items=1
            value='id1,sd@n5000c500263ca4d3'
        name='inquiry-revision-id' type=string items=1
            value='0006'
        name='inquiry-product-id' type=string items=1
            value='ST32000444SS'
        name='inquiry-vendor-id' type=string items=1
            value='SEAGATE'
        name='inquiry-device-type' type=int items=1
            value=00000000
        name='compatible' type=string items=4
            value='scsiclass,00.vSEAGATE.pST32000444SS.r0006' + 'scsiclass,00.vSEAGATE.pST32000444SS' + 'scsiclass,00' + 'scsiclass'
        name='client-guid' type=string items=1
            value='5000c500263ca4d3'
    Paths from multipath bus adapters:
        mpt_sas#1 (online)
            name='wwn' type=string items=1
                value='5000c500263ca4d3'
            name='lun' type=int items=1
                value=00000000
            name='target-port' type=string items=1
                value='5000c500263ca4d1'
            name='obp-path' type=string items=1
                value='/pci@0,0/pci8086,340e@7/pci1000,3020@0/disk@w5000c500263ca4d1,0'
            name='phy-num' type=int items=1
                value=00000011
            name='path-class' type=string items=1
                value='primary'

Original comment by zack...@gmail.com on 5 May 2014 at 7:11

GoogleCodeExporter commented 8 years ago
Downloaded sas2ircu from LSI to blink the drive:
bash-3.00# ./sas2ircu LIST
LSI Corporation SAS2 IR Configuration Utility.
Version 11.00.00.00 (2011.08.11)
Copyright (c) 2009-2011 LSI Corporation. All rights reserved.

         Adapter      Vendor  Device                       SubSys  SubSys
 Index    Type          ID      ID    Pci Address          Ven ID  Dev ID
 -----  ------------  ------  ------  -----------------    ------  ------
   0     SAS2008     1000h    72h   00h:06h:00h:00h      1000h   3020h
SAS2IRCU: Utility Completed Successfully.
bash-3.00#
bash-3.00# ./sas2ircu 0 DISPLAY

Read configuration has been initiated for controller 0
------------------------------------------------------------------------
Controller information
------------------------------------------------------------------------
  Controller type                         : SAS2008
  BIOS version                            : 7.11.00.00
  Firmware version                        : 7.00.00.00
  Channel description                     : 1 Serial Attached SCSI
  Initiator ID                            : 0
  Maximum physical devices                : 255
  Concurrent commands supported           : 2015
  Slot                                    : 5
  Segment                                 : 0
  Bus                                     : 6
  Device                                  : 0
  Function                                : 0
  RAID Support                            : Yes
------------------------------------------------------------------------
IR Volume information
------------------------------------------------------------------------
------------------------------------------------------------------------
Physical device information
------------------------------------------------------------------------
Initiator at ID #0

...
Device is a Hard disk
  Enclosure #                             : 2
  Slot #                                  : 5
  SAS Address                             : 5000c50-0-263c-a4d1
  State                                   : Ready (RDY)
  Size (in MB)/(in sectors)               : 1907729/3907029167
  Manufacturer                            : SEAGATE
  Model Number                            : ST32000444SS
  Firmware Revision                       : 0006
  Serial No                               : 9WM49WNC
  GUID                                    : 5000c500263ca4d3
  Protocol                                : SAS
  Drive Type                              : SAS_HDD

...

Original comment by zack...@gmail.com on 5 May 2014 at 7:30

GoogleCodeExporter commented 8 years ago
Notice that the values reported by LSI's ./sas2ircu 0 DISPLAY:
  Serial No                               : 9WM49WNC
  GUID                                    : 5000c500263ca4d3

match the Solaris ZFS values (the GUID is the WWN embedded in the device name,
and the LSI serial number is the prefix of inquiry-serial-no):
  c0t5000C500263CA4D3d0  FAULTED      0     4     0  too many errors
disk, instance #12
    Driver properties:
        name='inquiry-serial-no' type=string items=1 dev=none
            value='9WM49WNC0000C1332PPD'

So we know both tools are referring to the same drive, in enclosure 2, slot 5.
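
The manual comparison above could also be scripted. The following is only a rough sketch in Python, assuming the output of ./sas2ircu 0 DISPLAY has been saved as text; the function names and the trimmed SAMPLE string are made up for illustration and are not part of sas2ircu or Solaris.

```python
import re


def wwn_from_device(name):
    """Extract the lowercase WWN from a Solaris name such as c0t5000C500263CA4D3d0."""
    m = re.match(r"^c\d+t([0-9A-Fa-f]{16})d\d+$", name)
    return m.group(1).lower() if m else None


def find_slot_by_guid(display_text, guid):
    """Return (enclosure, slot, serial) for the device whose GUID matches, else None."""
    guid = guid.lower()
    # Each physical device section starts with a "Device is a ..." line.
    blocks = re.split(r"^Device is a ", display_text, flags=re.MULTILINE)[1:]
    for block in blocks:
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if fields.get("GUID", "").lower() == guid:
            return (fields.get("Enclosure #"),
                    fields.get("Slot #"),
                    fields.get("Serial No"))
    return None


# Trimmed copy of the DISPLAY section shown earlier in this issue.
SAMPLE = """\
Device is a Hard disk
  Enclosure #                             : 2
  Slot #                                  : 5
  SAS Address                             : 5000c50-0-263c-a4d1
  State                                   : Ready (RDY)
  Serial No                               : 9WM49WNC
  GUID                                    : 5000c500263ca4d3
"""

if __name__ == "__main__":
    wwn = wwn_from_device("c0t5000C500263CA4D3d0")
    enclosure, slot, serial = find_slot_by_guid(SAMPLE, wwn)
    # The Solaris inquiry-serial-no starts with the serial that sas2ircu reports.
    same_drive = "9WM49WNC0000C1332PPD".startswith(serial)
    print(enclosure, slot, serial, same_drive)
```

The enclosure and slot returned here are the two numbers that the LOCATE command takes in the form Enclosure:Slot.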

Original comment by zack...@gmail.com on 5 May 2014 at 7:35

GoogleCodeExporter commented 8 years ago
Finally, we need to blink the bad drive so that we can pull and replace it.

Again we use sas2ircu, available at:
http://www.lsi.com/Pages/user/eula.aspx?file=http%3a%2f%2fwww.lsi.com%2fdownloads%2fPublic%2fHost%2520Bus%2520Adapters%2fHost%2520Bus%2520Adapters%2520Common%2520Files%2fSAS_SATA_6G_P11%2fSAS2IRCU_P11.zip&Source=http%3a%2f%2fwww.mail-archive.com%2fopenindiana-discuss%40openindiana.org%2fmsg06165.html

The 2:5 argument to LOCATE is the Enclosure #:Slot # that DISPLAY reported for
the faulted drive.

bash-3.00# ./sas2ircu 0 LOCATE 2:5 OFF

bash-3.00# ./sas2ircu 0 LOCATE 2:5 ON

bash-3.00# ./sas2ircu 0 LOCATE 2:5 OFF

bash-3.00# ./sas2ircu 0 LOCATE 2:5 ON

Original comment by zack...@gmail.com on 5 May 2014 at 8:15

GoogleCodeExporter commented 8 years ago

Original comment by zack...@gmail.com on 5 May 2014 at 8:32

GoogleCodeExporter commented 8 years ago
I decided to have the hot spare permanently take over for the failed drive and,
when the new drive arrived, to make the new drive the hot spare.

The new drive arrived, but since Solaris now uses funny WWN-based labels, I
couldn't determine the ID of the drive to add, because it doesn't show up in
zpool status.

To solve this, the "zpool status" output was pasted into Excel, and the "format"
output was pasted next to it. Sorting by disk label revealed the new, unused one.
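
The same comparison can be done without a spreadsheet. This is a minimal sketch in Python, assuming both outputs have been captured as text (for example with `zpool status` and something like `echo | format`, since format is interactive); the function names are hypothetical, and the sample strings in the usage section are shortened, illustrative stand-ins rather than real captures.

```python
import re

# Solaris disk names such as c0t5000C5005D51B387d0. Slice suffixes such as
# s0 are deliberately left out of the match, so c1t0d0s0 counts as c1t0d0.
DISK_RE = re.compile(r"\bc\d+t[0-9A-Fa-f]+d\d+")


def disk_names(text):
    """Return the set of disk names mentioned anywhere in the given text."""
    return {m.group(0) for m in DISK_RE.finditer(text)}


def unused_disks(format_output, zpool_status_output):
    """Return disks that format lists but that no pool in zpool status uses."""
    return sorted(disk_names(format_output) - disk_names(zpool_status_output))


if __name__ == "__main__":
    # Illustrative stand-ins for the real command output.
    format_output = (
        "0. c0t5000C500263C97E3d0 <disk>\n"
        "1. c0t5000C5005D51B387d0 <disk>\n"
    )
    zpool_status_output = "    c0t5000C500263C97E3d0  ONLINE  0  0  0\n"
    for name in unused_disks(format_output, zpool_status_output):
        print(name)
```

Any disk this prints is only a candidate; it is still worth confirming the serial number before adding it to the pool.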

Original comment by zack...@gmail.com on 12 May 2014 at 8:29

GoogleCodeExporter commented 8 years ago
Finally, add the new device to the pool as a hot spare:

bash-3.00# zpool add bigpool spare c0t5000C5005D51B387d0
bash-3.00# zpool status
  pool: bigpool
 state: ONLINE
 scrub: resilver completed after 32h50m with 0 errors on Mon Apr 28 07:17:37 2014
config:

        NAME                       STATE     READ WRITE CKSUM
        bigpool                    ONLINE       0     0     0
          raidz2-0                 ONLINE       0     0     0
            c0t5000C500263C3FEFd0  ONLINE       0     0     0
            c0t5000C500263C4C0Bd0  ONLINE       0     0     0
            c0t5000C500263C5A53d0  ONLINE       0     0     0
            c0t5000C500263C30D3d0  ONLINE       0     0     0
            c0t5000C500263C30DFd0  ONLINE       0     0     0
            c0t5000C500263C42CBd0  ONLINE       0     0     0
            c0t5000C500263C70CFd0  ONLINE       0     0     0
            c0t5000C500263C82EFd0  ONLINE       0     0     0
            c0t5000C500263C85F3d0  ONLINE       0     0     0
            c0t5000C500263C88FFd0  ONLINE       0     0     0
            c0t5000C500263C96EBd0  ONLINE       0     0     0
          raidz2-1                 ONLINE       0     0     0
            c0t5000C500263C97E3d0  ONLINE       0     0     0
            c0t5000C500263D991Fd0  ONLINE       0     0     0  1.59T resilvered
            c0t5000C500263CAEFBd0  ONLINE       0     0     0
            c0t5000C500263CDCA3d0  ONLINE       0     0     0
            c0t5000C500263CDE9Bd0  ONLINE       0     0     0
            c0t5000C500263D1B53d0  ONLINE       0     0     0
            c0t5000C500263D7C03d0  ONLINE       0     0     0
            c0t5000C500263D9E47d0  ONLINE       0     0     0
            c0t5000C500263D69A3d0  ONLINE       0     0     0
            c0t5000C500263D75B7d0  ONLINE       0     0     0
            c0t5000C500263D222Fd0  ONLINE       0     0     0
          raidz2-2                 ONLINE       0     0     0
            c0t5000C500263DAC7Bd0  ONLINE       0     0     0
            c0t5000C500263EFB7Bd0  ONLINE       0     0     0
            c0t5000C500263F12C7d0  ONLINE       0     0     0
            c0t5000C500263F0337d0  ONLINE       0     0     0
            c0t5000C500263F2057d0  ONLINE       0     0     0
            c0t5000C5002634B12Bd0  ONLINE       0     0     0
            c0t5000C5002635D2DFd0  ONLINE       0     0     0
            c0t5000C5002635E7D7d0  ONLINE       0     0     0
            c0t5000C5002635F4D3d0  ONLINE       0     0     0
            c0t5000C5002639D9B7d0  ONLINE       0     0     0
            c0t5000C500263482FFd0  ONLINE       0     0     0
        spares
          c0t5000C500263D2797d0    AVAIL
          c0t5000C500263609B3d0    AVAIL
          c0t5000C5005D51B387d0    AVAIL

errors: No known data errors

Original comment by zack...@gmail.com on 12 May 2014 at 8:33

GoogleCodeExporter commented 8 years ago

Original comment by zack...@gmail.com on 12 May 2014 at 8:34