marcan / lsirec

LSI SAS2008/SAS2108 low-level recovery tool for Linux

i'm... stuck... help? #2

Closed Gartral closed 5 years ago

Gartral commented 5 years ago

I've been trying to follow your instructions for the LSI 9211-8i... and when I run lsiutil -e I get

`LSI Logic MPT Configuration Utility, Version 1.62, January 14, 2009

0 MPT Ports found`

what the heck do I do now? I've been at this for 7 hours trying to just get a dump of my WWID... help...

marcan commented 5 years ago

What firmware are you running? What does lspci say for your card? Does the kernel driver see your card (what driver are you using?)

Gartral commented 5 years ago

the card, on boot shows 2.0.14-0902

0b:00.0 RAID bus controller [0104]: LSI Logic / Symbios Logic MegaRAID SAS 2008 [Falcon] [1000:0073] (rev 02)
    Subsystem: Fujitsu Technology Solutions RAID Ctrl SAS 6G 0/1 (D2607) [1734:1177]
    Physical Slot: 2
    Flags: bus master, fast devsel, latency 0, IRQ 31
    I/O ports at 5000 [size=256]
    Memory at fbff0000 (64-bit, non-prefetchable) [size=16K]
    Memory at fbf80000 (64-bit, non-prefetchable) [size=256K]
    [virtual] Expansion ROM at fbf00000 [disabled] [size=256K]
    Capabilities: <access denied>
    Kernel driver in use: megaraid_sas
    Kernel modules: megaraid_sas

obviously using the megaraid sas driver

marcan commented 5 years ago

That's not a LSI 9211-8i, that's a D2607 using MegaRAID mode. You need to use the hostboot procedure later in the README to convert it to IT/IR. This is untested, good luck and report back if it works for you ;)

To get your WWID first you'll need to use MegaCLI.

Gartral commented 5 years ago

well.. I have other issues too... nothing reads the card right and I have no idea how to compile lsiutil.. I'm really rather stuck... I can't use the "normal" method of crossflashing it either, because FreeDOS on a UEFI board isn't working (megarec spews a bunch of errors)... I really just want the card to work in IT mode...

marcan commented 5 years ago

You ran lsiutil earlier, so presumably you compiled it? Just follow the instructions in the "MegaRAID to IT/IR firmware" part of the README. The WWID should be on a sticker on the card anyway, or worst case just make up some WWID that wasn't the original one.

Gartral commented 5 years ago

so you're right, I'm so frazzled I don't remember what worked and what didn't...

now I'm getting

`sudo ./lsirec 01:00.0 unbind`

`open bar1: No such file or directory`

marcan commented 5 years ago

Your card is at 0000:0b:00.0, not 0000:01:00.0. Calm down and read the README carefully. You can cause damage to other devices in your system if you use the wrong PCI device address. This isn't something you want to blindly rush through.
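A minimal sketch of a sanity check for the address before handing it to lsirec. The function names are hypothetical, and it assumes the card sits in PCI domain 0000 (the usual case) and that the kernel exposes devices under /sys/bus/pci/devices:

```sh
# Normalize a short PCI address (e.g. 0b:00.0) to the full
# domain:bus:device.function form (0000:0b:00.0).
# Assumption: the device is in PCI domain 0000.
normalize_pci_addr() {
    case "$1" in
        ????:??:??.?) printf '%s\n' "$1" ;;
        ??:??.?)      printf '0000:%s\n' "$1" ;;
        *)            echo "unrecognized PCI address: $1" >&2; return 1 ;;
    esac
}

# Succeed only if the kernel currently exposes a device at that address.
pci_device_exists() {
    addr=$(normalize_pci_addr "$1") || return 1
    [ -d "/sys/bus/pci/devices/$addr" ]
}
```

For example, `pci_device_exists 0b:00.0 && sudo ./lsirec "$(normalize_pci_addr 0b:00.0)" unbind` only runs the unbind if a device is present at that address. It does not prove the device is the LSI card, so still compare against the lspci output first.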

Gartral commented 5 years ago

ok, so I actually ended up using the EFI method for the card I NEEDED to make work, so I switched gears and I have a 9212-4i that I'm trying to flash to IT mode now

01:00.0 Serial Attached SCSI controller [0107]: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] [1000:0072] (rev 03)
    Subsystem: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] [1000:30f0]
    Flags: bus master, fast devsel, latency 0, IRQ 18
    I/O ports at e000 [size=256]
    Memory at feac0000 (64-bit, non-prefetchable) [size=16K]
    Memory at fea80000 (64-bit, non-prefetchable) [size=256K]
    Expansion ROM at fea00000 [disabled] [size=512K]
    Capabilities: [50] Power Management version 3
    Capabilities: [68] Express Endpoint, MSI 00
    Capabilities: [d0] Vital Product Data
    Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
    Capabilities: [c0] MSI-X: Enable+ Count=15 Masked-
    Capabilities: [100] Advanced Error Reporting
    Capabilities: [138] Power Budgeting <?>
    Kernel driver in use: mpt3sas
    Kernel modules: mpt3sas

which I was able to flash the SBR to... but lsiutil -e reports

`LSI Logic MPT Configuration Utility, Version 1.62, January 14, 2009

0 MPT Ports found`

marcan commented 5 years ago

Did you hostboot it? If you only flashed the SBR the card will not work until you perform a hostboot, then you can use lsiutil to flash the firmware.

Please follow the instructions carefully and report exactly what you're doing.

ezonakiusagi commented 5 years ago

@marcan just to contribute to this discussion, I've followed your instructions for the MegaRAID->IT firmware flash and it worked! I've done it several times now, so I can confirm it does work. I recommend @Gartral read over those steps carefully because it does work, but see my remarks below.

a few things to note though:

1) after hostboot, it might take a few seconds for the MPT port to become available. I have to poll status on this before proceeding to the next step.
2) after the firmware flash, the reset that follows does not always succeed. About half the time, it results in IOC in FAULT state. However, another reset fixes this and it goes to IOC in READY state. Then I can proceed.
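The polling described in these two notes could be sketched as a generic retry helper. The helper name and its arguments are hypothetical, and the pattern to wait for is an assumption: the hostboot output quoted in this thread prints "IOC is READY", but the exact text printed by other lsirec subcommands should be checked before relying on it.

```sh
# Run a command repeatedly until its output contains a pattern, or give up.
# usage: wait_for_output PATTERN TRIES DELAY_SECONDS COMMAND [ARGS...]
wait_for_output() {
    pattern=$1; tries=$2; delay=$3
    shift 3
    while [ "$tries" -gt 0 ]; do
        # Merge stderr into stdout so messages on either stream are checked.
        if "$@" 2>&1 | grep -q -- "$pattern"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}
```

A call such as `wait_for_output 'IOC is READY' 10 1 ./lsirec 0000:01:00.0 info` would then retry for about ten seconds, assuming `info` reports the state in that form. If it times out, issuing another reset and waiting again matches the workaround described above.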

@marcan where did you find the information on how to do the hostboot? If there's a developer's guide, would you mind sharing?

marcan commented 5 years ago

There is no documentation, I had to piece it together from reverse engineering LSI tools and looking at the register names in the kernel drivers.

Gartral commented 5 years ago

@ezonakiusagi @marcan

yes I have hostbooted the card, I've been reading carefully,... soo I'll walk through what I'm seeing...

hugepages are enabled:

# cat /proc/sys/vm/nr_hugepages
16

there are no IOMMU groups

lspci -vnn shows me the card is

01:00.0 Serial Attached SCSI controller [0107]: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] [1000:0072] (rev 03)
    Subsystem: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] [1000:30f0]
    Flags: bus master, fast devsel, latency 0, IRQ 18
    I/O ports at e000 [size=256]
    Memory at feac0000 (64-bit, non-prefetchable) [size=16K]
    Memory at fea80000 (64-bit, non-prefetchable) [size=256K]
    Expansion ROM at fea00000 [disabled] [size=512K]
    Capabilities: [50] Power Management version 3
    Capabilities: [68] Express Endpoint, MSI 00
    Capabilities: [d0] Vital Product Data
    Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
    Capabilities: [c0] MSI-X: Enable+ Count=15 Masked-
    Capabilities: [100] Advanced Error Reporting
    Capabilities: [138] Power Budgeting <?>
    Kernel driver in use: mpt3sas
    Kernel modules: mpt3sas

when I run ./lsirec 0000:01:00.0 readsbr sbr_backup.bin i get the following file

sbr_backup.zip

and the card appears to be in IR mode.. this is corroborated by, at boot, the card shows up, in IR mode...

the SBR I'm trying to write is the following:

sbr_new.zip

# ./lsirec 0000:01:00.0 unbind
Device in MPT mode

# ./lsirec 0000:01:00.0 halt
Device in MPT mode

# ./lsirec 0000:01:00.0 writesbr sbr_new.bin
Device in MPT mode
Using I2C address 0x50
Using EEPROM type 1
Writing SBR...
SBR written from sbr_new.bin

so far, so good...

# ./lsirec 0000:01:00.0 hostboot 214i4et-it.bin
Device in MPT mode
Resetting adapter in HCB mode...
Trying unlock in MPT mode...
Device in MPT mode
IOC is RESET
Setting up HCB...
HCDW virtual: 0x7fdd03000000
HCDW physical: 0x3df200000
Loading firmware...
Loaded 722644 bytes
Booting IOC...
IOC is READY
IOC Host Boot successful.

I went to relieve myself at this point, just to give it time.. this here is where things start to break down...

# lspci -vns 0000:01:00.0 -A linux-sysfs | head -n 2
01:00.0 0107: 1000:0072 (rev 03)
    Subsystem: 1000:30f0

# lspci -vns 0000:01:00.0 -A intel-conf1 | head -n 2
01:00.0 0107: 1000:0072 (rev 03)
    Subsystem: 1000:30f0

these are supposed to be different, are they not, or is my understanding of the PID/VID at this point.... off?

ok, well, soldiering on because I've already done this all previously, with no ill effect to the card

# ./lsirec 0000:01:00.0 rescan
Device in MPT mode
Removing PCI device...
Rescanning PCI bus...
PCI bus rescan complete.

# lspci -vns 0000:01:00.0 -A linux-sysfs | head -n 2
01:00.0 0107: 1000:0072 (rev 03)
    Subsystem: 1000:30f0

huuuh... so it's been in MPT mode, the whole time?

`# lsiutil -e

LSI Logic MPT Configuration Utility, Version 1.62, January 14, 2009

0 MPT Ports found `

so... what's going on here? I obviously haven't busted it... the card is still read as an LSI card.. and the green blinky light is green and blinky.. (it turned amber when in the RESET state, as I assume is expected) I'll freely admit I may be running up against my own incompetence with low level hardware flashing, but my gut feeling says something ELSE is amiss here.

marcan commented 5 years ago

Once you flash the new SBR, the card will switch to MPT mode. If you have rebooted or rescanned since the first time you flashed the SBR, then the card will already be in MPT mode. The VID will only change the first time you switch from iMR mode to IT/IR mode. So after the first time you do all of this, the VID will already be what it should be and redoing it won't cause it to change any more.

At this point you need to check the kernel log (dmesg) and figure out what the mpt3sas driver thinks of your card, particularly after a hostboot and rescan. It sounds like either the kernel driver does not like the state your card is in, or lsiutil doesn't like your kernel driver.
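As a rough sketch of what to look for in the kernel log, the filters below pull out the firmware version the driver reports and flag any IOC fault. The function names are hypothetical. The patterns assume the driver prints lines containing `FWVersion(...)` and `fault_state(0x...)`, which is the form seen in logs posted in this thread; other kernel versions may word them differently.

```sh
# Print firmware versions reported in mpt2sas/mpt3sas kernel log lines
# read from stdin.
mpt_fw_versions() {
    sed -n 's/.*FWVersion(\([0-9.]*\)).*/\1/p'
}

# Succeed if any log line on stdin reports an IOC fault state.
mpt_has_fault() {
    grep -q 'fault_state(0x[0-9a-fA-F]*)'
}
```

Typical use would be `dmesg | mpt_fw_versions` right after a hostboot and rescan, and `dmesg | mpt_has_fault && echo "IOC faulted at some point"`.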

Gartral commented 5 years ago

ok, now I'm really confused i think it's IN IT mode but reporting IR somehow?

[70736.542623] mpt2sas_cm0: diag reset: SUCCESS
[70835.462669] pci 0000:01:00.0: [1000:0072] type 00 class 0x010700
[70835.462689] pci 0000:01:00.0: reg 0x10: [io  0xe000-0xe0ff]
[70835.462699] pci 0000:01:00.0: reg 0x14: [mem 0xfeac0000-0xfeac3fff 64bit]
[70835.462708] pci 0000:01:00.0: reg 0x1c: [mem 0xfea80000-0xfeabffff 64bit]
[70835.462719] pci 0000:01:00.0: reg 0x30: [mem 0xfea00000-0xfea7ffff pref]
[70835.462764] pci 0000:01:00.0: supports D1 D2
[70835.462783] pci 0000:01:00.0: reg 0x174: [mem 0x00000000-0x00003fff 64bit]
[70835.462785] pci 0000:01:00.0: VF(n) BAR0 space: [mem 0x00000000-0x0003ffff 64bit] (contains BAR0 for 16 VFs)
[70835.462794] pci 0000:01:00.0: reg 0x17c: [mem 0x00000000-0x0003ffff 64bit]
[70835.462796] pci 0000:01:00.0: VF(n) BAR2 space: [mem 0x00000000-0x003fffff 64bit] (contains BAR2 for 16 VFs)
[70835.466172] pci 0000:01:00.0: BAR 6: assigned [mem 0xfea00000-0xfea7ffff pref]
[70835.466176] pci 0000:01:00.0: BAR 3: assigned [mem 0xfea80000-0xfeabffff 64bit]
[70835.466183] pci 0000:01:00.0: BAR 9: no space for [mem size 0x00400000 64bit]
[70835.466186] pci 0000:01:00.0: BAR 9: failed to assign [mem size 0x00400000 64bit]
[70835.466188] pci 0000:01:00.0: BAR 1: assigned [mem 0xfeac0000-0xfeac3fff 64bit]
[70835.466195] pci 0000:01:00.0: BAR 7: no space for [mem size 0x00040000 64bit]
[70835.466196] pci 0000:01:00.0: BAR 7: failed to assign [mem size 0x00040000 64bit]
[70835.466198] pci 0000:01:00.0: BAR 0: assigned [io  0xe000-0xe0ff]
[70835.466206] pci 0000:00:14.4: PCI bridge to [bus 05]
[70835.467843] mpt3sas 0000:01:00.0: PCI->APIC IRQ transform: INT A -> IRQ 18
[70835.467856] mpt2sas_cm1: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (15611460 kB)
[70835.538850] mpt2sas_cm1: CurrentHostPageSize is 0: Setting default host page size to 4k
[70835.538859] mpt2sas_cm1: MSI-X vectors supported: 1, no of cores: 2, max_msix_vectors: -1
[70835.538984] mpt2sas1-msix0: PCI-MSI-X enabled: IRQ 32
[70835.538986] mpt2sas_cm1: iomem(0x00000000feac0000), mapped(0x0000000031efeff8), size(16384)
[70835.538987] mpt2sas_cm1: ioport(0x000000000000e000), size(256)
[70835.610821] mpt2sas_cm1: CurrentHostPageSize is 0: Setting default host page size to 4k
[70835.657696] mpt2sas_cm1: Allocated physical memory: size(7579 kB)
[70835.657699] mpt2sas_cm1: Current Controller Queue Depth(3364),Max Controller Queue Depth(3432)
[70835.657700] mpt2sas_cm1: Scatter Gather Elements per IO(128)
[70835.715962] mpt2sas_cm1: overriding NVDATA EEDPTagMode setting
[70835.716373] mpt2sas_cm1: LSISAS2008: FWVersion(20.00.06.00), ChipRevision(0x03), BiosVersion(00.00.00.00)
[70835.716374] mpt2sas_cm1: Protocol=(
[70835.716374] Initiator
[70835.716375] ,Target
[70835.716375] ), 
[70835.716375] Capabilities=(
[70835.716376] TLR
[70835.716376] ,EEDP
[70835.716377] ,Snapshot Buffer
[70835.716377] ,Diag Trace Buffer
[70835.716378] ,Task Set Full
[70835.716378] ,NCQ
[70835.716378] )
[70835.716435] scsi host1: Fusion MPT SAS Host
[70835.718462] mpt2sas_cm1: sending port enable !!
[70838.344387] mpt2sas_cm1: host_add: handle(0x0001), sas_addr(0x5000000080000000), phys(8)
[70843.467585] mpt2sas_cm1: port enable: SUCCESS
[71056.017112] mpt2sas_cm1: sending message unit reset !!
[71056.019941] mpt2sas_cm1: message unit reset: SUCCESS
[73355.742263] pci 0000:01:00.0: [1000:0072] type 00 class 0x010700
[73355.742298] pci 0000:01:00.0: reg 0x10: [io  0xe000-0xe0ff]
[73355.742308] pci 0000:01:00.0: reg 0x14: [mem 0xfeac0000-0xfeac3fff 64bit]
[73355.742317] pci 0000:01:00.0: reg 0x1c: [mem 0xfea80000-0xfeabffff 64bit]
[73355.742328] pci 0000:01:00.0: reg 0x30: [mem 0xfea00000-0xfea7ffff pref]
[73355.742375] pci 0000:01:00.0: supports D1 D2
[73355.742394] pci 0000:01:00.0: reg 0x174: [mem 0x00000000-0x00003fff 64bit]
[73355.742396] pci 0000:01:00.0: VF(n) BAR0 space: [mem 0x00000000-0x0003ffff 64bit] (contains BAR0 for 16 VFs)
[73355.742404] pci 0000:01:00.0: reg 0x17c: [mem 0x00000000-0x0003ffff 64bit]
[73355.742406] pci 0000:01:00.0: VF(n) BAR2 space: [mem 0x00000000-0x003fffff 64bit] (contains BAR2 for 16 VFs)
[73355.746581] pci 0000:01:00.0: BAR 6: assigned [mem 0xfea00000-0xfea7ffff pref]
[73355.746584] pci 0000:01:00.0: BAR 3: assigned [mem 0xfea80000-0xfeabffff 64bit]
[73355.746590] pci 0000:01:00.0: BAR 9: no space for [mem size 0x00400000 64bit]
[73355.746592] pci 0000:01:00.0: BAR 9: failed to assign [mem size 0x00400000 64bit]
[73355.746593] pci 0000:01:00.0: BAR 1: assigned [mem 0xfeac0000-0xfeac3fff 64bit]
[73355.746599] pci 0000:01:00.0: BAR 7: no space for [mem size 0x00040000 64bit]
[73355.746600] pci 0000:01:00.0: BAR 7: failed to assign [mem size 0x00040000 64bit]
[73355.746602] pci 0000:01:00.0: BAR 0: assigned [io  0xe000-0xe0ff]
[73355.746607] pci 0000:00:14.4: PCI bridge to [bus 05]
[73355.746760] mpt3sas 0000:01:00.0: PCI->APIC IRQ transform: INT A -> IRQ 18
[73355.746770] mpt2sas_cm2: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (15611460 kB)
[73355.818320] mpt2sas_cm2: CurrentHostPageSize is 0: Setting default host page size to 4k
[73355.818329] mpt2sas_cm2: MSI-X vectors supported: 1, no of cores: 2, max_msix_vectors: -1
[73355.818492] mpt2sas2-msix0: PCI-MSI-X enabled: IRQ 32
[73355.818494] mpt2sas_cm2: iomem(0x00000000feac0000), mapped(0x000000003116e5d9), size(16384)
[73355.818495] mpt2sas_cm2: ioport(0x000000000000e000), size(256)
[73355.890288] mpt2sas_cm2: CurrentHostPageSize is 0: Setting default host page size to 4k
[73355.937089] mpt2sas_cm2: Allocated physical memory: size(7579 kB)
[73355.937091] mpt2sas_cm2: Current Controller Queue Depth(3364),Max Controller Queue Depth(3432)
[73355.937092] mpt2sas_cm2: Scatter Gather Elements per IO(128)
[73355.995386] mpt2sas_cm2: overriding NVDATA EEDPTagMode setting
[73355.995797] mpt2sas_cm2: LSISAS2008: FWVersion(20.00.06.00), ChipRevision(0x03), BiosVersion(00.00.00.00)
[73355.995797] mpt2sas_cm2: Protocol=(
[73355.995798] Initiator
[73355.995798] ,Target
[73355.995799] ), 
[73355.995799] Capabilities=(
[73355.995800] TLR
[73355.995800] ,EEDP
[73355.995801] ,Snapshot Buffer
[73355.995801] ,Diag Trace Buffer
[73355.995801] ,Task Set Full
[73355.995802] ,NCQ
[73355.995802] )
[73355.995858] scsi host1: Fusion MPT SAS Host
[73355.996079] mpt2sas_cm2: sending port enable !!
[73358.585119] mpt2sas_cm2: host_add: handle(0x0001), sas_addr(0x5000000080000000), phys(8)
[73363.711069] mpt2sas_cm2: port enable: SUCCESS

marcan commented 5 years ago

Please use ``` to bracket your code, otherwise the lines get squashed together.

It seems the card is fine but lsiutil does not like it for some reason. Where did you get the SBR that you're using? Have you tried using one of the SBRs I provided?

You can run strace -f ./lsiutil -e 2>strace.log and attach strace.log to try to figure out what's going on.

Gartral commented 5 years ago

yes, I used the 9211 example SBR (adding in my own SAS WWID)

here's the log
strace.log

marcan commented 5 years ago

The 9211 example SBR looks quite different from the one you posted:

 HwConfig = 0x0107
 SubsysVID = 0x1000
-SubsysPID = 0x3020
+SubsysPID = 0x30f0
 Unk18 = 0x00000000
 Unk1c = 0x00000000
 Unk20 = 0x00000000
@@ -19,7 +19,8 @@
 Unk3c = 0x00000000
 Interface = 0x00
 Unk41 = 0x0c
-Unk42 = 0x005d
-Unk44 = 0x145a305c
-Unk48 = 0x0575
+Unk42 = 0x0000
+Unk44 = 0x00000000
+Unk48 = 0x0000
 Unk4a = 0x10
+SASAddr = 0x5002e10000000000

Unk42,44,48 are different (you have them zeroed out), and the SubsysPID is different.

Either way, it seems your version of lsiutil is different. The one I have (and linked to) is 1.72. I think 1.62 does not support the mpt3sas driver. Make sure you grab and compile the lsiutil that I linked to, not any other.
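A quick way to catch this kind of mismatch is to read the version from the lsiutil banner before going further. This is only a sketch: the function names are hypothetical, the banner format is taken from the output quoted in this thread, and 1.72 as the minimum is an assumption based on the comment above rather than a documented requirement.

```sh
# Extract the version number from an lsiutil banner read from stdin, e.g.
# "LSI Logic MPT Configuration Utility, Version 1.62, January 14, 2009".
lsiutil_version() {
    sed -n 's/.*MPT Configuration Utility, Version \([0-9][0-9.]*\),.*/\1/p' | head -n 1
}

# Succeed if version $1 is at least version $2 (major.minor, numeric compare).
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -t. -k1,1n -k2,2n | head -n 1)
    [ "$lowest" = "$2" ]
}
```

For example, `version_at_least "$(./lsiutil -e | lsiutil_version)" 1.72 || echo "lsiutil looks too old"` would have flagged the 1.62 build here.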

Gartral commented 5 years ago

ok, wow... I have no idea how I ended up with the older lsiutil, NOW it works >.> so, what, in your opinion, because somehow I messed up my SBR, would be the CORRECT base SBR to flash?

marcan commented 5 years ago

If the SBR you have now works then you might as well keep it, but if it doesn't or you have any issues you can try just using sbr_sas9211-8i_itir.cfg (build it with sbrtool).

Gartral commented 5 years ago

ok, I'll keep my current one, as it doesn't seem to be causing issues.. I'll continue the flash and if I run into any non-SBR issues I'll continue this thread... thank you so so much for your patience with me XD I realize I can be a bit thick headed.. I try not to be.. but the character fault slips through.

last Q though... do you have a paypal linked to your email address that's on github?

marcan commented 5 years ago

I don't, sorry. I try to avoid PayPal as much as I can.

ezonakiusagi commented 5 years ago

@marcan @Gartral btw, the "HwConfig" in the SBR config file is the "PCI Class ID", and according to the database here:

https://pci-ids.ucw.cz/read/PD/01

It is 0x0104 for "RAID bus controller" and 0x0107 for "Serial Attached SCSI controller".

What I've noticed is that if the firmware doesn't match the PCI Class ID in the SBR, sometimes during firmware bootup it seems the firmware will correct this part of the SBR (maybe it copies it from a backup SBR?). I've literally flashed an SBR with 0x0107, read it back out to confirm, and then at some point it reverts back to 0x0104. So then I have to flash the same SBR again to force it back to 0x0107. Not all firmwares do this, but I've seen this happen several times now. IT mode firmware will not work correctly if the PCI Class ID is not set correctly.
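To see which class code the kernel currently reports for the card, something like the following could be used. It is a sketch with hypothetical function names. It assumes the sysfs `class` attribute is formatted as 0xCCSSPP (class, subclass, programming interface), so dropping the last two hex digits gives the 16-bit value compared against HwConfig above. The value reflects what the kernel read when it enumerated the device, so a rescan may be needed after changing the SBR.

```sh
# Drop the programming-interface byte from a sysfs class value,
# e.g. 0x010700 -> 0x0107.
class16() {
    printf '%s\n' "${1%??}"
}

# Map the two class codes discussed above to their names.
pci_class_name() {
    case "$1" in
        0x0104) echo "RAID bus controller" ;;
        0x0107) echo "Serial Attached SCSI controller" ;;
        *)      echo "unknown class $1"; return 1 ;;
    esac
}

# Print the 16-bit class code of a device given its full PCI address.
pci_class_of() {
    class=$(cat "/sys/bus/pci/devices/$1/class") || return 1
    class16 "$class"
}
```

Running `pci_class_name "$(pci_class_of 0000:01:00.0)"` before and after a firmware boot would show whether the class has flipped back from 0x0107 to 0x0104.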

marcan commented 5 years ago

Hm, this doesn't explain why my Fujitsu card needs 0x04 instead of 0x07 for half the ports to work. I should run some more tests on that card.

Anyway, I think there's no issue here so I'm closing this thread. Let's not turn this into a general discussion.

anch2150 commented 5 years ago

Hi marcan,

Thank you so much for this great tool! I wanted to report back that I successfully converted a Fujitsu D2607 card to IT mode following your instructions. I previously tried megarec.exe, sas2flsh.exe and various combinations of SBR/firmware to no avail. lsirec/lsiutil worked like a charm.

My environment:

I did run into an error in the Final section.

# ./lsirec 0000:01:00.0 reset
# ./lsirec 0000:01:00.0 info

The IOC appeared to be "faulty". After

# ./lsirec 0000:01:00.0 rescan

it's operational again though.

In syslog I found

Mar 23 20:45:27 ubuntu-server kernel: mpt3sas 0000:07:00.0: can't disable ASPM; OS doesn't have ASPM control
Mar 23 20:45:27 ubuntu-server kernel: mpt2sas_cm2: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (12248700 kB)
Mar 23 20:45:27 ubuntu-server kernel: mpt2sas_cm2: fault_state(0x0704)!
Mar 23 20:45:27 ubuntu-server kernel: mpt2sas_cm2: sending diag reset !!