marcan / lsirec

LSI SAS2008/SAS2108 low-level recovery tool for Linux
BSD 2-Clause "Simplified" License

reading SBR from 2208 shows the data to be offset by a few bytes #1

Open ezonakiusagi opened 5 years ago

ezonakiusagi commented 5 years ago

I know this lsirec tool was not intended to be used with the LSI SAS2208 chipset, but since there are lots of similarities between the 2008/2108 and the 2208/2308, I tried using lsirec to read the SBR off a 2208 card. The tool complained that there were 2 different copies and chose to use the 1st one. When I used your 'sbrtool parse' command to read the SBR, all the segments looked wrong. But when I looked closer, I found that everything was offset by a few bytes, e.g., I found the 0x0107 for the PCI device type, but it was not where it should be located.

I haven't read your code to see how you are reading the SBR, but do you know what might cause this?
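One way to pin down a shift like this is to search the raw dump for a known value and compare where it lands with where it is expected. The following is only a minimal Python sketch: the byte order of 0x0107 inside the SBR is an assumption (both orders are searched), the helper name `find_u16` is hypothetical, and nothing here reflects how lsirec or sbrtool actually read or parse the SBR.

```python
def find_u16(data: bytes, value: int) -> dict:
    """Return every offset where a 16-bit value occurs, for both byte orders."""
    hits = {}
    for order in ("little", "big"):
        needle = value.to_bytes(2, order)
        offsets = []
        pos = data.find(needle)
        while pos != -1:
            offsets.append(pos)
            pos = data.find(needle, pos + 1)
        hits[order] = offsets
    return hits


# Hypothetical usage with a dump read from the card:
#   with open("sbr_2208.bin", "rb") as f:   # file name is an assumption
#       print(find_u16(f.read(), 0x0107))
```

Comparing the reported offsets against the offsets seen in a known-good 2008/2108 dump would show how many bytes the data has moved.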

galaris commented 4 years ago

It uses a slightly different sbr, which my full guide has. Just follow the instructions: https://fohdeesha.com/docs/perc/

I followed your guide and was able to flash a H710p Mini in my R720xd without any issues, thank you so much!

bcraft1901 commented 4 years ago

Hi @Fohdeesha, I have followed your fantastic guide on flashing the Dell H710 D1 Mini to IT mode, for which I thank you. Whilst the flashing was very straightforward and I'm able to install and boot CentOS 8, once loaded the console shows many "bad sector" errors. The SSD is brand new and I have put it in another machine to verify it is working correctly, which it is; however, I cannot get away from this fault when it is installed in my R620. Are you able to suggest any way I can debug this, or is it a known issue?

Thanks, Ben.

bcraft1901 commented 4 years ago

Hi - just in addition to the above I have attached a screenshot taken today.

image

Fohdeesha commented 4 years ago

@bcraft1901 That's definitely not a known issue and totally unrelated to the flash process. If the SSD does not throw these errors on a different *nix system, it could be that the mini mono card is not fully seated (try reseating it), or the backplane cables could be poorly seated or faulty.

UnixRonin commented 2 years ago

I have one question for you which I hope is straightforward. I'm looking at a Dell R720 which I plan to reflash to IT/JBOD to use with ZFS. It is a 16-drive chassis. I know that it uses the H710P controller, though I don't yet know what revision.

I see from its specs that the H710P supports up to 32 drives. Does it STILL support 32 drives after flashing the IT firmware?

Just constructive paranoia here. I'd hate to buy a 16-bay server and only be able to use 8 of them.

kim-bjoern commented 2 years ago

I’m quite certain you’ll be fine. One of my servers is an R720 SFF with an IT-flashed H710 mini. I haven’t had both expanders filled with disks, but 12 SSDs are running just fine.

Sent from the palm of my hand!

/Kim Bjoern

devinirv commented 2 years ago

The Dell R720 fully supports 16 SSDs and HDDs through the built-in mini card in IT mode. There should be a quick flashing guide posted here graciously by Fohdeesha, who has worked hard to find an easy 1-2-3 click solution rather than the harder way we had been flashing these cards before. I have successfully flashed over 16 regular Dell PERC 710s and various other Dell cards. The onboard daughter cards in the Dell R720 and 730 are no different. I am currently running a Dell R720xd with the onboard PERC, the dual 10Gb onboard NIC, 16 LFF drives, a GTX 1070, a second LSI megarec card, another 4Gb NIC, and the 2.5 inch SSD slot in the back with some Micron SLC NAND flash SSDs. These systems are great for home servers and cheap; they are, however, starting to get dated. One thing to keep in mind is that the acoustic level depends on the CPU TDP.

UnixRonin commented 2 years ago

Thank you both! Yes, I came here FROM Fohdeesha's quick click solution.

So that would be an H710P mini then?

Background: My SunFire X4540 failed to come back after a power outage (despite being cleanly shut down while still on UPS power), an attempt to replace it with a current-model QNAP NAS was one of the more horribly disappointing experiences I can remember, and I'm looking at frobbing an R720 into a proper replacement. It has fewer drive bays than the X4540 of course, but also lacks the X4540's 2TB drive limit, so I should be able to have a lot more storage for a lot less power draw. I'm looking at the SFF model though.

devinirv commented 2 years ago

It's a personal choice to go SFF, but I came from having a Cisco UCS C240 SFF and it was hard to source large-capacity 2.5 inch disks if not going all SSD. The largest I managed to acquire was a 4TB Seagate, and those are eco models from portable enclosures. It was just too costly, and economically and logistically not feasible for a NAS enclosure. At least with the LFF 16-bay you can also put in 2.5 inch drives, as the Dell caddies support mounting 2.5 inch SSDs and HDDs. The backplane also supports disks larger than 2TB, and so does the PERC 710 mono. The Dell RAID card in the R720 should be a mini monolithic card. Best of luck.

Fohdeesha commented 2 years ago

Guys, this is not the place to discuss crossflashing or dells, this is Marcan's lsirec repo, and a completely unrelated github issue. Everyone is just spamming his inbox by using this to discuss my mostly unrelated guide. For guide discussion choose any one of the following:

https://forums.servethehome.com/index.php?threads/guide-flashing-h310-h710-h810-mini-full-size-to-it-mode.27459/
https://www.truenas.com/community/threads/guide-flashing-h310-h710-mini-to-it-mode.82196/
https://github.com/Fohdeesha/lab-docu/issues

walterav1984 commented 7 months ago

Small success with careless cross flashing the Fujitsu D3116C1 1GB from "iMR" to "IT-mode":

$ lspci #output edited
#original
27:00.0 0107: 1000:005b (rev 05)
    Subsystem: 1734:11e4
27:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS 2208 [Thunderbolt] (rev 05)
    Subsystem: Fujitsu Technology Solutions MegaRAID SAS 2208 [Thunderbolt]

#post crossflash sbr
27:00.0 0107: 1000:0087 (rev 05)
    Subsystem: 1028:1f34
27:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 (rev 05)
    Subsystem: Dell SAS2308 PCI-Express Fusion-MPT SAS-2

#post firmware it-mode flash sas2hax.efi
27:00.0 0107: 1000:0087 (rev 05)
    Subsystem: 1000:3020    
27:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 (rev 05)
    Subsystem: Broadcom / LSI 9207-8i SAS2.1 HBA

As with the other 2208 cards mentioned earlier, it was hard to find a MegaRec/MegaCLI tool compatible with the 512-byte SBR and 16MB flash. Also, the original and modified sas2flash utilities and lsiutil didn't recognize the card before the SBR was modified!

A YouTube video from AGTech suggested the right MegaRec.exe from HP, which detects the card and can read, back up and write the SBR and erase the whole 16MB flash, and MegaCLI.exe from Broadcom for backing up the card info as text.

md5sum MegaRec.exe
5588ac9f07f67bb8a2c61a2995230714

md5sum MegaCLI.exe
8c1d85401dbc27ba64605e01c626b5d9

#backing up original Fuji D3116C1 sbr/spd/info from FreeDOS
megarec.exe -readsbr 0 fd3116sbr.bin
megarec.exe -readspd 0 fd3116spd.bin
megacli.exe -adpallinfo -a0 > fd3116cli.txt

After backing up the SBR, SPD and CLI info of the original Fujitsu ROMs with both MegaRec/MegaCLI and lsirec, I realized that I simply didn't know how to back up the full 8/16 MB ROM without a functioning sas2flash, something I had done successfully before with other original LSI cards...

After modifying lsirec.c and changing [256] into [512] at lines 667 and 694, lsirec started to read, back up and write the same data/checksums from within Linux as the original MS-DOS MegaRec.exe tool did under FreeDOS.
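A quick way to check that two tools produced the same dump is to compare digests of the files. This is a minimal Python sketch using the standard `hashlib` module; the helper names `md5_of` and `same_dump` are hypothetical, and the file names in the usage comment are simply the backup names used elsewhere in this post.

```python
import hashlib


def md5_of(path: str) -> str:
    """Return the hex MD5 digest of a file, as `md5sum` would print it."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large flash dumps do not need to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_dump(path_a: str, path_b: str) -> bool:
    """True when both dump files have identical contents."""
    return md5_of(path_a) == md5_of(path_b)


# Hypothetical usage, comparing the MegaRec dump with the lsirec dump:
#   same_dump("fd3116sbr.bin", "sbrfuji_backup512.bin")
```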

So I dared to flash the SBR with lsirec, using a Dell 710PD1md.sbr from @Fohdeesha (the Dell card shares a lot of similarities with the Fujitsu) and following marcan's lsirec instructions for hostboot mode.

#Ubuntu Linux 23.10
# echo 16 > /proc/sys/vm/nr_hugepages #as root user!

$ sudo ./lsirec 0000:27:00.0 info
Device in MPT mode
Registers:
 DOORBELL:       0x00000000
 DIAG:           0x000001b2
 DCR_I2C_SELECT: 0xc003ff0a
 DCR_SBR_SELECT: 0xb5000009
 CHIP_I2C_PINS:  0x00000003
IOC is RESET

$ sudo ./lsirec 0000:27:00.0 unbind
Trying unlock in MPT mode...
Device in MPT mode
Kernel driver unbound from device

$ sudo ./lsirec 0000:27:00.0 halt
Device in MPT mode
Resetting adapter in HCB mode...
Trying unlock in MPT mode...
Device in MPT mode
IOC is RESET 

$ sudo ./lsirec 0000:27:00.0 readsbr sbrfuji_backup512.bin

$ md5sum  710PD1md.sbr
44e32ff0415e48c05598bae78f77c32d

$ sudo ./lsirec 0000:27:00.0 writesbr 710PD1md.sbr
success

$ sudo ./lsirec 0000:27:00.0 hostboot 9207-8.bin
Device in MPT mode
Resetting adapter in HCB mode...
Trying unlock in MPT mode...
Device in MPT mode
IOC is RESET 
Setting up HCB...
HCDW virtual: 0x7f7075e00000
HCDW physical: 0x161400000
Loading firmware...
Loaded 809436 bytes
Booting IOC...
IOC is READY 
IOC Host Boot successful.

$ lspci -vns 0000:27:00.0 -A linux-sysfs | head -n 2
27:00.0 0104: 1000:005b (rev 05)
    Subsystem: 1734:11e4

#nothing shows...
$ lspci -vns 0000:27:00.0 -A intel-conf1 | head -n 2
$ lspci -vns 0000:27:00.0 -H1 | head -n 2

$ sudo lspci -vns 0000:27:00.0
27:00.0 0104: 1000:005b (rev 05)
    Subsystem: 1734:11e4
    Physical Slot: 7
    Flags: bus master, fast devsel, latency 0, IRQ 174, NUMA node 1
    I/O ports at 8000 [size=256]
    Memory at fbef0000 (64-bit, non-prefetchable) [size=16K]
    Memory at fbe80000 (64-bit, non-prefetchable) [size=256K]
    Expansion ROM at fbe00000 [virtual] [disabled] [size=128K]
    Capabilities: [50] Power Management version 3
    Capabilities: [68] Express Endpoint, MSI 00
    Capabilities: [d0] Vital Product Data
    Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
    Capabilities: [c0] MSI-X: Enable- Count=16 Masked-
    Capabilities: [100] Advanced Error Reporting
    Capabilities: [1e0] Secondary PCI Express
    Capabilities: [1c0] Power Budgeting <?>
    Capabilities: [190] Dynamic Power Allocation <?>
    Capabilities: [148] Alternative Routing-ID Interpretation (ARI)
    Kernel modules: megaraid_sas

#not sure anymore if I did a modprobe -r megaraid_sas before
$ sudo ./lsirec 0000:27:00.0 rescan
Device in MPT mode
Removing PCI device...
Rescanning PCI bus...
PCI bus rescan complete.

$ sudo lspci -vns 0000:27:00.0
27:00.0 0107: 1000:0087 (rev 05)
    Subsystem: 1028:1f34
    Physical Slot: 7
    Flags: fast devsel, IRQ 174, NUMA node 1
    I/O ports at 8000 [size=256]
    Expansion ROM at fbe00000 [virtual] [disabled] [size=1M]
    Capabilities: [50] Power Management version 3
    Capabilities: [68] Express Endpoint, MSI 00
    Capabilities: [d0] Vital Product Data
    Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
    Capabilities: [c0] MSI-X: Enable- Count=16 Masked-
    Capabilities: [100] Advanced Error Reporting
    Capabilities: [1e0] Secondary PCI Express
    Capabilities: [1c0] Power Budgeting <?>
    Capabilities: [190] Dynamic Power Allocation <?>
    Capabilities: [148] Alternative Routing-ID Interpretation (ARI)
    Kernel modules: mpt3sas

After noticing this success, I rebooted the machine and started to repeat the same (cross)flash steps in FreeDOS with MegaRec.exe and the Dell modified sbr and wiping the whole 16MB flash.

#FreeDOS
megarec.exe -writesbr 0 710PD1md.sbr
megarec.exe -cleanflash 0

Then I used a UEFI computer with a modified sas2hax.efi, since the original MS-DOS sas2flash failed with Mfg / PAL errors, then flashed the stock "9207-8.bin" and altered the SAS address.

#efi shell
fs0:
sas2hax.efi -o -f 9207-8.bin

#md5sum of original lsi firmware
532ece37507a51feb8e78ff94ddfdc83  9207-8.bin

After this real flash, original sas2flash and lsiutil programs start to recognize the card ootb and can dump/write parts of the firmware so I can verify the flash was successful by comparing the md5sum of the it-mode firmware file.

The card "somehow" functions now in IT-mode with mpt3sas instead of megaraid_sas and detects disks on multiple ports but not all connectors (not sure if it did before with original fuji firmware). If I read marcan's original instructions there is more to this and probably a real checksum matched from the original fujitsu sbr could make a difference, but also (re)writing the SBR again after flashing to it-mode firmware could improve port detection... You can see that after flashing the it-mode firmware the SBR changes again like the device-id from Dell changed to LSI 9207-8i, not sure if that was intended.

~@Fohdeesha could you hint at how you changed the Dell SBR with correct checksumming? I'd rather reflash and verify with a modified Fujitsu SBR than a Dell one.~ I understand now how to calculate the checksum. For the 512-byte EEPROM, the checksum value (stored twice, once per copy of the data) satisfies "a + b = c", where "a" is the sum of bytes 0-222, "b" is the checksum byte at offset 223, and "c = 2a5b" (all values in hex). Using mate-calculator in scientific mode, the "mod" button gives "2a5b mod 100 = 5b", as opposed to the 256-byte version, where the sum is "a5b" as discovered by marcan. Computing "c - b = a" gives the current sum of bytes 0-222, which will change because the "device-id" byte and the "imr/it-ir-mode" byte are modified. For me, 2a5b - a2 = 29b9 is the old "a". Subtracting the old device-id "5b", adding the new device-id "87", subtracting the old mode value "01" and adding the new value "00" gives the new "a" (29e4), and "c" minus the new "a" gives the new checksum "b" = 77 in my case.
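The arithmetic above can be reproduced in a few lines of Python. This is a minimal sketch under the assumptions stated in this post only: bytes 0-222 are summed, the checksum byte sits at offset 223, and the total must end in 0x5b. The names `sbr_checksum` and `checksum_ok` are hypothetical and not part of lsirec or sbrtool.

```python
SBR_TARGET = 0x5B   # low byte of the total (2a5b mod 100 in hex), per the post
DATA_LEN = 223      # bytes 0-222 are summed
CSUM_OFFSET = 223   # the checksum byte follows the summed data


def sbr_checksum(copy: bytes) -> int:
    """Checksum byte that makes sum(bytes 0-222) + checksum end in 0x5b."""
    return (SBR_TARGET - sum(copy[:DATA_LEN])) & 0xFF


def checksum_ok(copy: bytes) -> bool:
    """True when the stored checksum byte matches the computed one."""
    return copy[CSUM_OFFSET] == sbr_checksum(copy)


# Worked example with the values from the post:
old_sum = 0x2A5B - 0xA2                        # old "a" = 0x29b9
new_sum = old_sum - 0x5B + 0x87 - 0x01 + 0x00  # swap device-id and mode bytes
new_csum = (0x2A5B - new_sum) & 0xFF           # new "b"
print(hex(new_sum), hex(new_csum))             # prints: 0x29e4 0x77
```

Working modulo 0x100 means the helper does not need to know the full total "c", only its low byte.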

Made a minimally modified version of the original fujid3116sbr with 3 bytes changed (6 bytes in total, because the data is repeated and each edit is made twice):

- changed the device ID at offset 20 from "5b" to "87"
- changed offset 96 from "01" to "00" (probably iMR to IT/IR mode)
- changed the checksum byte at offset 223 from "a2" to "77" to fix the checksum

This file is called "mod3116sbr.bin"
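Before writing a hand-edited SBR, it can help to confirm that only the intended bytes differ from the backup. A minimal Python sketch follows; the helper name `changed_bytes` is hypothetical, and the file names in the usage comment are the ones mentioned in this post.

```python
def changed_bytes(original: bytes, modified: bytes) -> list:
    """List (offset, old, new) for every byte that differs between two dumps."""
    if len(original) != len(modified):
        raise ValueError("dumps differ in length")
    return [
        (offset, old, new)
        for offset, (old, new) in enumerate(zip(original, modified))
        if old != new
    ]


# Hypothetical usage:
#   with open("fd3116sbr.bin", "rb") as f:
#       before = f.read()
#   with open("mod3116sbr.bin", "rb") as f:
#       after = f.read()
#   for offset, old, new in changed_bytes(before, after):
#       print(f"offset {offset}: {old:02x} -> {new:02x}")
```

For the modification described here, the list should contain six entries: the three edits, each appearing once per copy of the data.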

After these changes, the card still behaved the same as with the Dell 710 firmware, but I narrowed the port detection issue down to 1 dead port (SAS-MCL1, HDD connector 3 at the top of the card; the SAS-MCL2 connector at the bottom near the motherboard is fine) and 1 SATA 1.0 100GB HDD that doesn't work at all on any of my other original LSI cards.

In the spirit of "rinse and repeat", and from what I can still remember and reconstruct, I finally dared to introduce a self-constructed 512-byte "demptysbr.bin" as a first step before cross-flashing, inspired by most online cross-flash tutorials. You can create "demptysbr.bin" by hex editing a 512-byte file so that bytes 0-447 are all 00 and bytes 448-511 are all FF. After "cleanflash 0" and a reboot, this resulted in a RAID card with device ID "1000:0089", which can only be recovered with "MegaRec.exe" by writing a working SBR again, since lsirec, lsiutil or any sas2xxx tool won't recognize the card anymore after that...
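Instead of a hex editor, the same 512-byte layout can be generated with a short script. This is a minimal Python sketch of the layout described above (00 for bytes 0-447, FF for bytes 448-511); the function names are hypothetical. Keep in mind that writing such a file leaves the card recognizable only by MegaRec.exe until a working SBR is written back.

```python
def build_empty_sbr() -> bytes:
    """512 bytes: 0x00 for offsets 0-447, 0xFF for offsets 448-511."""
    return bytes(448) + b"\xff" * 64


def write_empty_sbr(path: str) -> None:
    """Write the empty SBR image to the given file."""
    with open(path, "wb") as f:
        f.write(build_empty_sbr())


# Hypothetical usage:
#   write_empty_sbr("demptysbr.bin")
```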

#FreeDOS
megarec.exe -writesbr 0 demptysbr.bin
megarec.exe -cleanflash 0

#reboot to FreeDOS 
megarec.exe -writesbr 0 mod3116sbr.bin

#reboot again to efi shell
fs0:
sas2hax.efi -o -f 9207-8.bin

#reboot to FreeDOS again it may fix the port detection issue because of not yet understood reasons...
megarec.exe -writesbr 0 mod3116sbr.bin

My guess is that this second-hand Fujitsu D3116C1 card was shipped with a single dead HDD connector, rather than the crossflash procedure having caused it (I could not verify this before flashing), and using the SATA 1.0 HDD as a test disk gave me the wrong idea that more ports were not working... If this dead port is used, it can kernel panic the mpt3sas driver or halt the system from booting! As on a normal LSI card, the LED blinks green, SMART status works, and normal IO has been tested, though no heavy benchmarking yet! Also, removing the 1GB extension module on the card has no effect; I am currently running without it!

walterav1984 commented 6 months ago

For now 7 of the 8 ports on the Fujitsu D3116C1 card seem to work. I updated my previous post; my guess is that the crossflash was successful but the Fujitsu card has 1 broken connector, and I was using a too-old SATA 1.0 HDD to verify the remaining ports!