corna / me_cleaner

Tool for partial deblobbing of Intel ME/TXE firmware images
GNU General Public License v3.0

SuperMicro A2SDi doesn't work #189

Open lasalvavida opened 6 years ago

lasalvavida commented 6 years ago

https://www.supermicro.com/products/motherboard/atom/A2SDi-2C-HLN4F.cfm

SPS version: 4.0.4.139

Have tried soft disable, code removal, and the combination of the two.

Soft disable doesn't appear to work at all here. The current state is still reported as Operational.

Code removal causes the board to enter a state where power is off and stays off even if the button is pressed, or a power command is issued over IPMI.

Have also tried --keep-modules, and using --whitelist to attempt removing only single partitions. These exhibit various behaviors, some where the board stays off as described above, some where the board comes up but never POSTs, staying at code 0xff, and some that go through a few codes and then get stuck at 0xad. None of these boot paths successfully initialize VGA.

I don't believe that this board has Intel Boot Guard, but the inability to remove anything from the ME section makes me think that either something about how me_cleaner modifies the image is failing some kind of validation, or that SuperMicro has made some kind of configuration/code change in SEC or PEI that requires ME to be present and functional.

@corna, any ideas?

corna commented 6 years ago

Great, finally I have the chance to work on SPS 4.x ;)

Note that --whitelist only adds some modules to the whitelist, if you want to remove only a partition you have to use --blacklist.

I'm going to analyze the SPS image (which, luckily, is available on the supermicro website), I'll keep you updated.

lasalvavida commented 6 years ago

> Note that --whitelist only adds some modules to the whitelist, if you want to remove only a partition you have to use --blacklist.

I was whitelisting all removed partitions and then removing them from the whitelist individually. Same idea I guess.

> I'm going to analyze the SPS image (which, luckily, is available on the supermicro website), I'll keep you updated.

Awesome!

lasalvavida commented 6 years ago

From the Positive Technologies blog post on HAP disable:

> We also checked the firmware of server and mobile versions of ME (SPS 4.x and TXE 3.x). In the server version, this flag is always set to 1; in the mobile version, it is ignored. This means that this method will not work in server and mobile versions (Apollo Lake) of ME.

So it would seem that soft disable not working is expected.

lasalvavida commented 6 years ago

Update from playing with some of the undocumented bits in the soft strap at 0x14 for disabling various components on the C3000 chipset:

image

Also saw this IE Disable bit:

image

And decided to try bit 13 at 0x78 to match the spacing with the ME SMBus Management soft strap:

image

Board booted normally, but no change of the ME status.

lasalvavida commented 6 years ago

Looking at this Power Management Controller register:

screenshot from 2018-04-11 18 01 51

It seems to mirror the ordering of the listed soft strap bits, and lines up with the observed behavior of the undocumented bits (i.e. Bit 13 disables USB2). So if that was going to work, it would have been bit 15, which had no effect. I did try it again just to be sure.
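The bit experiments above amount to flipping a single bit in a 32-bit word of the flash image. A minimal Python sketch of that operation follows. The helper name `set_strap_bit` is hypothetical, and the sketch assumes the word is stored little-endian and that the offset passed in is already an absolute file offset (the comments above do not say what 0x14 and 0x78 are relative to).

```python
import struct


def set_strap_bit(image: bytearray, offset: int, bit: int, value: bool) -> int:
    """Set or clear one bit of the little-endian 32-bit word at `offset`.

    Returns the new word. `offset` is assumed to be an absolute offset
    into `image`; translating a strap address to a file offset is not shown.
    """
    if not 0 <= bit < 32:
        raise ValueError("bit must be in 0..31")
    (word,) = struct.unpack_from("<I", image, offset)
    if value:
        word |= 1 << bit
    else:
        word &= ~(1 << bit) & 0xFFFFFFFF
    struct.pack_into("<I", image, offset, word)
    return word
```

Working on a `bytearray` copy of the dump keeps the original file untouched until the result is written out and flashed deliberately.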

jjurkus commented 6 years ago

Oh great, so these newer C3000 motherboards, which seem so great for pfSense, aren't all that great.

I have the A1SRi-2558F, which does not have Intel ME. Perhaps it also lacks that "innovation engine", which sounds so incredibly helpful.

lasalvavida commented 6 years ago

A few updates:

There is a second $FPT Partition at 0x1000 identical to the one at 0x10. It doesn't seem to be used, maybe this is for recovery mode? Either way, we probably want to trim this as well. I am already doing this locally just to be sure that it wasn't related to the following observed behavior.

Previously, I wasn't able to boot an image even with all partitions whitelisted. I have been able to boot an image now after I disabled the EFFS related changes and the auto-removing of empty modules.

One very odd thing came up that I'm not really sure what to make of yet. The partitions in this image are as follows: FTPR, FTUP, DLMP, PSVN, IVBP, MFS, ROMB, FPTB, MFSB, IVB1, IVB2, BIS, FLOG, UTOK, and OPR1.

PSVN, IVBP, and ROMB are empty. I discovered that you can boot with any one or two of them removed, but not all three.

Then, I tried removing FLOG and UTOK which are not empty, and the board booted, but removing any third module causes the board not to boot.

It would seem that I can't remove more than two modules, but I'm not really sure why yet. @corna, any insight you have would be appreciated.
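One way to locate the two $FPT copies mentioned above is to scan the start of the ME region for the marker. This is only a sketch: the function name and the `limit` parameter are hypothetical, and it assumes `image` holds the ME region alone rather than the full flash dump.

```python
def find_fpt_offsets(image: bytes, limit: int = 0x4000) -> list:
    """Return every offset below `limit` where the b"$FPT" marker occurs."""
    offsets = []
    pos = image.find(b"$FPT", 0, limit)
    while pos != -1:
        offsets.append(pos)
        pos = image.find(b"$FPT", pos + 1, limit)
    return offsets
```

Comparing the bytes that follow each hit would show whether the second table really is an exact copy of the first.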

jjurkus commented 6 years ago

I've read somewhere SPS firmware images have a main and recovery image together, so if you find two $FPT partitions that would make sense.

I think I read it on the win-raid forum. Check this topic for a start: https://www.win-raid.com/t596f39-Intel-Management-Engine-Drivers-Firmware-amp-System-Tools.html

jjurkus commented 6 years ago

And of course you have also found ME Analyzer?

Other thing: I've looked in the manual. Have you tried to set the ME to manufacturing mode? Set JPME2 to bridge pins 2-3. (1-2 are bridged by default) See #195

lasalvavida commented 6 years ago

> Other thing: I've looked in the manual. Have you tried to set the ME to manufacturing mode? Set JPME2 to bridge pins 2-3. (1-2 are bridged by default)

I think this jumper is actually mislabeled. The HECI firmware status always has bit 4 (manufacturing mode) set to 1 in both jumper positions. The jumper appears to toggle the ME between operational and recovery modes.
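For reference, checking that bit from a raw status value could look like the sketch below. Bit 4 as manufacturing mode comes from the observation above; treating the low four bits as the current working state is an assumption, and the function name is hypothetical. How the register value is read from the HECI device is not shown.

```python
def decode_fw_status(value: int) -> dict:
    """Split a raw 32-bit firmware status value into two fields."""
    return {
        # Assumed: bits 3:0 hold the current working state.
        "working_state": value & 0xF,
        # From the comment above: bit 4 is the manufacturing mode flag.
        "manufacturing_mode": bool(value & (1 << 4)),
    }
```

Reading the register in both jumper positions and comparing the decoded fields would make the operational/recovery toggle visible while the manufacturing bit stays set.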

corna commented 6 years ago

I'm quite busy these days, but I haven't forgotten about this issue, don't worry. ;)

I should have some spare time this weekend, I'll work on it.

lasalvavida commented 6 years ago

:bell: Ding, dong, the witch is dead! (I think) :bell:

I was able to work around the limit of removing no more than two modules: instead of removing a module's table entry entirely, I set its offset and length to zero.

You must leave three partitions in place: MFS, or the no-power behavior that I described earlier occurs; FPTB, which points to the recovery $FPT (you need to make the same changes there as well, or the ME will drop to recovery and continue functioning); and BIS, or the board will not POST.

Board reports firmware version: 0.0.0.0 and recovery mode, which I understand is usually a good sign that this worked. The firmware heartbeat bits of HECI1_GS_SHDW1 are no longer incrementing.

Booted ArchLinux, board has been up for 30 minutes, so no issues with the watchdog timer either.

Happy to contribute my code for nulling out table entries instead of removing them if you think it's useful.
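The nulling approach described above can be sketched roughly as follows. This is not the code referred to in the comment, only an illustration under stated assumptions: the function name is hypothetical, and the table layout (a 32-bit entry count right after the $FPT marker, 32-byte entries starting 0x20 bytes after the marker, offset and length at bytes 8 to 15 of each entry) is assumed rather than confirmed for SPS 4.x.

```python
import struct

FPT_MAGIC = b"$FPT"
ENTRIES_OFFSET = 0x20  # assumed: first entry starts 0x20 bytes after the marker
ENTRY_SIZE = 0x20      # assumed: each entry is 32 bytes


def null_fpt_entries(image: bytearray, fpt_offset: int, keep: set) -> list:
    """Zero the offset and length of every entry whose name is not in `keep`.

    The entries themselves and the entry count in the header stay in place.
    Returns the names of the entries that were nulled.
    """
    if bytes(image[fpt_offset:fpt_offset + 4]) != FPT_MAGIC:
        raise ValueError("no $FPT marker at 0x%x" % fpt_offset)
    # Assumed: 32-bit little-endian entry count right after the marker.
    (count,) = struct.unpack_from("<I", image, fpt_offset + 4)
    nulled = []
    for i in range(count):
        entry = fpt_offset + ENTRIES_OFFSET + i * ENTRY_SIZE
        raw_name = bytes(image[entry:entry + 4]).rstrip(b"\x00")
        name = raw_name.decode("ascii", "replace")
        if name in keep:
            continue
        # Assumed: offset and length are two 32-bit words at entry + 8.
        struct.pack_into("<II", image, entry + 8, 0, 0)
        nulled.append(name)
    return nulled
```

For example, `null_fpt_entries(image, 0x10, {"MFS", "FPTB", "BIS"})` would keep the three partitions named above. Per the comment, the same change would also have to be applied to the recovery $FPT.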

corna commented 6 years ago

Good job!

I've looked into the SPS firmware, here you can find the raw content; as you can see the interesting partitions are FTPR, FTUP and OPR1 (plus MFS).

So my hypothesis is (no way to verify it, so I may be completely wrong):

This scheme allows good redundancy; however, the FPT is still a single point of failure, so they added a second one (FPTB) at a fixed address (0x2000, so that you have one FPT at 0x0-0x1000 and another one at 0x1000-0x2000).

Note that, at least in ME 8, there was probably a backup FPT in the ROM (as we were able to completely wipe it without any effect). According to Youness, the ROM size has been reduced in Skylake, so it makes sense that they've moved the backup FPT out of it to save some space.

The ROMB (ROM Bypass) partition is used only in pre-production images, so it makes sense that it's empty.


Now, let's move to the part "what can we do?". Which partitions have you removed? Which ones are still there?

skochinsky commented 6 years ago

some additional comments:

OPR is a reference to Operational Region (term used in some Intel docs on ME). IIRC older SPS firmwares had OPR1 and OPR2 regions. The SPS firmware is distributed to OEMs as two binaries: spsRecovery.bin and spsOperational.bin. spsRecovery.bin contains an FPT while spsOperational.bin starts directly with a $CPD header without $FPT.

UTOK seems to be "Unlock Token", used to enable debug/diagnostic functionality on production firmware (see Inside Intel Management Engine for more info)
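The difference between the two binaries described above can be checked from the first bytes of each file. A small sketch, with a hypothetical function name; looking for $FPT at offset 0x10 as well as at offset 0 is an assumption based on the table location reported earlier in this thread.

```python
def classify_sps_binary(data: bytes) -> str:
    """Guess the kind of SPS binary from its leading marker."""
    if data[0:4] == b"$CPD":
        return "cpd"   # starts directly with a $CPD header
    if data[0:4] == b"$FPT" or data[0x10:0x14] == b"$FPT":
        return "fpt"   # contains a partition table at the start
    return "unknown"
```

Under this reading, spsOperational.bin would be expected to return "cpd" and spsRecovery.bin "fpt".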

lasalvavida commented 6 years ago

Just wanted to say that I didn't forget about this; things have just been a little crazy on my end. I will try to make time to put together a pull request sometime this week.

edit: Unfortunately, this got away from me a little bit and I just haven't had the time. I will do my best to put something together before the end of July.

edit 2: No longer actively working on this, but hopefully my comments here help someone in the future.

felixsinger commented 5 years ago

@lasalvavida I am interested in this mainboard and I would like to know if BootGuard is enabled. If not, I will buy one myself and port coreboot to it.

Could you please check this? Would be very appreciated :)

Just do the following steps:

  1. git clone https://review.coreboot.org/coreboot
  2. cd coreboot/util/intelmetool && make
  3. Enable the msr kernel module, which is needed for reading the specific registers: sudo modprobe msr
  4. sudo ./intelmetool -b

Also, please attach a dump of lspci -nnk.

lasalvavida commented 5 years ago

Hi @felixsinger. Unfortunately, I no longer have access to this mainboard.

I can tell you that it does not have Boot Guard: the CPU is an Intel® Atom™ Processor C3338, which does not have it.