linuxboot / heads

A minimal Linux that runs as a coreboot or LinuxBoot ROM payload to provide a secure, flexible boot environment for laptops, workstations and servers.
https://osresearch.net/
GNU General Public License v2.0

Inconsistent TOTP/HOTP unseal on T420 (affects other xx20 boards owners?) #1004

Closed akfhasodh closed 2 years ago

akfhasodh commented 3 years ago

Heads had been working flawlessly for a few days, up until I decided to update dom0 in Qubes. I rebooted after the update, at which point everything was fine; I re-verified the checksums and booted up. On the second boot after the update, the error screen for a missing TOTP/HOTP code popped up. I generated a new one, but now I'm wondering if I was compromised. Any help would be appreciated.

akfhasodh commented 3 years ago

FYI, my internal clock does not seem to have been misbehaving.

tlaurion commented 3 years ago

@akfhasodh: A QubesOS dom0 upgrade affects /boot components in most cases, although QubesOS may also update only binaries and libraries that are not linked to /boot.

As far as Heads is concerned, only /boot components (Xen, kernel, initrd and the grub file) are considered during detached signed digest validation. That is, Heads generates sha256sum digests of the /boot content on the fly and verifies that they match what was detached-signed with the GPG functions of a USB security dongle, which produced the /boot/kexec.sig file. If the generated digest doesn't match what was signed under /boot/kexec.sig, Heads shows the differences against the digest copy under /boot/kexec_hashes.txt for files that have been deleted or modified, but not for files that were added. Heads is interested in protecting against grub config file changes (options being different at boot time) and in preserving a known good configuration.
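To illustrate the comparison described above, here is a minimal Python sketch. It is not the Heads implementation (Heads uses shell scripts and also verifies the GPG signature in /boot/kexec.sig, which is left out here). The function names `hash_tree` and `compare_to_signed` are hypothetical, and the signed digest is assumed to be a simple mapping from relative path to SHA-256 hex digest.

```python
import hashlib
from pathlib import Path


def hash_tree(root: Path) -> dict[str, str]:
    """Map each regular file under root (relative path) to its SHA-256 hex digest."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_to_signed(signed: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Report files from the signed digest that are now modified or missing.

    Files that exist only in `current` (newly added) are deliberately not
    reported, mirroring the behaviour described in the comment above.
    """
    modified = [f for f, h in signed.items() if f in current and current[f] != h]
    deleted = [f for f in signed if f not in current]
    return {"modified": sorted(modified), "deleted": sorted(deleted)}
```

In this model, a dom0 update that replaces the kernel or initrd would show up under "modified" until the digest is regenerated and signed again.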

Now, to address your question about the missing TOTP/HOTP: those are two different things.

From the words you used in the OP, you seem to be talking about the HOTP warning telling you to plug in the USB dongle. You should never have to regenerate HOTP/TOTP, which you seem to have done, and which flushes the past measurements.

If no firmware changes have occurred, you should get the same TPMTOTP secret generated inside the QR code. If you still have the old entry plus this new one (OTP apps differ here; some will refuse to add an entry with the same name as an existing one), both should generate the same TOTP code for the same timestamp.
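The point that the same secret yields the same code at the same timestamp can be checked with a short Python sketch of standard TOTP (RFC 6238). This assumes the common defaults of HMAC-SHA1, a 30-second step and 6 digits; the thread does not state which parameters Heads uses, and the secret below is a made-up placeholder.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """RFC 6238 TOTP: HMAC-SHA1 over the time-step counter, then dynamic truncation."""
    counter = unix_time // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # low nibble of the last byte selects 4 bytes
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


# Two OTP app entries holding the same (placeholder) secret agree for the
# same timestamp; an entry created from a different secret would not.
old_entry = b"placeholder-secret"
new_entry = b"placeholder-secret"
same_code = totp(old_entry, 1_626_880_000) == totp(new_entry, 1_626_880_000)
```

If the old and new entries disagree at the same moment, the secrets differ, which suggests a new secret was generated rather than the previous one being unsealed.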

Basically, please clarify what you mean by a "missing TOTP/HOTP code thingy popped up". Tampering with the firmware would be impossible AFAIK from dom0, unless iomem=relaxed is passed in your grub.conf. You can see the default boot options that are signed in the /boot/kexec_default*.txt files; those options are verified against the signed digest as explained above.

In short:

Reflashing the same maximized firmware while asking Heads to keep settings (public key and config overlay changes, i.e. /etc/config.user if any) should result in the same TOTP/HOTP measurements, which should not even raise "unable to unseal TOTP secret". Let me know if you need more clarification.

akfhasodh commented 3 years ago

Clarifications: Neither the HOTP nor the TOTP secret could be unsealed. Even after generating new secrets, it seems to be unable to unseal those as well. /boot is working, I believe. Also, the unseal errors only began happening after I updated the Qubes boot checksums.

tlaurion commented 3 years ago

@akfhasodh I understand from the other post that you are using the maximized HOTP build for the t420.

I would

Document errors happening along the way if any.

akfhasodh commented 3 years ago

@tlaurion I've carried out those steps and it's still happening. But I've found that the TOTP/HOTP secrets don't just go away: after rebooting a few times I can usually get a boot screen in which everything goes correctly. So it seems to be some kind of consistency issue. Maybe the TPM is glitching?

tlaurion commented 3 years ago

But I've found that the TOTP/HOTP secrets don't just go away

@akfhasodh can you elaborate on this? Secrets are generated from measurements here (and are consistent). That secret is used to generate TOTP (with time being the variable here) and to do a handshake with HOTP to validate it through "remote attestation".
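For readers unfamiliar with the HOTP side, here is a toy Python model of a counter-based check (RFC 4226). It assumes the verifier shares the secret and keeps its own counter; `DongleModel`, its look-ahead `window` and the way the counter advances are hypothetical illustrations, not the actual dongle protocol or the Heads code.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226 HOTP: HMAC-SHA1 over an 8-byte big-endian counter, then truncation."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


class DongleModel:
    """Hypothetical stand-in for a verifier that shares the secret and a counter."""

    def __init__(self, secret: bytes, counter: int = 0, window: int = 3) -> None:
        self.secret = secret
        self.counter = counter
        self.window = window  # how many counter values ahead are accepted

    def verify(self, code: str) -> bool:
        """Accept a code for any counter in the window, then move past it."""
        for c in range(self.counter, self.counter + self.window):
            if hmac.compare_digest(hotp(self.secret, c), code):
                self.counter = c + 1  # a replayed code no longer matches
                return True
        return False
```

In this model the firmware side can only produce a valid code if it recovered the same secret, which is why a failed unseal and a failed HOTP check tend to appear together.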

Can you give a screenshot or the exact output next time it happens so we have a trace of the behavior? The TPM rate-limits access, so maybe the glitch experienced here is somehow related to too many failed interactions, which lock the TPM for a period of time. I have experienced that a couple of times before at default boot with the TPM Disk Unlock Key passphrase.

Should probably be documented.

akfhasodh commented 3 years ago

@tlaurion IMG_20210721_152817 IMG_20210721_152839

akfhasodh commented 3 years ago

IMG_20210721_152911

tlaurion commented 3 years ago

@akfhasodh and you say that state is transient and resolves itself after a couple of reboots?

akfhasodh commented 3 years ago

@tlaurion no, it used to be like that but now the ratio of good boots to bad boots is overwhelmingly bad.

tlaurion commented 3 years ago

@tlaurion IMG_20210721_152817

This happens here: https://github.com/osresearch/heads/blob/055165d61a3c312e7e03998ea832447452a86d71/initrd/bin/gui-init#L195-L209

Which means totp-unseal was unable to unseal the secret, consistent with the error shown. But that doesn't explain why.

IMG_20210721_152839

This gives more information, with the actual reported error being: "Error PCR Mismatch from TPM_Unseal" (#780?).

This makes me wonder if something differs in the coreboot 4.8.1 code between the x230 and the t420 (and x220). We might as well invest energy together in testing coreboot 4.13 on the t420 instead of this old, patched coreboot 4.8.1. If MRC or any other measured part varies, then the measurements stored in the TPM will also differ, resulting in the error encountered here.
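Why a single varying component is enough to cause a PCR mismatch can be shown with a short Python sketch of the TPM 1.2 extend operation, where the new PCR value is SHA1(old PCR || SHA1(data)). The component names and the single-PCR boot order are made-up placeholders; which PCRs Heads and coreboot actually extend, and with what, is not modeled here.

```python
import hashlib

PCR_SIZE = 20  # a TPM 1.2 PCR holds one SHA-1 digest


def extend(pcr: bytes, data: bytes) -> bytes:
    """Return the new PCR value: SHA1(old_pcr || SHA1(data))."""
    return hashlib.sha1(pcr + hashlib.sha1(data).digest()).digest()


def measure_chain(components: list[bytes]) -> bytes:
    """Extend a zeroed PCR with each component in boot order."""
    pcr = bytes(PCR_SIZE)
    for blob in components:
        pcr = extend(pcr, blob)
    return pcr


# Placeholder blobs standing in for measured boot stages.
sealed_against = measure_chain([b"bootblock", b"romstage", b"payload"])
this_boot = measure_chain([b"bootblock", b"romstage-variant", b"payload"])
unseal_would_succeed = sealed_against == this_boot  # False: one stage differed
```

Because each extend folds in the previous value, a difference in any earlier stage changes the final PCR, and a secret sealed against the original value can no longer be unsealed on that boot.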

@akfhasodh Would you be willing to do that instead of trying to revive what should soon be dead? We missed the x220 and t420. I could revive my old branch on top of master so you can test the t420-hotp-maximized board based on coreboot 4.13.

The long-term goal is to move all Lenovo boards to 4.13, which I have been testing on the x230 myself for a while now. But it doesn't make sense to move only certain boards to 4.13 and not others.

akfhasodh commented 3 years ago

@tlaurion I'm willing to do that.

tlaurion commented 3 years ago

@akfhasodh so the discussion should go to #998

tlaurion commented 3 years ago

It is possible that xx20 images have a problem with the specified CBFS region. A pull request that needs testing on all community-based platforms lives under #1015.

akfhasodh commented 3 years ago

@tlaurion In the past, I attempted soldering on this motherboard to add wires for easily reflashing the BIOS, and some of it is still left. Could this be the root of the problem?

tlaurion commented 3 years ago

@akfhasodh

In the past, I attempted soldering on this motherboard to add wires for easily reflashing the BIOS, and some of it is still left. Could this be the root of the problem?

It shouldn't, and it wouldn't explain why the payload gets different measurements across reboots... I would love to hear from other tagged xx20 board owners:

x220 (xx20): @techge @eganonoa @shamen123 @Thrilleratplay @BlackMaria
t420 (xx20): @alexmaloteaux @natterangell @akfhasodh

Do you guys see similar behavior when testing the #1015 ROMs? @akfhasodh is seeing inconsistent payload measurements with both the coreboot 4.8.1 and coreboot 4.13 boards, and my only hypothesis as of now is that the CONFIG_CBFS_SIZE defined for the t420, and maybe the x220, is off.

natterangell commented 3 years ago

I commented on https://github.com/osresearch/heads/pull/1015, but for clarity repeating that I'm not seeing this issue on the 4.13 t420-hotp-maximized rom. I've rebooted about a dozen times after flashing so far.

natterangell commented 3 years ago

Installed dom0 updates today and re-signed checksums. TOTP/HOTP still works as usual.

tlaurion commented 2 years ago

To be closed when #1015 is merged

tlaurion commented 2 years ago

#1015 merged