intel / ehsm

An End-to-End Distributed and Scalable Cloud KMS (Key Management System) built on top of Intel SGX enclave-based HSM (Hardware Security Module), aka eHSM.
https://community.intel.com/t5/Blogs/Tech-Innovation/open-intel/An-Intel-SGX-based-Hardware-Security-Module-backed-Key/post/1360130?wapkw=eHSM
BSD 3-Clause "New" or "Revised" License
167 stars 52 forks source link

Bug: padding is returned as part of plaintext in the SM4_CBC decryption #311

Closed truc0 closed 1 year ago

truc0 commented 1 year ago

Description

Padding of data is also returned as part of data in SM4_CBC decryption, which causes the inconsistence between plaintext and result of decryption.

Reproduction

The following script is a minimal reproduction. Save it to bug.py in the ehsm/test folder and run.

from cli import createkey, enroll, decrypt, encrypt, enroll
import base64
import _utils_

# !!! CHANGE ME IF NEEDED
BASE_URL = 'https://127.0.0.1:9002/ehsm?Action='

appid, apikey = '', ''
appid, apikey = enroll.enroll(BASE_URL)
_utils_.init_appid_apikey(appid, apikey)

keyid = createkey.createkey(BASE_URL, "EH_SM4_CBC", "EH_INTERNAL_KEY", "EH_KEYUSAGE_ENCRYPT_DECRYPT")
text = "symmetricKey"

aad = str(base64.b64encode("test".encode("utf-8")),'utf-8')
data = str(base64.b64encode(text.encode("utf-8")),'utf-8')

# encrypt
ciphertext = encrypt.encrypt(BASE_URL, keyid, data, aad)

# test Decrypt(ciphertext)
plaintext = decrypt.decrypt(BASE_URL, keyid, ciphertext, aad)

try:
    assert plaintext == text
    print("\033[32m", "PASS", "\033[0m", sep='')
except AssertionError:
    print("\033[31m", "FAIL", "\033[0m", sep='')
    print("decrypted:")
    print(plaintext.encode())
    print("origin:")
    print(text.encode())

The result should be:

<output of the cli>
FAIL
decrypted:
b'symmetricKey\x04\x04\x04\x04'
origin:
b'symmetricKey'

Note

Note that the padding only occurs when len(text) % 8 != 0.

truc0 commented 1 year ago

Link to PR #312

truc0 commented 1 year ago

This issue may be related to the padding process in encryption. Here are some of my thoughts, feel free to point out my faults.

The Problem

The problem maybe related to wrong padding strategy in encryption process.

The RFC5652 (aka PKCS#7) indicates that padding should always apply to the plaintext before encryption. Current implementation of SM4_CBC encryption disables padding when len(plaintext) % 16 == 0 (see ehsm/core/Enclave/openssl_operation.cpp#L293).

https://github.com/intel/ehsm/blob/926fd308016119293ebc1231f8f284f46b5a8844/core/Enclave/openssl_operation.cpp#L282-L294

Why always enable padding

Disabling padding mode when len(plaintext) % 16 results in ambiguity of the decrypted data (before removing padding) if the plaintext before encryption is as following:

00 01 02 03 04 05 06 07
08 09 0a 0b 04 04 04 04

The last 4 bytes is exact the same as the padding of the following plaintext:

00 01 02 03 04 05 06 07
08 09 0a 0b

To solve this problem, PKCS#7 enables padding even if len(plaintext) % k == 0 (k = 16 in this case). The plaintext with padding for the first example above should be:

00 01 02 03 04 05 06 07
08 09 0a 0b 04 04 04 04
10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10

The modification makes the decryption process clearer and guarantees the data is consistence after encryption and decryption.

Possible Solution

Always enable padding mode in encryption should solve this issue.