Closed balena closed 7 years ago
By default openssl smime
expects a MIME message. I would try adding -inform der
and using the raw data from asn1crypto's dump() method. That should rule out any issue with the DER data being wrapped.
Also, you shouldn't need to add the tag
or tag_type
kwargs since these are known structures, although I am not at my computer right now to test.
Dear Will,
Actually the tag
and and tag_type
parameters were added as an attempt to overcome the OpenSSL parsing issue. If I remove them, the OpenSSL error message is still the same.
And not, -inform
does not make any sense for the command openssl smime
as the file tmp
indicated in the command line has to be in MIME-1.0 format, as follows:
To: "Alice" <alice@foo.com>
From: "Bob" <bob@bar.com>
Subject: A message from python
MIME-Version: 1.0
Content-Type: application/pkcs7-mime; smime-type=enveloped-data; name=smime.p7m
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename=smime.p7m
MII...base64-encoded-CMS-ContentInfo...
The man page for openssl mime
says you can pass a raw, DER-encoded PKCS#7 structure using openssl -inform der -in tmp
. This would at least rule out any encoding issues adding the problem, although my guess is that is unlikely.
It is certainly possible there is an error in one of the CMS structure definitions. I've used some via oscrypto because .p12
files use PKCS#7 to encrypt key data and certificates, but I'm sure there are some variations that haven't been hit before.
What may be helpful in debugging is to pass a known good file into asn1crypto and use the .debug()
function to dump the complete structure and all of the encoding details. Comparing this to what is being constructed in Python would probably be one of the fastest paths to figuring it out.
Actually, looking at your traceback more closely and making an educated guess:
Error reading S/MIME message
140702704453280:error:0D0680A8:asn1 encoding routines:ASN1_CHECK_TLEN:wrong tag:tasn_dec.c:1338:
140702704453280:error:0D06C03A:asn1 encoding routines:ASN1_D2I_EX_PRIMITIVE:nested asn1 error:tasn_dec.c:852:
140702704453280:error:0D08303A:asn1 encoding routines:ASN1_TEMPLATE_NOEXP_D2I:nested asn1 error:tasn_dec.c:772:Field=type, Type=PKCS7
140702704453280:error:0D0D106E:asn1 encoding routines:B64_READ_ASN1:decode error:asn_mime.c:193:
140702704453280:error:0D0D40CB:asn1 encoding routines:SMIME_read_ASN1:asn1 parse error:asn_mime.c:528:
I think it is having trouble decoding the type
field of a PKCS7
structure. That is likely the content_type
field of cms.ContentInfo()
. This would be the first field of the first structure, which makes me think something is funky in terms of one of the encoding steps after DER.
The only other comment is that once you figure out what is going on here, you can omit many of the explicit class constructors in constructing the cms.ContentInfo()
object. If you pass nested dict
/list
objects with appropriate keys, it will figure it out from there since it already has all of the expected field types encoded.
Dear William,
Thanks for your suggestions, and you were correct: the problem was indeed in the ContentInfo
field with wrong name ("contentType"
instead of "content_type"
).
Some improvements would make implementors' life easier:
core.py:424
, the debug
function uses %s
. When debugging CMS output, it often raises an exception when trying to print binary values contained in OctetString
s (such as in EncryptedContentInfo
, field encrypted_content
). I suggest you to use %r
instead.Choice
implementation does not permit a "pythonistic" construction (dict
/list
). As fair as I have understood, it requires always explicit class constructor with name
and value
parameters. As suggestion, you may check if the name is a dict, if so, it should have a single attribute, which the key corresponds to the name
parameter and the value corresponds to the value
parameter.pyasn1
, or when reading directly from ASN.1 specifications.Choice
ASN.1 construct is strange because it allows any number of different values to be represented as long as they have distinct tags, so the parsing side can disambiguate. To construct one "pythonically", you would use a dict with name
and value
keys: Choice({'name': 'the_choice_name', 'value': {'field': 'val'}})
.**kwargs
, so I will work on improving that.Traceback (most recent call last):
File "asn1cmp.py", line 58, in <module>
d1.debug()
File "/usr/local/lib/python2.7/dist-packages/asn1crypto/core.py", line 3289, in debug
child.debug(nest_level + 3)
File "/usr/local/lib/python2.7/dist-packages/asn1crypto/core.py", line 3289, in debug
child.debug(nest_level + 3)
File "/usr/local/lib/python2.7/dist-packages/asn1crypto/core.py", line 3747, in debug
child.debug(nest_level + 1)
File "/usr/local/lib/python2.7/dist-packages/asn1crypto/core.py", line 422, in debug
self.chosen.debug(nest_level + 2)
File "/usr/local/lib/python2.7/dist-packages/asn1crypto/core.py", line 3289, in debug
child.debug(nest_level + 3)
File "/usr/local/lib/python2.7/dist-packages/asn1crypto/core.py", line 424, in debug
print('%s Native: %s' % (prefix, self.native))
UnicodeDecodeError: 'ascii' codec can't decode byte 0x94 in position 0: ordinal not in range(128)
The above happens for an EncryptedContentInfo
, with "encrypted_key"
containing 0x94264288c68bb67820ff82572a4198d168083a04a7cee486d8327dfd8de129aa0f74713f1fbc85b2d0b7d8b761c33071aa7e6a23a4caa8cc6a54f4dd5502d1704e1aff87608dc70599bf685cf1a2af98397016b39c6519ead9c1339cf63e806252b8a97be7ced479490f13063973cad9b857bce0f931ab9c654d401370e3b130
(this is not unicode neither ascii; this is an encrypted content).
return Container({"something": WhateverChoice(name="foo", value="bar")})
the same would accept:
return Container({"something": {"foo": "bar"}})
Hi,
I'm having a similar issue. OctetString
doesn't like to be .debug()
'd. Here's a simplified example.
def test_octect_string():
import asn1crypto.core
from binascii import unhexlify
der_octetstring = unhexlify("0402CCE9")
a = asn1crypto.core.OctetString.load(der_octetstring)
print("!")
print(a.dump())
#print(a.debug())
b = asn1crypto.core.OctetString(unhexlify("CCE9"))
assert b.dump() == a.dump()
#b.debug()
This code works fine as-is, but if you uncomment any of the .debug
calls then it explodes with:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xcc in position 2: ordinal not in range(128)
Now, I'm not sure if I'm using OctentString here correctly, but unfortunately I'm working with an ASN.1 schema and data source that I don't control, and it's using it like this:
KeyBlock ::= Sequence
{
KeyBlock version = 1 (INTEGER)
Keys ::= Set
{
KeyInfo ::= Sequence
{
TR31Key = (PRINTABLESTRING)
keyType = (INTEGER)
ksn = (OCTET STRING)
keySlot = (INTEGER)
keyName = (PRINTABLESTRING)
keyCheck = (OCTET STRING)
}
}
}
with 0xCCE9
for keyCheck
, which causes problems on debug()
.
(Though this guide says OctectStrings are any 8bit data values, not necessarily ascii-printable ones)
My guess is that you are using Python 2 and your stdout is ASCII or something of the like. A full backtrace would help in confirming (I'm away from my computer for the week).
Yes, unfortunately this project is stuck using python2 :(
I didn't post the backtrace before as it's not much use, and the faulting line is the same as @balena found.
Using the code from my last post (minus the function definition) results in:
$ python2 temp.py
!
��
asn1crypto.core.OctetString Object #139984550301200
Header: 0x0402
primitive universal tag 4
Data: 0xcce9
Traceback (most recent call last):
File "temp.py", line 9, in <module>
print(a.debug())
File "/usr/local/lib/python2.7/dist-packages/asn1crypto/core.py", line 424, in debug
print('%s Native: %s' % (prefix, self.native))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xcc in position 0: ordinal not in range(128)
(ps: I've also now seen this issue when using .debug()
on a certificate file originating in the same place as this OctetString )
As of afd3cac4b33d56aaf8ddcce2c333b60828350c3d, Python 2 should no longer raise an exception when debugging a core.OctetString
.
As of 2260ee300972ae3e0cd7d31710d21432466773e0, core.Sequence
will raise a ValueError
when an invalid field is specified.
@balena I implemented your suggestion of being able to pass a dict
to a core.Choice
constructor, but it felt like trying to pound a square peg through a round hole. I ended up stashing it in case I want to revisit later. What about being able to pass a 2-element tuple of (name, value)
?
@wbond, regarding the dict
as a core.Choice
, you may have felt bad with it, but it is actually the most natural way...
Let's focus on an example:
# Current construction:
'recipient_infos': [
cms.RecipientInfo(
name='ktri',
value={
'version': u'v0',
'rid': cms.RecipientIdentifier(
name='issuer_and_serial_number',
value=issuer
),
# ...
}
)
]
I see a core.Choice
as a special form of core.Sequence
, where a single "attribute" exists (the attribute name defines the type of the value). Maybe the possible confusion is that when creating a dict
for a core.Choice
the developer may end up specifying more than one attribute, which in such a case you may raise an exception.
# Using dict:
'recipient_infos': [
{ 'ktri': {
'version': u'v0',
'rid': { 'issuer_and_serial_number': issuer },
# ...
}
]
The code below is not as fluent to me as the above, but it is certainly better than the original. Similarly, the developer may pass a tuple
with more than 2 elements, so it may be a good idea to prevent such situations raising an exception:
# Using tuple:
'recipient_infos': [
( 'ktri', {
'version': u'v0',
'rid': ( 'issuer_and_serial_number', issuer ),
# ...
)
]
Whatever way you choose will be much better than now.
I do understand what you mean about the dict seeming more fluent, but it feels wrong from a data perspective to use a data structure designed to contain multiple keys when only exactly one key can ever be used. I think this is why I proposed using a tuple since it is the natural data type for a fixed-length sequence.
Certainly both forms can be error checked.
Also, the current implementation can be used in the following way:
'recipient_infos': [
cms.RecipientInfo('ktri', {
'version': u'v0',
'rid': cms.RecipientIdentifier('issuer_and_serial_number', issuer),
# ...
})
]
The tuple version allows you to just omit the cms.RecipientInfo
syntax.
... omit the cms.RecipientInfo
and cms.RecipientIdentifier
syntax.
Any of dict
or tuple
will certainly work pretty fine. When making such language mappings, there's never a single way, maybe not a better way...
All I still think is that dict
is more natural to type than tuple
in this case (a programmer will almost certainly try dict
first, get an exception and then use a tuple
).
@balena I ended up implementing support for both a single-element dict
and a two-element tuple
in 9d6ca0c8b7651b69797fa9c1a564f0999f519fa1.
Looks great! Thanks @wbond
I am trying to create a CMS Enveloped Data using asn1crypto as a substitution of the Google's certificate-transparency source code I forked on my project https://github.com/balena/python-smime. I had the idea of using asn1crypto after having some trouble when parsing indefinite-length attributes generated by certain E-Mail managers (like Thunderbird).
While asn1crypto is great for parsing all kinds of CMS data, I am getting some trouble when generating them. I have an unit test where the PKCS7 output is passed as input to command-line OpenSSL in order to make sure the implementation is compatible. But all I get is an error from OpenSSL when parsing ASN.1 tags (I don't know exactly which one).
There is a single function responsible for returning a CMS ContentInfo structure as follows:
The object returned by the above function is encoded using:
And the __encode_in_base64 is just a pretty printer function to convert the DER output into BASE64.
When I execute the OpenSSL function:
All I get is the following:
Do you have any idea what could be the problem? Is there any caveat when encoding asn1crypto structures like the above?