beanit / asn1bean

ASN1bean (formerly known as jASN1) is a Java ASN.1 BER and DER encoding/decoding library
https://www.beanit.com/asn1/
Apache License 2.0

Handling of EXPLICIT ANY DEFINED BY ... #24

Closed by mibollma 5 years ago

mibollma commented 5 years ago

When trying to parse Cryptographic Message Syntax (CMS, https://tools.ietf.org/html/rfc3852) data, I'm unable to parse the actual content at places where the ANY type is used. Instead, I only get the raw bytes.

A simplified example looks like this:

CMS DEFINITIONS IMPLICIT TAGS ::=
BEGIN
  ContentInfo ::= SEQUENCE {
        contentType ContentType,
        content [0] EXPLICIT ANY DEFINED BY contentType }

  ContentType ::= OBJECT IDENTIFIER

  id-signedData OBJECT IDENTIFIER ::= { iso(1) member-body(2)
         us(840) rsadsi(113549) pkcs(1) pkcs7(7) 2 }

  SignedData ::= SEQUENCE {
        version CMSVersion,
        ... }

  CMSVersion ::= INTEGER
        { v0(0), v1(1), v2(2), v3(3), v4(4), v5(5) }
END

When I instead replace ANY with the type I actually expect, as in

CMS DEFINITIONS IMPLICIT TAGS ::=
BEGIN
  ContentInfo ::= SEQUENCE {
        contentType ContentType,
        content [0] EXPLICIT SignedData }

  ContentType ::= OBJECT IDENTIFIER

  id-signedData OBJECT IDENTIFIER ::= { iso(1) member-body(2)
         us(840) rsadsi(113549) pkcs(1) pkcs7(7) 2 }

  SignedData ::= SEQUENCE {
        version CMSVersion,
        ... }

  CMSVersion ::= INTEGER
        { v0(0), v1(1), v2(2), v3(3), v4(4), v5(5) }
END

I receive an exception during decoding: java.io.IOException: Unexpected end of sequence, length tag: 20949315, actual sequence length: 3

sfeuerhahn commented 5 years ago

Maybe ANY is not always of type SignedData? You may have to get the raw data of ANY first and then decode that data based on the content type.

This is probably not a bug in jASN1.
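
The two-step approach (take the raw bytes behind ANY, then decode them according to contentType) can be sketched in plain Java as follows. This is only an illustration under stated assumptions: ContentInfoSplitter is a hypothetical helper, not part of jASN1/ASN1bean, and it handles only definite lengths and single-byte tags, so BER input with indefinite lengths is rejected.

```java
import java.io.IOException;
import java.util.Arrays;

// Hypothetical helper, not part of ASN1bean: splits an encoded ContentInfo into
// its contentType OID and the raw bytes wrapped by the [0] EXPLICIT tag.
// Assumption: definite lengths and single-byte tags only.
final class ContentInfoSplitter {
    static final String ID_SIGNED_DATA = "1.2.840.113549.1.7.2";

    final String contentType; // dotted OID taken from the contentType field
    final byte[] content;     // encoding of the value inside the [0] wrapper

    private ContentInfoSplitter(String contentType, byte[] content) {
        this.contentType = contentType;
        this.content = content;
    }

    static ContentInfoSplitter split(byte[] der) throws IOException {
        int[] pos = {0};
        expectTag(der, pos, 0x30);            // ContentInfo ::= SEQUENCE
        readLength(der, pos);
        expectTag(der, pos, 0x06);            // contentType OBJECT IDENTIFIER
        int oidLen = readLength(der, pos);
        if (oidLen > der.length - pos[0]) throw new IOException("truncated OID");
        String oid = decodeOid(der, pos[0], oidLen);
        pos[0] += oidLen;
        expectTag(der, pos, 0xA0);            // [0] EXPLICIT (context, constructed)
        int len = readLength(der, pos);
        if (len > der.length - pos[0]) throw new IOException("truncated content");
        return new ContentInfoSplitter(oid, Arrays.copyOfRange(der, pos[0], pos[0] + len));
    }

    private static int next(byte[] b, int[] pos) throws IOException {
        if (pos[0] >= b.length) throw new IOException("unexpected end of data");
        return b[pos[0]++] & 0xFF;
    }

    private static void expectTag(byte[] b, int[] pos, int tag) throws IOException {
        int actual = next(b, pos);
        if (actual != tag) {
            throw new IOException("expected tag " + tag + " but found " + actual);
        }
    }

    private static int readLength(byte[] b, int[] pos) throws IOException {
        int first = next(b, pos);
        if (first < 0x80) return first;       // short form
        int n = first & 0x7F;                 // long form: n length octets follow
        if (n == 0) throw new IOException("indefinite length not handled in this sketch");
        if (n > 4) throw new IOException("length field too large");
        int len = 0;
        for (int i = 0; i < n; i++) len = (len << 8) | next(b, pos);
        if (len < 0) throw new IOException("length field too large");
        return len;
    }

    private static String decodeOid(byte[] b, int off, int len) {
        StringBuilder sb = new StringBuilder();
        long value = 0;
        boolean first = true;
        for (int i = off; i < off + len; i++) {
            value = (value << 7) | (b[i] & 0x7F);
            if ((b[i] & 0x80) == 0) {         // last octet of this subidentifier
                if (first) {                  // first subidentifier packs two arcs
                    long arc1 = Math.min(value / 40, 2);
                    sb.append(arc1).append('.').append(value - 40 * arc1);
                    first = false;
                } else {
                    sb.append('.').append(value);
                }
                value = 0;
            }
        }
        return sb.toString();
    }
}
```

A caller would compare the contentType field of the result with ContentInfoSplitter.ID_SIGNED_DATA and, on a match, hand the content bytes to whatever decode entry point the generated SignedData class exposes. Note that this sketch still needs the whole ContentInfo in memory.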

mibollma commented 5 years ago

I guess I'm not experienced enough with ASN.1 to tell whether that's a bug or expected behaviour. You are right, the standard allows multiple types at this position; however, in my application I always expect the SignedData type within the files I receive.

Getting the raw data first sounds reasonable, however the raw data of the whole file is larger than the memory available on the target device. In any case I am not trying to read the whole file but to retrieve a few metadata elements buried within the ASN.1 file, things like AES initialization vectors, certificate properties and similar.

I assume that jASN1 does not have any way to stream the raw data and interpret it again in a second step, as you suggested, right?

sfeuerhahn commented 5 years ago

Right, it is not possible. When decoding a message, the whole message needs to fit into memory.

mibollma commented 5 years ago

My original idea was to place the extension marker (...) within the specification to leave out large parts of the file I'm not interested in and to avoid this memory issue, but no luck so far.
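
Outside the library, the idea of skipping uninteresting parts can be sketched by reading only the tag and length of each element from a stream and skipping the content without buffering it. The following is a minimal illustration, not a jASN1/ASN1bean feature: TlvSkipper and its methods are hypothetical names, and the sketch assumes single-byte tags and definite lengths, so it rejects BER data that uses indefinite lengths.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of ASN1bean: walks tag-length-value elements
// on a stream so that large values can be skipped instead of held in memory.
// Assumption: single-byte tags and definite lengths only.
final class TlvSkipper {

    /** Identifier octet and content length of one element. */
    static final class Header {
        final int tag;
        final long length;

        Header(int tag, long length) {
            this.tag = tag;
            this.length = length;
        }
    }

    static Header readHeader(InputStream in) throws IOException {
        int tag = readByte(in);
        if ((tag & 0x1F) == 0x1F) {
            throw new IOException("multi-byte tags not handled in this sketch");
        }
        int first = readByte(in);
        if (first < 0x80) return new Header(tag, first);   // short form
        int n = first & 0x7F;                              // long form
        if (n == 0) throw new IOException("indefinite length not handled in this sketch");
        if (n > 7) throw new IOException("length field too large");
        long len = 0;
        for (int i = 0; i < n; i++) len = (len << 8) | readByte(in);
        return new Header(tag, len);
    }

    /** Skips exactly n content bytes, or throws EOFException if the stream ends early. */
    static void skipFully(InputStream in, long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped <= 0) {
                // skip() may legitimately return 0; fall back to reading one byte.
                if (in.read() < 0) throw new EOFException("stream ended inside a value");
                skipped = 1;
            }
            n -= skipped;
        }
    }

    private static int readByte(InputStream in) throws IOException {
        int b = in.read();
        if (b < 0) throw new EOFException("stream ended inside a header");
        return b;
    }
}
```

To descend into a constructed element such as a SEQUENCE, a caller would read its header and then continue reading the headers of its children; to pass over an element, it would call skipFully with the length from the header. Small elements of interest could be read into a byte array and decoded separately.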