librasn / rasn

A Safe #[no_std] ASN.1 Codec Framework

OER encoder produces output that it can't decode #258

Open pcwizz opened 3 weeks ago

pcwizz commented 3 weeks ago

I've been doing a spot of testing of the new OER implementation and found some inputs that produce encoder output that the OER decoder refuses to decode. @XAMPPRocky asked that I report them here to be fixed.

Example

Input

[4, 4]

Error

DecodeError { kind: CodecSpecific { inner: Oer(InvalidTagVariantOnChoice { value: Tag { class: Universal, value: 0 }, is_extensible: false }) }, codec: Oer }

Fuzz target

#![no_main]

use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    let smi = match rasn::oer::decode::<rasn_smi::v2::ObjectSyntax>(data) {
        Ok(v) => v,
        Err(_) => return,
    };
    let d2 = rasn::oer::encode(&smi).unwrap(); // re-encode the value that was just decoded
    let smi2 = rasn::oer::decode::<rasn_smi::v2::ObjectSyntax>(&d2).unwrap(); // the round-trip decode that panics
    assert_eq!(smi, smi2);
});

Fuzzer output

thread '<unnamed>' panicked at fuzz_targets/cmp_smi_v2_oer.rs:11:69:
called `Result::unwrap()` on an `Err` value: DecodeError { kind: CodecSpecific { inner: Oer(InvalidTagVariantOnChoice { value: Tag { class: Universal, value: 0 }, is_extensible: false }) }, codec: Oer }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
==87904== ERROR: libFuzzer: deadly signal
    #0 0x106945470 in __sanitizer_print_stack_trace+0x28 (librustc-nightly_rt.asan.dylib:arm64+0x59470)
    #1 0x10567de4c in fuzzer::PrintStackTrace()+0x30 (cmp_smi_v2_oer:arm64+0x10078de4c)
    #2 0x105670de0 in fuzzer::Fuzzer::CrashCallback()+0x54 (cmp_smi_v2_oer:arm64+0x100780de0)
    #3 0x18053f580 in _sigtramp+0x34 (libsystem_platform.dylib:arm64+0x4580)
    #4 0xfd4900018050ec1c  (<unknown module>)
    #5 0x8c4100018041ba1c  (<unknown module>)
    #6 0x432c0001056e9ee8  (<unknown module>)
    #7 0x105736438 in std::process::abort::h34c97040caf7df38+0x8 (cmp_smi_v2_oer:arm64+0x100846438)
    #8 0x10566fd10 in libfuzzer_sys::initialize::_$u7b$$u7b$closure$u7d$$u7d$::h7a53761f12b67db4+0xb8 (cmp_smi_v2_oer:arm64+0x10077fd10)
    #9 0x1056e0ec0 in std::panicking::rust_panic_with_hook::hd4efef7c95419c65+0x5c4 (cmp_smi_v2_oer:arm64+0x1007f0ec0)
    #10 0x1056e08c4 in std::panicking::begin_panic_handler::_$u7b$$u7b$closure$u7d$$u7d$::hb607843a3e5e990f+0x94 (cmp_smi_v2_oer:arm64+0x1007f08c4)
    #11 0x1056de504 in std::sys_common::backtrace::__rust_end_short_backtrace::h0428fbb24c431116+0x8 (cmp_smi_v2_oer:arm64+0x1007ee504)
    #12 0x1056e0634 in rust_begin_unwind+0x30 (cmp_smi_v2_oer:arm64+0x1007f0634)
    #13 0x105737fbc in core::panicking::panic_fmt::hba5d86399e74ef8b+0x28 (cmp_smi_v2_oer:arm64+0x100847fbc)
    #14 0x1057383bc in core::result::unwrap_failed::ha01d8d24a25c4c95+0x58 (cmp_smi_v2_oer:arm64+0x1008483bc)
    #15 0x104f3b3b0 in cmp_smi_v2_oer::_::__libfuzzer_sys_run::heb26ac6463c6b55e cmp_smi_v2_oer.rs:11
    #16 0x104f3a634 in rust_fuzzer_test_input lib.rs:224
    #17 0x10566a69c in std::panicking::try::do_call::h26ba1b79d651ec21+0xc4 (cmp_smi_v2_oer:arm64+0x10077a69c)
    #18 0x10566ff8c in __rust_try+0x20 (cmp_smi_v2_oer:arm64+0x10077ff8c)
    #19 0x10566f3c8 in LLVMFuzzerTestOneInput+0x16c (cmp_smi_v2_oer:arm64+0x10077f3c8)
    #20 0x1056726a4 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long)+0x150 (cmp_smi_v2_oer:arm64+0x1007826a4)
    #21 0x105671d34 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*)+0x48 (cmp_smi_v2_oer:arm64+0x100781d34)
    #22 0x105673ef4 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::__1::vector<fuzzer::SizedFile, std::__1::allocator<fuzzer::SizedFile>>&)+0x5d8 (cmp_smi_v2_oer:arm64+0x100783ef4)
    #23 0x105674290 in fuzzer::Fuzzer::Loop(std::__1::vector<fuzzer::SizedFile, std::__1::allocator<fuzzer::SizedFile>>&)+0xd0 (cmp_smi_v2_oer:arm64+0x100784290)
    #24 0x105695730 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long))+0x1d74 (cmp_smi_v2_oer:arm64+0x1007a5730)
    #25 0x1056a2fa8 in main+0x24 (cmp_smi_v2_oer:arm64+0x1007b2fa8)
    #26 0x1801860dc  (<unknown module>)
    #27 0x2b547ffffffffffc  (<unknown module>)

Output of `std::fmt::Debug`:

[4, 4]

XAMPPRocky commented 3 weeks ago

Cc @Nicceboy

Nicceboy commented 2 weeks ago

This bug seems to be triggered because rasn_smi::v2::ObjectSyntax has nested untagged CHOICEs. I am not sure how I should fix this, since there is no single source of truth for encoding and decoding this case in the OER codec. If I understand correctly, the first identified tag from the nested choices is repeated, but this is ambiguous for complex types.

While this should not happen in most cases with later standards, maybe something needs to be done. Should I just throw an error, or try to explore the possible decode variations and choose the first suitable one? That would work for simple types, but as far as I know there are no guarantees for complex ones.

XAMPPRocky commented 2 weeks ago

Are you sure there's nothing? That is a case that is called out in other standards. You might want to check both the OER and X.680 standards.

Nicceboy commented 2 weeks ago

This was all I could find related to this:

From OER standard:

20.1 The encoding of a value of a choice type shall consist of the encoding of the outermost tag of the type of the chosen alternative as specified in 8.7, followed by the encoding of the value of the chosen alternative.

NOTE 1 – Since the outermost tags of the alternatives of a choice type are required to be all different (see Rec. ITU-T X.680 | ISO/IEC 8824-1, 29.3), the outermost tag is sufficient to identify the chosen alternative.

NOTE 2 – If the type of the chosen alternative has more than one tag as a result of explicit tagging, the tags following the outermost tag are not encoded.

NOTE 3 – If the type of the choice alternative is an untagged choice type, the outermost tag for that alternative will appear more than once in the encoding. This is different from how BER works.
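To make NOTE 3 concrete, here is a minimal sketch of how I read the tag encoding in 8.7 and the repeated tag for a nested untagged choice. The names `Class`, `encode_tag` and `encode_nested_string_value` are hypothetical and not part of rasn. The nested type is only modeled loosely on an outer choice whose alternative is an untagged choice containing an OCTET STRING, and the sketch assumes short-form lengths only.

```rust
/// ASN.1 tag classes with their two-bit class codes.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Class {
    Universal = 0,
    Application = 1,
    Context = 2,
    Private = 3,
}

/// Encode a tag: class in bits 8-7, tag number in bits 6-1 when it is
/// below 63; otherwise bits 6-1 are all ones and the number follows in
/// base-128 octets, with bit 8 set on every octet except the last.
fn encode_tag(class: Class, number: u32) -> Vec<u8> {
    let class_bits = (class as u8) << 6;
    if number < 63 {
        return vec![class_bits | number as u8];
    }
    // Collect 7-bit groups, least significant first.
    let mut groups = Vec::new();
    let mut n = number;
    loop {
        groups.push((n & 0x7F) as u8);
        n >>= 7;
        if n == 0 {
            break;
        }
    }
    let mut out = vec![class_bits | 0x3F];
    let last = groups.len() - 1;
    for (i, group) in groups.iter().rev().enumerate() {
        out.push(if i == last { *group } else { *group | 0x80 });
    }
    out
}

/// Outer CHOICE -> untagged inner CHOICE -> OCTET STRING (UNIVERSAL 4).
/// The outer choice writes the tag, then the inner choice writes it again.
fn encode_nested_string_value(bytes: &[u8]) -> Vec<u8> {
    // Assumption: only the single-octet (short form) length is handled.
    assert!(bytes.len() < 128, "sketch only handles short-form lengths");
    let tag = encode_tag(Class::Universal, 4);
    let mut out = tag.clone(); // tag written by the outer choice
    out.extend_from_slice(&tag); // same tag written by the inner choice
    out.push(bytes.len() as u8); // length determinant
    out.extend_from_slice(bytes); // contents
    out
}
```

Under these assumptions a two-byte string value would start with the tag octet 0x04 twice, followed by the length and the contents.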

From X.680:

31.2.7 The tagging construction specifies explicit tagging if any of the following holds:
a) the "Tag EXPLICIT Type" alternative is used;
b) the "Tag Type" alternative is used and the value of "TagDefault" for the module is either EXPLICIT TAGS or is empty;
c) the "Tag Type" alternative is used and the value of "TagDefault" for the module is IMPLICIT TAGS or AUTOMATIC TAGS, but the type defined by "Type" is an untagged choice type, an untagged open type, or an untagged "DummyReference" (see Rec. ITU-T X.683 | ISO/IEC 8824-4).

Nicceboy commented 1 week ago

Here is an example of an ambiguous case:

OuterChoice ::= CHOICE {
    a INTEGER,
    b InnerChoice
}

InnerChoice ::= CHOICE {
    c INTEGER,
    d BOOLEAN
}

In both cases the outermost encoded tag would be the same, whether a was chosen from the outer choice or c was chosen from the inner one.
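A rough sketch of the byte-level collision for the two types above, assuming my reading of OER: a one-octet tag for small universal tags, an unconstrained INTEGER as a length octet plus two's-complement contents, and a BOOLEAN as a single octet. The Rust type and function names are hypothetical, and `i16` is used only to keep the integer encoding short.

```rust
#[derive(Debug, PartialEq)]
enum InnerChoice {
    C(i16),  // c INTEGER
    D(bool), // d BOOLEAN
}

#[derive(Debug, PartialEq)]
enum OuterChoice {
    A(i16),         // a INTEGER
    B(InnerChoice), // b InnerChoice (untagged)
}

const TAG_BOOLEAN: u8 = 0x01; // UNIVERSAL 1
const TAG_INTEGER: u8 = 0x02; // UNIVERSAL 2

/// Unconstrained INTEGER: length octet, then minimal two's-complement octets.
fn encode_integer(v: i16) -> Vec<u8> {
    if (-128..=127).contains(&v) {
        vec![0x01, v as u8]
    } else {
        let [hi, lo] = v.to_be_bytes();
        vec![0x02, hi, lo]
    }
}

fn encode_inner(v: &InnerChoice) -> Vec<u8> {
    match v {
        InnerChoice::C(i) => {
            let mut out = vec![TAG_INTEGER];
            out.extend(encode_integer(*i));
            out
        }
        InnerChoice::D(b) => vec![TAG_BOOLEAN, if *b { 0xFF } else { 0x00 }],
    }
}

fn encode_outer(v: &OuterChoice) -> Vec<u8> {
    match v {
        OuterChoice::A(i) => {
            let mut out = vec![TAG_INTEGER];
            out.extend(encode_integer(*i));
            out
        }
        OuterChoice::B(inner) => {
            // The outer choice repeats the tag of the chosen inner alternative.
            let body = encode_inner(inner);
            let mut out = vec![body[0]];
            out.extend(body);
            out
        }
    }
}
```

With these assumptions, `OuterChoice::A(261)` and `OuterChoice::B(InnerChoice::C(5))` would both come out as 02 02 01 05, so not even peeking at the following bytes separates them. Note that a and c share the INTEGER tag here, which is the situation NOTE 1 of 20.1 says X.680 29.3 rules out.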

The decoder would need to identify the correct type/value selection by peeking at the next bytes or otherwise trying to understand the context, but this feels too complex to add and does not guarantee correctness.

BER handles this with a nested TLV pattern, and PER handles it with indexes. I guess OER does not handle this case because it aims for simplicity and efficiency in general.

Rather than trying to implement something for now, maybe it would be better to recommend that ASN.1 type specifications be designed to be less ambiguous under OER (e.g. by using tags).

We could (potentially) implement the encoder, but not the decoder. Or we could implement the decoder in a simplified form: if the type contains nested choices, it builds a map and then keeps matching the repeated tag bytes against suitable inner types until they stop repeating. I am not sure how difficult that is to add. The encoded value itself could contain identical bytes, so it might not be completely correct.
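As a rough illustration of the simplified map-based decoder, here is a sketch that dispatches on the outer tag and then expects the same tag again at the start of the nested choice. It assumes all alternative tags are distinct (as NOTE 1 of 20.1 requires), one-octet tags, one-octet integers and short-form lengths. The types `Outer` and `Inner` and both functions are hypothetical and unrelated to rasn's internals.

```rust
#[derive(Debug, PartialEq)]
enum Inner {
    Int(i8),    // INTEGER, UNIVERSAL 2
    Flag(bool), // BOOLEAN, UNIVERSAL 1
}

#[derive(Debug, PartialEq)]
enum Outer {
    Text(Vec<u8>), // OCTET STRING, UNIVERSAL 4
    Nested(Inner), // untagged inner choice
}

/// Decode the inner choice: tag octet, then the value.
fn decode_inner(input: &[u8]) -> Option<(Inner, &[u8])> {
    match input {
        // Assumption: only one-octet integers (length octet 0x01).
        [0x02, 0x01, v, rest @ ..] => Some((Inner::Int(*v as i8), rest)),
        [0x01, b, rest @ ..] => Some((Inner::Flag(*b != 0), rest)),
        _ => None,
    }
}

/// Decode the outer choice by mapping the first tag to an alternative.
fn decode_outer(input: &[u8]) -> Option<(Outer, &[u8])> {
    let (&tag, rest) = input.split_first()?;
    match tag {
        0x04 => {
            let (&len, rest) = rest.split_first()?;
            // Assumption: long-form length determinants are not handled.
            if len >= 0x80 || rest.len() < len as usize {
                return None;
            }
            let (body, rest) = rest.split_at(len as usize);
            Some((Outer::Text(body.to_vec()), rest))
        }
        // Tags owned by the inner choice select the nested alternative.
        0x01 | 0x02 => {
            // The inner encoding must start with the same tag again.
            if rest.first() != Some(&tag) {
                return None;
            }
            let (inner, rest) = decode_inner(rest)?;
            Some((Outer::Nested(inner), rest))
        }
        _ => None,
    }
}
```

This only works because no tag belongs to two alternatives at the same level. It does not resolve a case where the outer choice has its own INTEGER alternative next to a nested one.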

I could implement a complete decoder if someone shows me the section(s) of the standard(s) that prove this is not ambiguous.