vlm / asn1c

The ASN.1 Compiler
http://lionet.info/asn1c/
BSD 2-Clause "Simplified" License
1.03k stars 551 forks source link

Incorrect UPER encoding for empty extensible SEQUENCE in 38.331 #331

Open fdciotti opened 5 years ago

fdciotti commented 5 years ago

The searchSpaceType IE in 38.331f40 contains three OPTIONAL empty extensible SEQUENCEs (see below). When encoding asn_DEF_BCCH_DL_SCH_Message with one of these SEQUENCEs present (e.g.; dci-Format2-1), the correct preamble bit is set to indicate that the SEQUENCE is present, however a zero bit is not inserted to indicate that there is no extension.

I searched the existing issues and did not see this one posted. Have others run into this? I can work around it for now by changing the ASN.1 text, but obviously that is not a workable solution for future compatibility. I will look into resolving the issue if it is not already being worked on.

searchSpaceType                         CHOICE {
    common                                  SEQUENCE {
        dci-Format0-0-AndFormat1-0              SEQUENCE {
            ...
        }                                                                                                   OPTIONAL,   -- Need R
        dci-Format2-0                           SEQUENCE {
            nrofCandidates-SFI                      SEQUENCE {
                aggregationLevel1                       ENUMERATED {n1, n2}                                 OPTIONAL,   -- Need R
                aggregationLevel2                       ENUMERATED {n1, n2}                                 OPTIONAL,   -- Need R
                aggregationLevel4                       ENUMERATED {n1, n2}                                 OPTIONAL,   -- Need R
                aggregationLevel8                       ENUMERATED {n1, n2}                                 OPTIONAL,   -- Need R
                aggregationLevel16                      ENUMERATED {n1, n2}                                 OPTIONAL    -- Need R
            },
            ...
        }                                                                                                   OPTIONAL,   -- Need R
        dci-Format2-1                           SEQUENCE {
            ...
        }                                                                                                   OPTIONAL,   -- Need R
        dci-Format2-2                           SEQUENCE {
            ...
        }                                                                                                   OPTIONAL,   -- Need R
        dci-Format2-3                           SEQUENCE {
            dummy1                  ENUMERATED {sl1, sl2, sl4, sl5, sl8, sl10, sl16, sl20}                  OPTIONAL,   -- Cond Setup
            dummy2                  ENUMERATED {n1, n2},
            ...
        }                                                                                                   OPTIONAL    -- Need R
    },
    ue-Specific                             SEQUENCE {
        dci-Formats                             ENUMERATED {formats0-0-And-1-0, formats0-1-And-1-1},
        ...
    }
}                                                                                                           OPTIONAL    -- Cond Setup
velichkov commented 5 years ago

Hi @fdciotti,

Could you try with vlm_master branch from mouse07410's fork as it contains various fixes?

And could you provide the extracted ASN.1 definitions together with test UPER binaries or a Wireshark pcap file?

fdciotti commented 5 years ago

@velichkov , thanks for the reply. I will try the vlm_master branch and let you know if that resolves the issue.

I created a simple ASN.1 file to isolate the issue (see below). In this example, with sequence1 absent, and sequence2 and 3present and the 'dummy' byte set to 0xFF, the UPER encoded output is correct @20 bits with a value of 0x6F FF F0 (4 bits of padding). This shows that the 0 extension bit is added when the SEQUENCE has at least one type in addition to the extension field (sequence2).

However, when sequence1 is added, the size of the encoded output should increase to 21 bits to account for the extension with a value of 0xE7 FF F8 (3 bits of padding), but it remains at 20 bits with a value of 0xEF FF F0 (4 bits of padding). So the preamble bit is set correctly for sequence1 being present, but the remaining bits are not changed (0 extension bit not inserted for sequence1).

Regarding your request for the extracted ASN.1 definition, do you want all the .c & .h files generated by the asn1c compiler?

EMPTY-SEQUENCE-Definitions DEFINITIONS AUTOMATIC TAGS ::=

BEGIN

EmptySequence ::= SEQUENCE { sequence1 SEQUENCE { ... -- extension bit not present } OPTIONAL, sequence2 SEQUENCE { dummy OCTET STRING (SIZE (1)), -- results in extension bit being present ... } OPTIONAL, sequence3 SEQUENCE { dummy OCTET STRING (SIZE (1)) } OPTIONAL }

END

fdciotti commented 5 years ago

Hi @velichkov , I tried my above EMPTY-SEQUENCE ASN.1 with the vlm_master branch, and unfortunately, I get the same result - the extension bit is not being inserted for 'sequence1'. My work-around for now is to replace the '...' in empty SEQUENCEs of 38.331 with a 'BIT STRING (SIZE(1))' and set the bit to 0. This results in the correct encoding.

When I get some time, I will step into uper_encode_to_buffer() to see if I can determine where the failure is occurring, and suggest a fix, unless there is some other suggestion that I should try.

brchiu commented 5 years ago

@fdciotti, after quick glance at the generated code from your minimum reproducible example, the first_extension field of asn_SPC_sequence1_specs_2 set to -1 might be the cause that it skip some afterward processing of extension in SEQUENCE_encode_uper().

static asn_SEQUENCE_specifics_t asn_SPC_sequence1_specs_2 = {
    sizeof(struct sequence1),
    offsetof(struct sequence1, _asn_ctx),
    0,  /* No top level tags */
    0,  /* No tags in the map */
    0, 0, 0,    /* Optional elements (not needed) */
    -1, /* First extension addition */
};
static /* Use -fall-defs-global to expose */
asn_TYPE_descriptor_t asn_DEF_sequence1_2 = {
    "sequence1",
    "sequence1",
    &asn_OP_SEQUENCE,
    asn_DEF_sequence1_tags_2,
    sizeof(asn_DEF_sequence1_tags_2)
        /sizeof(asn_DEF_sequence1_tags_2[0]) - 1, /* 1 */
    asn_DEF_sequence1_tags_2,   /* Same as above */
    sizeof(asn_DEF_sequence1_tags_2)
        /sizeof(asn_DEF_sequence1_tags_2[0]), /* 2 */
    { 0, 0, SEQUENCE_constraint },
    0, 0,   /* No members */
    &asn_SPC_sequence1_specs_2  /* Additional specs */
};
asn_enc_rval_t
SEQUENCE_encode_uper(...)
{
....
    if(specs->first_extension < 0) {
        n_extensions = 0; /* There are no extensions to encode */
    } else {
        n_extensions = SEQUENCE__handle_extensions(td, sptr, 0, 0);
        if(n_extensions < 0) ASN__ENCODE_FAILED;
        if(per_put_few_bits(po, n_extensions ? 1 : 0, 1)) {
            ASN__ENCODE_FAILED;
        }
    }
....

Perhaps you can check this part. FYI.

brchiu commented 5 years ago

@fdciotti , function asn1c_lang_C_type_SEQUENCE_def() of asn1c_C.c determines this -1 value because expr_elements_count() return 0. However, I don't have enough knowledge of ASN.1 to deal with it correctly without break other things, so I only provide this code search result and leave it to you or others who can help.

fdciotti commented 5 years ago

@brchiu , Thanks for the guidance. I tried a quick hack of changing the 'First extension addition' value in asn_SPC_sequence1_specs_2 from -1 to 1 (matching the value in asn_SPC_sequence2_specs_4), but this resulted in a seg fault upon encoding. Clearly I need to spend some time looking into how asn_SPC_sequence1_specs_2 is being generated and processed.

brchiu commented 5 years ago

@fdciotti, in addition to the minimal asn, do you have encoded test data , even in hexadecimal string is OK. I guess the index might need be start from 0, instead of from 1.

The rationale is : There's a normal field in your second sequence2 so it counts the position of extension field to 1. But for first sequence1 , there is no such thing existing so it should count it's position to 0.

What I am not acquainted with is when to encode this extension and when not encode this extension.

velichkov commented 5 years ago

@brchiu,

do you have encoded test data , even in hexadecimal string is OK.

@fdciotti already provided binaries in https://github.com/vlm/asn1c/issues/331#issuecomment-485203364 Here are some commands for easy testing

$ echo "6F FF F0" | xxd -r -p | ./converter-example -iper -
<EmptySequence>
    <sequence2>
        <dummy>FF</dummy>
    </sequence2>
    <sequence3>
        <dummy>FF</dummy>
    </sequence3>
</EmptySequence>

$ echo "E7 FF F8" | xxd -r -p | ./converter-example -iper -
<EmptySequence>
    <sequence1>
    </sequence1>
    <sequence2>
        <dummy>7F</dummy>
    </sequence2>
    <sequence3>
        <dummy>FF</dummy>
    </sequence3>
</EmptySequence>

$ echo "EF FF F0" | xxd -r -p | ./converter-example -iper -
<EmptySequence>
    <sequence1>
    </sequence1>
    <sequence2>
        <dummy>FF</dummy>
    </sequence2>
    <sequence3>
        <dummy>FF</dummy>
    </sequence3>
</EmptySequence>

$ cat message.xer 
<EmptySequence>
    <sequence1>
    </sequence1>
    <sequence2>
        <dummy>FF</dummy>
    </sequence2>
    <sequence3>
        <dummy>FF</dummy>
    </sequence3>
</EmptySequence>

$ ./converter-example -ixer message.xer -oper | xxd -p
effff0                 
brchiu commented 5 years ago

@velichkov & @fdciotti, thank you and sorry for my overlook.

velichkov commented 5 years ago

@brchiu, thanks for checking this.

What I am not acquainted with is when to encode this extension and when not encode this extension.

From Rec. ITU-T X.691 (08/2015)

19.1 If the sequence type has an extension marker in the "ComponentTypeLists" or in the "SequenceType" productions, then a single bit shall first be added to the field-list in a bit-field of length one. The bit shall be one if values of extension additions are present in this encoding, and zero otherwise. (This bit is called the "extension bit" in the following text.) If there is no extension marker in the "ComponentTypeLists" or in the "SequenceType" productions, there shall be no extension bit added.

19.6 This completes the encoding if the extension bit is absent or is zero. If the extension bit is present and set to one, then the following procedures apply. 19.7 Let the number of extension additions in the type being encoded be "n", then a bit-field with "n" bits shall be produced for addition to the field-list. The bits of the bit-field shall, taken in order, encode the presence or absence of an encoding of each extension addition in the type being encoded. A bit value of 1 shall encode the presence of the encoding of the extension addition, and a bit value of 0 shall encode the absence of the encoding of the extension addition. The leading bit in the bit-field shall encode the presence or absence of the first extension addition, and the trailing bit shall encode the presence or absence of the last extension addition.

I'm also trying to find a solution but without any success so far. There are problems with extensible sequences without any elements in BER/DER and XER encodings as well, here are some examples.

$ echo "<EmptySequence><sequence2><dummy>ff</dummy><dummy1>ff</dummy1></sequence2></EmptySequence>" | ./converter-example -ixer -
<EmptySequence>
    <sequence2>
        <dummy>FF</dummy>
    </sequence2>
</EmptySequence>

That's correct as sequence2 is extensible and dummy1 is not present in the ASN.1 so it gets skipped

$ echo "3008a1068001ff8101ff" | xxd -r -p | ./converter-example -iber -
<EmptySequence>
    <sequence2>
        <dummy>FF</dummy>
    </sequence2>
</EmptySequence>

This binary contains one unknown extension 8101ff that gets skipped.

$ echo "<EmptySequence><sequence1><dummy>ff</dummy></sequence1></EmptySequence>" | ./converter-example -ixer -
standard input: Decode failed past byte 26: Input processing error

$ echo "3005a0038001ff" | xxd -r -p | ./converter-example -iber -
standard input: Decode failed past byte 2: Input processing error

That's not correct as sequence1 is extensible.

$ echo "<EmptySequence><sequence3><dummy>ff</dummy><dummy1>ff</dummy1></sequence3></EmptySequence>" | ./converter-example -ixer -
standard input: Decode failed past byte 43: Input processing error

$ echo "3008a2068001ff8101ff" | xxd -r -p | ./converter-example -iber -
standard input: Decode failed past byte 2: Input processing error

That's correct as sequence3 is not extensible.

fdciotti commented 5 years ago

Hello @velichkov and @brchiu , thanks for continuing to help me investigate this issue.

@velichkov , I see from your example of 'echo "E7 FF F8" | xxd -r -p | ./converter-example -iper -' that the decode of 'dummy' for 'sequence2' is 0x7F instead of 0xFF, due to the 0 extension bit erroneously being included as the MSB of the value.

Do you think that the issue is the population of incorrect values in asn_SPC_sequence1_specs_2? Or is it in the processing of asn_SPC_sequence1_specs_2 during encode/decode? Or is it both? I'm just trying to determine where to focus my efforts when searching for the bug. If asn_SPC_sequence1_specs_2 is simply being populated incorrectly, that can easily be checked by manually correcting the values and seeing if the encode function generates the correct hex string.

brchiu commented 5 years ago

@velichkov & @fdciotti , I have a quick tweak https://github.com/brchiu/asn1c/commit/b7a3f5878568e4f8949ff4458faa6716d478ff68 attempting to fix it. You can cherry-pick it to your own branch.

velichkov commented 5 years ago

@brchiu,

I confirm that https://github.com/brchiu/asn1c/commit/b7a3f5878568e4f8949ff4458faa6716d478ff68 fixes all of the above problems. Many thanks.

@fdciotti

I see from your example of 'echo "E7 FF F8" | xxd -r -p | ./converter-example -iper -' that the decode of 'dummy' for 'sequence2' is 0x7F instead of 0xFF, due to the 0 extension bit erroneously being included as the MSB of the value.

Yes, I overlooked this, you are right that it was not decoded correctly.

velichkov commented 5 years ago

@brchiu

You probably have noticed it but one test is failing. Tthe expected output have to be regenerated - cd tests/tests-asn1c-compiler && ./check-parsing.sh regenerate

diff --git a/tests/tests-asn1c-compiler/32-sequence-of-OK.asn1.-P b/tests/tests-asn1c-compiler/32-sequence-of-OK.asn1.-P
index 4a7298b3..9d4d9502 100644
--- a/tests/tests-asn1c-compiler/32-sequence-of-OK.asn1.-P
+++ b/tests/tests-asn1c-compiler/32-sequence-of-OK.asn1.-P
@@ -161,7 +161,7 @@ asn_SEQUENCE_specifics_t asn_SPC_Error_specs_1 = {
     0,  /* No top level tags */
     0,  /* No tags in the map */
     0, 0, 0,    /* Optional elements (not needed) */
-    -1, /* First extension addition */
+    0,  /* First extension addition */
 };
 asn_TYPE_descriptor_t asn_DEF_Error = {
     "Error",
fdciotti commented 5 years ago

@brchiu , Many thanks from me as well! I confirmed that the patch resolves the issue in my simple ASN.1, as well as 38.331.

I see that you made the patch to the ansn1c-vlm_master branch, however that I was unable to use that branch with 38.331 as the resulting output files cause many redefinition errors when building my project (asn1c builds fine). Note that even with the asn1-master branch, there is one error during the 38.331 build due to how SetupRelease is defined, but that is easily resolved by including SetupRelease.h in that .c module.

What is the next step to get this change committed? Do we need issue a Pull Request?

brchiu commented 5 years ago

@velichkov , thank you for reminding, I was too sleepy then. I did not wait for CI result and fell into sleep. ^_^

@fdciotti, from my point of view, the only vlm master branch is in @vlm's repository. Community members, including me, only maintain his/her own local ones.

As for problem about you mentioned for 38.331, you might have a trial using my branch which focus on handling 3GPP related specification (not all, e.g. it can not handle some syntax used in NBAP) : https://github.com/brchiu/asn1c/commits/velichkov_s1ap_plus_option_group