P1sec / pycrate

A Python library to ease the development of encoders and decoders for various protocols and file formats; contains ASN.1 and CSN.1 compilers.
GNU Lesser General Public License v2.1

ASN.1 default value handling: to_jer() differences and warnings #118

Closed mbrehler closed 2 years ago

mbrehler commented 3 years ago

ASN.1 default values are included in the JER output if to_jer() is called before to_uper(), but not if it is called afterwards; see the example code below. I haven't double-checked what the spec says about this (I suspect defaults should not be included in JER). Either way, consistent behavior for to_jer() is probably what most people would expect. We actually like seeing the default values, so some control over it would be ideal.

One could also argue that the warnings / logs about "removing value equal to the default one" that the code generates are a bit surprising since in the example these values were not set/written.

Thanks, Matthias

Example: the NR RRC ASN.1 needs to be compiled first for the example to work, using pycrate_asn1compile.py -i pycrate\pycrate_asn1dir\3GPP_NR_RRC_38331\NR-RRC-Definitions.asn -o pycrate\pycrate_asn1dir\NR_RRC_Definitions

from binascii import hexlify, unhexlify
from pycrate_asn1dir import NR_RRC_Definitions

hexstr = b'0c810b5e40b2c831cc5fc20a0c2cd007456a0f07002179187409ce631b47e9cc0140f86d000006204089668077680503e1b40000188706044b338c41b8836dc2e0800e1ca0c001000100485c280000a10001e5d0f2406d4004bfc000000000602581b8241000000893140025c012c012cbc04892b26b2e009600965e024495935601d5204311d60000361130044b21f95348a1177c04892b264814766b4fb6c3000f1570060870000000c620dc41b92814000027dca0000000000000000002360044b214a05000009f72800000000000000000190040602e0813000132c0108840017614a05000009f7280000000000000000008d01174d852814000027dca00000000000000000320080c0501002bc000fa901ed278a0419dfecffaebdc9841273560acdf64667108c8031f108c8040b0a00100b1100500f8e0e090112056110010000010016a5c21b3a3a96ce92860103600a800'
msg = NR_RRC_Definitions.NR_RRC_Definitions.RRCReconfiguration
# Decode the UPER message, then serialise it to JER before and after re-encoding.
msg.from_uper(unhexlify(hexstr))
jer_before_encoding = msg.to_jer()
reencoded = msg.to_uper()
jer_after_encoding = msg.to_jer()
# Both JER strings are expected to be identical, but they differ.
if jer_before_encoding != jer_after_encoding:
    print('JER mismatches!\n')
    print('JER before encoding:\n')
    print(jer_before_encoding)
    print('JER after encoding:\n')
    print(jer_after_encoding)

p1-bmu commented 3 years ago

This is related to canonicity of the encoding. In PER, canonical encoding requires default values not to be encoded, which is the default behaviour of the PER codec in pycrate, see: https://github.com/P1sec/pycrate/blob/6b8691bfec1ce16851176260a34920e9952e23d5/pycrate_asn1rt/codecs.py#L51

JER, like JSON more generally, is a human-readable format, so there is no such thing as a canonical encoding for it.
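To make the effect concrete, here is a minimal toy model of the behaviour described above. It is not pycrate's code: `ToySequence`, its `defaults` / `value` attributes and the `canonical` argument are all hypothetical names, and the assumption is simply that a canonical encoder strips components equal to their DEFAULT from the stored value in place.

```python
import json


class ToySequence:
    """Toy stand-in for an ASN.1 SEQUENCE value (hypothetical, not pycrate)."""

    def __init__(self, defaults, value):
        self.defaults = dict(defaults)  # component name -> DEFAULT value
        self.value = dict(value)        # current (decoded) value

    def to_jer(self):
        # Serialise whatever is currently stored, defaults included.
        return json.dumps(self.value, sort_keys=True)

    def to_uper(self, canonical=True):
        # Assumed behaviour: canonical encoding drops components equal to
        # their DEFAULT, and does so in place (this is the side effect).
        if canonical:
            for name, default in self.defaults.items():
                if name in self.value and self.value[name] == default:
                    del self.value[name]
        # Placeholder bytes; real PER encoding is out of scope for this sketch.
        return json.dumps(self.value, sort_keys=True).encode()


msg = ToySequence(defaults={"timer": 10}, value={"id": 1, "timer": 10})
jer_before = msg.to_jer()
msg.to_uper(canonical=True)
jer_after = msg.to_jer()
print(jer_before)  # {"id": 1, "timer": 10}
print(jer_after)   # {"id": 1}
```

In this model, calling `to_uper(canonical=False)` leaves the stored value untouched, so the JER output is the same before and after encoding.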

p1-bmu commented 3 years ago

In your case, setting ASN1CodecPER.CANONICAL = False before reencoding with UPER should answer your question.

mbrehler commented 3 years ago

Thanks. I was more interested in controlling how to_jer() behaves (so that the output is either always with default values or w/o, independent of other calls). Is that possible? Certainly easy enough to work around this by always ensuring a certain call order of from_uper/to_uper/to_jer.

p1-bmu commented 3 years ago

I am trying to better understand your need. To me, what you are looking for is not exactly related to the transfer syntax (JER or PER), but rather to the user-side API of the ASN.1 runtime. One could add 2 switches in the SEQUENCE / SET parent class:

Would this work for you?

mbrehler commented 3 years ago

Thanks & I agree that I look at this more as an API issue: the output of to_jer() changes even though in my mind nothing should have changed in the message. Apparently the call to to_uper() has some side effect that makes to_jer() behave differently, which is unexpected (and caused us some grief since some regression tests expected consistent output). I do not follow your set_val/get_val proposal: my example doesn't call these.

p1-bmu commented 3 years ago

The call to to_uper() changes the content of the message because your PER encoder is configured to do canonical encoding. If you just want to change this behavior, so that your PER encoder also encodes DEFAULT values when they are present, try to set ASN1CodecPER.CANONICAL = False in your code first, and let me know if this solves your issue.
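If the flag should only be changed around a single encoding call rather than for the whole process, one possible approach is a small context manager that sets a class attribute and restores it afterwards. The sketch below is generic Python: `override_attr` is a hypothetical helper and `FakeCodecPER` is a stand-in so the code runs without pycrate. With pycrate, one would presumably pass `ASN1CodecPER` from `pycrate_asn1rt.codecs` (the module linked above) and call `msg.to_uper()` inside the `with` block; that part is an assumption and is not exercised here.

```python
from contextlib import contextmanager


@contextmanager
def override_attr(obj, name, value):
    """Temporarily set obj.<name> to value, restoring the old value on exit."""
    old = getattr(obj, name)
    setattr(obj, name, value)
    try:
        yield obj
    finally:
        # Restore the previous setting even if encoding raises.
        setattr(obj, name, old)


class FakeCodecPER:
    # Hypothetical stand-in for pycrate's ASN1CodecPER.
    CANONICAL = True


with override_attr(FakeCodecPER, "CANONICAL", False):
    inside = FakeCodecPER.CANONICAL  # non-canonical while encoding
after = FakeCodecPER.CANONICAL       # original setting restored

print(inside, after)
```

This keeps the global default untouched for other code paths, at the cost of not being thread-safe, since the attribute is shared by every user of the class.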