P1sec / pycrate

A Python library to ease the development of encoders and decoders for various protocols and file formats; contains ASN.1 and CSN.1 compilers.
GNU Lesser General Public License v2.1
384 stars 132 forks source link

Parametrization bug when using WITH COMPONENTS #100

Closed JNevrly closed 4 years ago

JNevrly commented 4 years ago

Following (simplified example) ASN.1 construct compiles OK, but raises error during import of the compiled module:

ParametrizationTest DEFINITIONS AUTOMATIC TAGS ::= BEGIN

BaseEntity ::= SEQUENCE {
  itemA INTEGER,
  itemB INTEGER,
  itemC OCTET STRING (SIZE(32)),
  ...
}

ParametrizedEntity{Parameter} ::= BaseEntity (WITH COMPONENTS {...,
  itemC (CONSTRAINED BY {-- whatever constraint -- Parameter})
  })

ConstrainingEntity ::= SEQUENCE {
    dataA INTEGER,
    ...
}

TestEntity ::= SEQUENCE {
  itemA ParametrizedEntity{ConstrainingEntity},
  itemB BaseEntity
}

END

The error on import:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-1-31271c137406> in <module>
----> 1 import parametrization_test

~/SW/pycrate-coer/scratchbook/parametrization_test.py in <module>
     87     ]
     88 
---> 89 init_modules(ParametrizationTest)

~/SW/pycrate-coer/pycrate_asn1rt/init.py in init_modules(*args, **kwargs)
    246             # set a list of tags for the content
    247             # try:
--> 248             Obj._cont_tags = [Cont._tagc[0] if Cont._tagc else None for Cont in Obj._cont.values()]
    249             # except AttributeError:
    250             #     Obj._cont_tags = []

AttributeError: 'NoneType' object has no attribute 'values'

The error does not happen when ParametrizedEntity is not derived from other entity using WITH COMPONENTS constraints, but using CONSTRAINED BY on direct type definition. Since this seems to go quite deep into the PyCrate's ASN.1 representation, I can't really figure out what's going on internally.

Btw. I've found out this problem while trying to compile ETSI ITS Security headers ASN.1 ... (there was plenty of other problems, but all except this one can be solved by reformatting the source files).

p1-bmu commented 4 years ago

It seems you are reaching the limit of the asn1 part of pycrate here ! 1) CONSTRAINED BY constraints are ignored by the compiler (see https://github.com/P1sec/pycrate/blob/4b917717b1b0d520c90c4e0d50e3cfc34b88ce09/pycrate_asn1c/asnobj.py#L4530). 2) WITH COMPONENTS constraints are compiled, but ignored by the code generator (see https://github.com/P1sec/pycrate/blob/7c143f814e099fc7eb90e7f533bb374c3965e8f5/pycrate_asn1c/generator.py#L677 for SEQUENCE / SET, and this is also the case for REAL, CHOICE and few others).

On my side, I never found any ASN.1 specs with meaningful CONSTRAINED BY pattern. I have also never found ASN.1 specs with WITH COMPONENTS patterns that have impact on the encoding: all of which I know use BER, therefore tagging avoid having to care for those constraints. In the current state, the compiler and code generator should deliver a final object similar to:

TestEntity ::= SEQUENCE {
  itemA BaseEntity,
  itemB BaseEntity
}

However, it seems a bug is effectively lying in the code generator or the runtime initializer.

JNevrly commented 4 years ago

Thanks for the explanation! Regarding the ASN in question, it's https://forge.etsi.org/rep/ITS/asn1/sec_ts103097/blob/master/EtsiTs103097Module.asn I'm not on the level to decide whether the CONSTRAINED BY usage there is meaningful or not, I'm a humble user ;).

Anyways, this particular ASN is all for COER (in my naivety, I also took on the task to write a OER/COER codec - and you were not wrong about the amount of work it requires... - will come back with PR when the deed is done)

p1-bmu commented 4 years ago

According to ITU-T X.682, CONSTRAINED BY are for user-defined constraints, which are not "fully machine processable" and can be seen as a special form of comment. WITH COMPONENTS is to dispatch additional constraints to inner components of a SEQUENCE or SET, and eventually to enforce the presence or absence of inner components.

I'll check what I can do to support the python code generation for this kind of tortured specification...

p1-bmu commented 4 years ago

Commit https://github.com/P1sec/pycrate/commit/16dc1ebe6632ccc762242193333334678326f918 should solve the compilation and module loading issue. I regenerated all ASN.1 modules. This should help you continue working on your development. In case you setup a recent and working ITS ASN.1 directory, do not hesitate to do a PR, as we have already this issue opened: https://github.com/P1sec/pycrate/issues/94. Thanks

JNevrly commented 4 years ago

I can confirm that 16dc1eb fixes the issue for the ASN.1 specs on the ITS security headers - can be compiled and used. Thanks a lot for a quick fix!

I'm currently having only the ITS security headers ASN.1. Since the security headers are meant exclusively for COER (and that is still in the work), I will first try to do the OER/COER PR and then the ITS security headers PR.

For ITS message ASN.1 (that uses UPER for encoding), I don't have it in PRable form now, but I have managed to used Pycrate to compile and use it in the past (currently obsolete versions) without serious problems.

p1-bmu commented 4 years ago

OK, let's close this particular issue.

JNevrly commented 4 years ago

I actually found out another issue with parameterization, this time with CONTAINING. The effect is exactly the same as the original issue. Replication ASN.1:

ParametrizationTest DEFINITIONS AUTOMATIC TAGS ::= BEGIN

BaseEntity ::= SEQUENCE {
  itemA INTEGER,
  itemB INTEGER,
  itemC OCTET STRING (SIZE(32)),
  ...
}

ParametrizedEntity{Parameter} ::= BaseEntity (WITH COMPONENTS {...,
  itemC (CONTAINING Parameter)
  })

ConstrainingEntity ::= SEQUENCE {
    dataA INTEGER,
    ...
}

TestEntity ::= SEQUENCE {
  itemA ParametrizedEntity{ConstrainingEntity},
  itemB BaseEntity
}

END

Unlike CONSTRAINED BY, it seems to me that CONTAINING constraint is actually processed, but I can't really follow the logic deeper.

And as usual, I did not make this ASN.1 construct up, it appears in the ITS security part that I forgot to compile when testing the original issue fix: https://forge.etsi.org/rep/ITS/asn1/pki_ts102941/blob/master/EtsiTs102941MessagesItss.asn#L52 parametrizing https://forge.etsi.org/rep/ITS/asn1/sec_ts103097/blob/master/EtsiTs103097Module.asn#L65

Sorry that I didn't found that out while verifying the original fix... just omitted by accident the most top-level file :(

p1-bmu commented 4 years ago

The few commits pushed today introduced a proper support for WITH COMPONENTS constraints. I think (/ hope) this solves the issue: let me know on your side.

JNevrly commented 4 years ago

Thanks for a quick fix, but unfortunately I can't verify if it fixes this issue, because it introduces new issue, this time during compilation. The error report:

WNG: IEEE1609dot2.SignedDataPayload: multiple root parts in WITH COMPONENTS constraint, only stacking presence
WNG: IEEE1609dot2.toBeSigned: multiple WITH COMPONENTS constraints, compiling only the first
WNG: IEEE1609dot2.verifyKeyIndicator: multiple WITH COMPONENTS constraints, compiling only the first
WNG: IEEE1609dot2.ToBeSignedCertificate: multiple root parts in WITH COMPONENTS constraint, only stacking presence
WNG: IEEE1609dot2.verifyKeyIndicator: multiple WITH COMPONENTS constraints, compiling only the first
WNG: EtsiTs103097Module.toBeSigned: multiple WITH COMPONENTS constraints, compiling only the first
WNG: EtsiTs103097Module.verifyKeyIndicator: multiple WITH COMPONENTS constraints, compiling only the first
Traceback (most recent call last):
  File "/home/josef/.venvs/asn1-coer/bin/pycrate_asn1compile.py", line 7, in <module>
    exec(compile(f.read(), __file__, 'exec'))
  File "/home/josef/SW/pycrate-coer/tools/pycrate_asn1compile.py", line 258, in <module>
    sys.exit(main())
  File "/home/josef/SW/pycrate-coer/tools/pycrate_asn1compile.py", line 220, in main
    generate_modules(generator_class, args.output + '.py')
  File "/home/josef/SW/pycrate-coer/pycrate_asn1c/asnproc.py", line 1231, in generate_modules
    generator(destfile)
  File "/home/josef/SW/pycrate-coer/pycrate_asn1c/generator.py", line 43, in __init__
    self.gen()
  File "/home/josef/SW/pycrate-coer/pycrate_asn1c/generator.py", line 368, in gen
    self.gen_mod(GLOBAL.MOD[mod_name])
  File "/home/josef/SW/pycrate-coer/pycrate_asn1c/generator.py", line 408, in gen_mod
    self.gen_type(Obj)
  File "/home/josef/SW/pycrate-coer/pycrate_asn1c/generator.py", line 510, in gen_type
    elif Obj.TYPE in (TYPE_SEQ, TYPE_SET)           : self.gen_type_seq(Obj)
  File "/home/josef/SW/pycrate-coer/pycrate_asn1c/generator.py", line 692, in gen_type_seq
    self.gen_type(Cont, compts=True)
  File "/home/josef/SW/pycrate-coer/pycrate_asn1c/generator.py", line 510, in gen_type
    elif Obj.TYPE in (TYPE_SEQ, TYPE_SET)           : self.gen_type_seq(Obj)
  File "/home/josef/SW/pycrate-coer/pycrate_asn1c/generator.py", line 692, in gen_type_seq
    self.gen_type(Cont, compts=True)
  File "/home/josef/SW/pycrate-coer/pycrate_asn1c/generator.py", line 508, in gen_type
    elif Obj.TYPE == TYPE_CHOICE                    : self.gen_type_choice(Obj)
  File "/home/josef/SW/pycrate-coer/pycrate_asn1c/generator.py", line 659, in gen_type_choice
    self.gen_const_comps(Obj)
  File "/home/josef/SW/pycrate-coer/pycrate_asn1c/generator.py", line 959, in gen_const_comps
    del Obj._cont[ident]
  File "/home/josef/SW/pycrate-coer/pycrate_asn1c/dictobj.py", line 79, in __delitem__
    del self._dict[key]
KeyError: 'verificationKey'

The problematic bit is this part of the ASN.1: https://forge.etsi.org/rep/ITS/asn1/ieee1609.2/blob/master/IEEE1609dot2.asn#L247

This issue appears only after 09dc57f21917425707f2afedb84770431ebf65fc

JNevrly commented 4 years ago

Trying to understand the logic of the compiler, I guess the reason is that either:

  1. CHOICE type items get incorrectly marked as absent here.
  2. The treatment of absent components does not correctly treat CHOICE and just should check if the ident is in the _cont before trying to delete it.

However, even if I assume (2.) and just try to handle the exception, another compilation issue will pop up, that goes way beyond my grasp of this codebase. Therefore I'm just attaching the source files for the ITS security, that I was able to compile prior to 09dc57f.

its_security_reference.tar.gz

Command used for the compilation (from the unzipped directory):

pycrate_asn1compile.py -i IEEE1609dot2-2016a/*.asn TS103097_v1_3_1/*.asn TS102941_v1_3_1/*.asn -o security
p1-bmu commented 4 years ago

OK, I pushed few fixes. This helps to support WITH COMPONENTS constraint. At least the IEEE1609dot2 module can be compiled correctly.

This ITS specification case is however one of the biggest mess I ever checked in terms of ASN.1. We have:

Seriously, I am not going to add support for those constructs in pycrate anytime soon (too much work, too hard to debug) ; and I seriously doubt any free / open-source ASN.1 compiler would support those specs. Beside, I am curious to know if someone can successfully compile it with any open-source project (maybe the Erlang compiler ?).

Therefore I think your best chance here is:

JNevrly commented 4 years ago

Thanks for the quick fix again. I can confirm IEEE1609dot2 compiles and works (tested with COER codec against examples in IEEE 1609.2a specification). Anything above that does not compile in its current form.

TLDR Bad bug discovered

When I reduce the TS103097 to only EtsiTs103097Certificate, it compiles and runtime init is OK. I get runtime errors during decoding though, that's new and seems like a regression, because before the decoding worked fine (I used to be stopped on #109 which was only raised after successfull decode).

After bit of digging, it seems like that the new compilation method of WITH COMPONENTS in the recent commits results in different sets of mandatory (_root_mand) and optional (_root_opt) items. E.g. for EtsiTs103097Certificate, the original WITH COMPONENTS compilation had signature item from Ieee1609Dot2.ExplicitCertificate (which has PRESENT constraint) in _root_opt while the new one has everything in _root_mand. In that particular case it results in empty (zero-length) root preamble (no extension, everything is mandatory), so the decoder is offset by one byte and everything fails.

Now I don't really understand the implementation meaning of PRESENT/ABSENT flags in WITH COMPONENT, but isn't it something that should be considered for validation, but not for encoding/decoding? ASN.1 playground and the valid messages I have here for testing certainly behave that way.

In any case, I guess this is a regression that can have quite far-reaching consequences in other encodings like PER, and sadly it may not be captured by the PyCrate unit tests as there is currently no test SEQUENCE that would manifest that behaviour.

The rest

For the general ITS ASN.1 hate :wink:, I concur. I'll see what I can do. Even with just the EtsiTs103097Certificate I can do some stuff I need for my use-case, so with this bug fixed I'm partially good to go.

I've heard someone got the complete EtsiTs103097 working with some open-source compiler but I don't know the details (will check that and report). Commercial compiler is not out of the question for me, but my use case benefits from PyCrate's straightforward decoding into Python's dicts.

p1-bmu commented 4 years ago

Regarding the PRESENT / ABSENT flags, I don't either fully understand how to implement it ! ITU-T X.680 is quite cryptic and unclear (like often...) about it ; and regarding PER, no spec using this codec are implementing WITH COMPONENTS constraints hopefully :). So you may be right that simply removing component (or making them mandatory) is possibly wrong.

To sum up: PRESENT / ABSENT => should be implemented as a constraint in the runtime (and not as a change of objects' content), that should be verified before encoding and after decoding. Additional component constraint => should be implemented as an additional constraint in the runtime. Multiple WITH COMPONENTS alternatives => should be implemented as alternative constraints to be verified sequentially in the runtime, to ensure one of them is verified.

After all, the whole WITH COMPONENTS content should not be reduced / simplified, but transcribed almost as is, and entirely supported in the runtime. I'll see how I can do that correctly. At this time, I only reverted some part of the code generation process to not alter objects content. On your side, if you learn about any open-source compiler that fully support those ETSI ITS specs: please let me know.

JNevrly commented 4 years ago

I can confirm the last commit with reverted changes works correctly/as expected. Now I can compile, use and decode EtsiTs103097Certificate structure with the master branch. Thanks a lot for a quick action! I think this issue can be closed.

For the more general discussion about ITS specification parsing, I think it can be moved to #94 so it doesn't get lost. I think asn1c was able to compile it, but haven't tried myself yet. I'll try and report.