AdaCore / RecordFlux

Formal specification and generation of verifiable binary parsers, message generators and protocol state machines
Apache License 2.0
103 stars 6 forks source link

`KeyError: 'data'` converting IANA Constrained RESTful Environments (CoRE) Parameters #1297

Open mgrojo opened 1 month ago

mgrojo commented 1 month ago

As part of my master thesis, I have the intention of writing a CoAP library and an application in SPARK/Ada. Using RecordFlux seems thus as a good idea.

After trying to userflx convert iana for getting the conversion of number resources from the IANA specification for CoAP, I've got an exception in the tool.

Steps to reproduce

  1. Download the "Constrained RESTful Environments (CoRE) Parameters" in XML: https://www.iana.org/assignments/core-parameters/core-parameters.xml
  2. Run .venv/bin/rflx convert iana core-parameters.xml

Result

converter: warning: registry "Resource Type (rt=) Link Target Attribute Values" (id=rt-link-target-att-value) skipped due to conversion error: invalid literal for int() with base 10: 'core.gp'
converter: warning: registry "Interface Description (if=) Link Target Attribute Values" (id=if-link-target-att-value) skipped due to conversion error: invalid literal for int() with base 10: 'oic.if.baseline'
converter: warning: registry "CoAP Codes" (id=codes) skipped due to conversion error: invalid literal for int() with base 10: '0.00'
converter: warning: registry "CoAP Signaling Codes" (id=signaling-codes) skipped due to conversion error: invalid literal for int() with base 10: '7.00'
converter: warning: registry "CoAP Signaling Option Numbers" (id=signaling-option-numbers) skipped due to conversion error: invalid literal for int() with base 10: '7.01'
converter: warning: registry "Endpoint Type (et=) RD Parameter Values" (id=et-rd-parameter-value) skipped due to conversion error: invalid literal for int() with base 10: 'group'

------------------------------ RecordFlux Bug ------------------------------
RecordFlux 0.21.1.dev9+d8da4b23
attrs 23.2.0
defusedxml 0.7.1
icontract 2.6.6
importlib-resources 6.4.0
pydantic 2.6.4
pydotplus 2.0.2
pygls 1.3.0
ruamel.yaml 0.18.6
z3-solver 4.12.2.0

Optimized: False

Command: .venv/bin/rflx convert iana core-parameters.xml

Traceback (most recent call last):
  File "/home/mgr/src/github/AdaCore/RecordFlux/rflx/cli.py", line 464, in main
    args.func(args)
  File "/home/mgr/src/github/AdaCore/RecordFlux/rflx/cli.py", line 697, in convert_iana
    iana.convert(
  File "/home/mgr/src/github/AdaCore/RecordFlux/rflx/converter/iana.py", line 77, in convert
    write_rflx_specification(
  File "/home/mgr/src/github/AdaCore/RecordFlux/rflx/converter/iana.py", line 118, in write_rflx_specification
    file.write_text(package.rflx_str(), encoding="utf-8")
  File "/home/mgr/src/github/AdaCore/RecordFlux/rflx/converter/iana.py", line 244, in rflx_str
    str_repr += "".join(r.rflx_str(indent_level + 1) for r in self.entries)
  File "/home/mgr/src/github/AdaCore/RecordFlux/rflx/converter/iana.py", line 244, in <genexpr>
    str_repr += "".join(r.rflx_str(indent_level + 1) for r in self.entries)
  File "/home/mgr/src/github/AdaCore/RecordFlux/rflx/converter/iana.py", line 264, in rflx_str
    formatted_enum_literals = ",\n".join(
  File "/home/mgr/src/github/AdaCore/RecordFlux/rflx/converter/iana.py", line 265, in <genexpr>
    r.rflx_str(indent_level + 1) for r in self.enum_literals
  File "/home/mgr/src/github/AdaCore/RecordFlux/rflx/converter/iana.py", line 319, in rflx_str
    comment = self.comment(indent_level)
  File "/home/mgr/src/github/AdaCore/RecordFlux/rflx/converter/iana.py", line 297, in comment
    comments = [
  File "/home/mgr/src/github/AdaCore/RecordFlux/rflx/converter/iana.py", line 301, in <listcomp>
    else f"Ref: {c.attrib['data']}"
KeyError: 'data'

----------------------------------------------------------------------------

A bug was detected. Please report this issue on GitHub:

https://github.com/AdaCore/RecordFlux/issues/new?labels=bug

Include the complete content of the bug box shown above and all input files
in the report.

Workaround

After a bit of debugging, I've seen that removing this node in the XML, the exception is avoided and the rflx code is generated.


  <registry id="core-standard-problem-detail-keys">
    <title>Standard Problem Detail Keys</title>
    <xref type="rfc" data="rfc9290"/>
    <expert>Thomas Fossati, Carsten Bormann</expert>
    <registration_rule>Specification Required</registration_rule>

    <record date="2022-07-12">
      <value>-1</value>
      <name>title</name>
      <type>text / tag38</type>
      <description>Short, human-readable summary of the problem shape</description>
      <controller>IETF (core@ietf.org or art@ietf.org)</controller>
      <xref type="rfc" data="rfc9290"/>
    </record>
    <record date="2022-07-12">
      <value>-2</value>
      <name>detail</name>
      <type>text / tag38</type>
      <description>Human-readable explanation specific to this occurrence of the problem</description>
      <controller>IETF (core@ietf.org or art@ietf.org)</controller>
      <xref type="rfc" data="rfc9290"/>
    </record>
    <record date="2022-07-12">
      <value>-3</value>
      <name>instance</name>
      <type>~uri</type>
      <description>URI reference identifying specific occurrence of the problem</description>
      <controller>IETF (core@ietf.org or art@ietf.org)</controller>
      <xref type="rfc" data="rfc9290"/>
    </record>
    <record date="2022-07-12">
      <value>-4</value>
      <name>response-code</name>
      <type>uint .size 1</type>
      <description>CoAP response code</description>
      <controller>IETF (core@ietf.org or art@ietf.org)</controller>
      <xref type="rfc" data="rfc9290"/>
    </record>
    <record date="2022-07-12">
      <value>-5</value>
      <name>base-uri</name>
      <type>~uri</type>
      <description>Base URI</description>
      <controller>IETF (core@ietf.org or art@ietf.org)</controller>
      <xref type="rfc" data="rfc9290"/>
    </record>
    <record date="2022-07-12">
      <value>-6</value>
      <name>base-lang</name>
      <type>tag38-ltag</type>
      <description>Base language tag (see <xref type="rfc" data="rfc9290">RFC9290, Appendix A</xref>)</description>
      <controller>IETF (core@ietf.org or art@ietf.org)</controller>
      <xref type="rfc" data="rfc9290"/>
    </record>
    <record date="2022-07-12">
      <value>-7</value>
      <name>base-rtl</name>
      <type>tag38-direction</type>
      <description>Base writing direction (see <xref type="rfc" data="rfc9290">RFC9290, Appendix A</xref>)</description>
      <controller>IETF (core@ietf.org or art@ietf.org)</controller>
      <xref type="rfc" data="rfc9290"/>
    </record>
    <record date="2022-07-13">
      <value>-8</value>
      <name>unprocessed-coap-option</name>
      <type>one-or-more&lt;uint&gt;</type>
      <description>Option number(s) of CoAP option(s) that were not understood</description>
      <controller>IETF (core@ietf.org or art@ietf.org)</controller>
      <xref type="rfc" data="rfc9290">RFC9290, Section 3.1.1</xref>
    </record>
    <record date="2023-02-15">
      <value>-25</value>
      <name>request-body-error-position</name>
      <type>uint</type>
      <description>Byte index inside the request body at which the error became apparent</description>
      <controller><xref type="person" data="CoRE_WG"/></controller>
      <xref type="draft" data="draft-amsuess-core-pd-body-error-position-00"/>
    </record>
  </registry>
treiher commented 1 month ago

Thanks for letting us know! We will look into this issue.

As part of my master thesis, I have the intention of writing a CoAP library and an application in SPARK/Ada. Using RecordFlux seems thus as a good idea.

Great that you are considering using RecordFlux for this purpose! We took a look at the CoAP message format a while back. As I recall, its creators made some nasty design choices. While we aim to be able to specify all common communication protocols, it may be difficult or even impossible to fully specify all aspects of CoAP with RecordFlux at this stage. We would be very interested in your findings on this endeavor.

mgrojo commented 4 weeks ago

We took a look at the CoAP message format a while back. As I recall, its creators made some nasty design choices.

Thanks for the information.

Searching for CoAP in the repository, I've found an issue affecting the protocol (#61) and an attempted implementation in the coap branch.

While we aim to be able to specify all common communication protocols, it may be difficult or even impossible to fully specify all aspects of CoAP with RecordFlux at this stage. We would be very interested in your findings on this endeavor.

Besides the issue with the end markers, are there any other problems that I'll have to face? Does this mean that I could only define a partial model that would require later manual modification after the generation to work around the problems? I assume that, even with the need of some workarounds, the use of RecorFlux versus a direct implementation in SPARK is the best approach. What's your opinion on this?

treiher commented 3 weeks ago

Besides the issue with the end markers, are there any other problems that I'll have to face?

Another issue could be the special cases for the Option Length field, where there is a separate 8-bit or 16-bit length field depending on the value. Adding separate Option Value fields where the field length is calculated differently for these special cases might be an acceptable workaround.

Does this mean that I could only define a partial model that would require later manual modification after the generation to work around the problems? I assume that, even with the need of some workarounds, the use of RecorFlux versus a direct implementation in SPARK is the best approach. What's your opinion on this?

Modifying the generated code correctly could be very hard and I wouldn't recommend it. I think doing the modifications that are necessary to support a sequence with end marker (#61) could be doable, but it definitely wouldn't be simple. I wouldn't even try to solve the above-mentioned problem with the Option Length in this way.

A workaround could be to use a RecordFlux specification that omits details that cannot be represented and implement a manual parser for these parts. For example, you could use an opaque field for the Options | Marker | Payload part in the RecordFlux specification and use a manually implemented parser that parses an Option and checks if the Marker is present before parsing the next Option. For the actual parsing of a single Option, you could still try using a RecordFlux generated parser. This approach might be a viable workaround if you plan to interact with the generated message parsers/serializers with manually written code anyway.

mgrojo commented 3 weeks ago

Thanks for all the information. I'll take all of this into account when I start the implementation.