AdaCore / RecordFlux

Formal specification and generation of verifiable binary parsers, message generators and protocol state machines
Apache License 2.0
103 stars 6 forks source link

Generate types instead of generic packages for sessions #768

Closed treiher closed 2 years ago

treiher commented 2 years ago

Context and Problem Statement

Currently, for each session, a generic package is generated. The parameters of a session (channels and functions) are mapped to formal parameters of the generic package. This may not be the optimal solution.

Considered Options

O1 Use types instead of generic packages

A type could be used to hold the state of the session state machine. The type needs to be passed to each subprogram which interacts with the state machine (e.g., Run for executing the state machine). The session parameters (channels and functions) could probably be represented by abstract methods of a tagged type.

See https://github.com/senier/spark_mixins_experiment for a SPARK example.

+ No need for global variables + Lower binary size for parallel state machines

O2 Keep generic packages

+ No changes necessary Multiple instances of generic package increase binary size

Decision Outcome

O1

treiher commented 2 years ago

The transition from generic packages to tagged types led to unprovable code in some cases. The reason seems to be an extremely increased proof time. That is not only the case for the DHCP client example, but also for some simpler test cases like tests/integration/feature_test.py::test_provability[session_comprehension_on_sequence], which failed after running for about one hour. Some test cases can be proved in a reasonable time. @kanigsson Could you please have a look at branch issue_768 to find out if there is fundamental issue in the generated code or if there is a way to improve the situation?

treiher commented 2 years ago

@kanigsson Proving the feature test mentioned above fails with the following message using the default configuration:

rflx-test-session.adb:113:08: medium: postcondition might fail, cannot prove Test.Session_Allocator.Global_Allocated (Ctx.P.Slots)
  113 |       Initialized (Ctx)
      |       ^~~~~~~~~~~~~~~~
kanigsson commented 2 years ago

I noticed that for many VCs, Z3 is needed, but Z3 invariably takes close to 40s to prove the goal. Looking closer, I found that even if I introduce a typing error into the SMT file, Z3 still takes 40s to report the typing error, which makes me suspect a bug/inefficiency in the preprocessing of the VC, not the actual proving.

The version of Z3 in SPARK 22.x is 4.8.12. Subsequent versions of Z3 such as 4.8.14 don't have that bug and prove most of those VCs immediately (I tried the VCs in rflx-test-session.adb, and Z3 4.8.14 proves all of them, most of them immediately, the longest being 10s).

Independently, I found SPARK wavefront (23.x) to be slightly better than 22.x, in that CVC4 proves slightly more goals. But it still contain the same faulty z3 version 4.8.12, so when Z3 is needed, we take the 40s penalty (per VC).

If we update the Z3 version in SPARK wavefront, could you upgrade the Recordflux CI to SPARK wavefront? Alternatively, we could "side-load" the newer Z3, but every user of Recordflux would have to do the same.

treiher commented 2 years ago

As discussed separately, it is fine to only support a SPARK wavefront as a temporary solution, but we count on supporting a SPARK Community with a fixed Z3 version for the next RecordFlux release (probably in Q2). In cases where a wavefront is not acceptable, we could still use the side-loading of a newer Z3 as an alternative.

kanigsson commented 2 years ago

OK, I am preparing the update to Z3 4.8.14 in SPARK wavefronts. It might take several days. In the meantime the easiest is probably to replace the z3 in the SPARK install by a newer version by hand, so you can continue your work.

treiher commented 2 years ago

Tagged types seem to require elaboration code. I get an error: violation of restriction "NO_ELABORATION_CODE" for the newly introduced tagged type. If I remove the tagged from the record, the error vanishes.

senier commented 2 years ago

I wouldn't think they generally do. @jklmnn Do you know this?

jklmnn commented 2 years ago

So I just took a look and creating a tagged type seems to create a lot of code (not too surprising, as we need dynamic dispatching). I have the following unit:

package Tag2
is

   type R is null record;

end Tag2;

This generates the following output:

obj/tag2.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <tag2__rIP>:
tag2__rIP():
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   5d                      pop    %rbp
   5:   c3                      retq

With R being tagged I get the following:

obj/tag2.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <tag2__rIP>:
tag2__rIP():
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
   8:   89 f0                   mov    %esi,%eax
   a:   88 45 f4                mov    %al,-0xc(%rbp)
   d:   80 7d f4 00             cmpb   $0x0,-0xc(%rbp)
  11:   74 0c                   je     1f <tag2__rIP+0x1f>
  13:   ba 00 00 00 00          mov    $0x0,%edx
  18:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  1c:   48 89 10                mov    %rdx,(%rax)
  1f:   5d                      pop    %rbp
  20:   c3                      retq   
  21:   90                      nop

0000000000000022 <tag2___size>:
tag2___size():
  22:   55                      push   %rbp
  23:   48 89 e5                mov    %rsp,%rbp
  26:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
  2a:   b8 40 00 00 00          mov    $0x40,%eax
  2f:   5d                      pop    %rbp
  30:   c3                      retq   
  31:   90                      nop

0000000000000032 <tag2__rPI>:
tag2__rPI():
  32:   55                      push   %rbp
  33:   48 89 e5                mov    %rsp,%rbp
  36:   53                      push   %rbx
  37:   48 83 ec 18             sub    $0x18,%rsp
  3b:   48 89 7d e8             mov    %rdi,-0x18(%rbp)
  3f:   48 89 75 e0             mov    %rsi,-0x20(%rbp)
  43:   b8 00 00 00 00          mov    $0x0,%eax
  48:   ba 00 00 00 00          mov    $0x0,%edx
  4d:   48 89 c1                mov    %rax,%rcx
  50:   48 89 d3                mov    %rdx,%rbx
  53:   48 8b 45 e8             mov    -0x18(%rbp),%rax
  57:   48 89 ce                mov    %rcx,%rsi
  5a:   48 89 c7                mov    %rax,%rdi
  5d:   e8 00 00 00 00          callq  62 <tag2__rPI+0x30>
  62:   48 8b 5d f8             mov    -0x8(%rbp),%rbx
  66:   c9                      leaveq 
  67:   c3                      retq   

0000000000000068 <tag2__rSR>:
tag2__rSR():
  68:   55                      push   %rbp
  69:   48 89 e5                mov    %rsp,%rbp
  6c:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
  70:   48 89 75 f0             mov    %rsi,-0x10(%rbp)
  74:   89 55 ec                mov    %edx,-0x14(%rbp)
  77:   5d                      pop    %rbp
  78:   c3                      retq   
  79:   90                      nop

000000000000007a <tag2__rSW>:
tag2__rSW():
  7a:   55                      push   %rbp
  7b:   48 89 e5                mov    %rsp,%rbp
  7e:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
  82:   48 89 75 f0             mov    %rsi,-0x10(%rbp)
  86:   89 55 ec                mov    %edx,-0x14(%rbp)
  89:   5d                      pop    %rbp
  8a:   c3                      retq   
  8b:   90                      nop

000000000000008c <tag2__rSI>:
tag2__rSI():
  8c:   55                      push   %rbp
  8d:   48 89 e5                mov    %rsp,%rbp
  90:   53                      push   %rbx
  91:   48 83 ec 28             sub    $0x28,%rsp
  95:   48 89 7d d8             mov    %rdi,-0x28(%rbp)
  99:   89 75 d4                mov    %esi,-0x2c(%rbp)
  9c:   b8 02 00 00 00          mov    $0x2,%eax
  a1:   83 7d d4 02             cmpl   $0x2,-0x2c(%rbp)
  a5:   0f 4e 45 d4             cmovle -0x2c(%rbp),%eax
  a9:   89 c3                   mov    %eax,%ebx
  ab:   48 8d 45 e8             lea    -0x18(%rbp),%rax
  af:   be 01 00 00 00          mov    $0x1,%esi
  b4:   48 89 c7                mov    %rax,%rdi
  b7:   e8 00 00 00 00          callq  bc <tag2__rSI+0x30>
  bc:   48 8d 4d e8             lea    -0x18(%rbp),%rcx
  c0:   48 8b 45 d8             mov    -0x28(%rbp),%rax
  c4:   89 da                   mov    %ebx,%edx
  c6:   48 89 ce                mov    %rcx,%rsi
  c9:   48 89 c7                mov    %rax,%rdi
  cc:   e8 00 00 00 00          callq  d1 <tag2__rSI+0x45>
  d1:   bf 08 00 00 00          mov    $0x8,%edi
  d6:   e8 00 00 00 00          callq  db <tag2__rSI+0x4f>
  db:   48 89 c1                mov    %rax,%rcx
  de:   48 8b 55 e8             mov    -0x18(%rbp),%rdx
  e2:   48 89 11                mov    %rdx,(%rcx)
  e5:   48 8b 5d f8             mov    -0x8(%rbp),%rbx
  e9:   c9                      leaveq 
  ea:   c3                      retq   
  eb:   90                      nop

00000000000000ec <tag2__rSO>:
tag2__rSO():
  ec:   55                      push   %rbp
  ed:   48 89 e5                mov    %rsp,%rbp
  f0:   48 83 ec 20             sub    $0x20,%rsp
  f4:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
  f8:   48 89 75 f0             mov    %rsi,-0x10(%rbp)
  fc:   89 55 ec                mov    %edx,-0x14(%rbp)
  ff:   b8 02 00 00 00          mov    $0x2,%eax
 104:   83 7d ec 02             cmpl   $0x2,-0x14(%rbp)
 108:   0f 4e 45 ec             cmovle -0x14(%rbp),%eax
 10c:   89 c2                   mov    %eax,%edx
 10e:   48 8b 4d f0             mov    -0x10(%rbp),%rcx
 112:   48 8b 45 f8             mov    -0x8(%rbp),%rax
 116:   48 89 ce                mov    %rcx,%rsi
 119:   48 89 c7                mov    %rax,%rdi
 11c:   e8 00 00 00 00          callq  121 <tag2__rSO+0x35>
 121:   c9                      leaveq 
 122:   c3                      retq   
 123:   90                      nop

0000000000000124 <tag2__Oeq>:
tag2__Oeq():
 124:   55                      push   %rbp
 125:   48 89 e5                mov    %rsp,%rbp
 128:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
 12c:   48 89 75 f0             mov    %rsi,-0x10(%rbp)
 130:   b8 01 00 00 00          mov    $0x1,%eax
 135:   5d                      pop    %rbp
 136:   c3                      retq   
 137:   90                      nop

0000000000000138 <tag2___assign>:
tag2___assign():
 138:   55                      push   %rbp
 139:   48 89 e5                mov    %rsp,%rbp
 13c:   48 81 ec 18 02 00 00    sub    $0x218,%rsp
 143:   48 89 bd 78 fd ff ff    mov    %rdi,-0x288(%rbp)
 14a:   48 89 b5 70 fd ff ff    mov    %rsi,-0x290(%rbp)
 151:   48 8b 95 78 fd ff ff    mov    -0x288(%rbp),%rdx
 158:   48 8b 85 70 fd ff ff    mov    -0x290(%rbp),%rax
 15f:   48 39 c2                cmp    %rax,%rdx
 162:   74 28                   je     18c <tag2___assign+0x54>
 164:   48 8b 85 78 fd ff ff    mov    -0x288(%rbp),%rax
 16b:   48 8b 10                mov    (%rax),%rdx
 16e:   48 8b 85 78 fd ff ff    mov    -0x288(%rbp),%rax
 175:   48 8b 8d 70 fd ff ff    mov    -0x290(%rbp),%rcx
 17c:   48 8b 09                mov    (%rcx),%rcx
 17f:   48 89 08                mov    %rcx,(%rax)
 182:   48 8b 85 78 fd ff ff    mov    -0x288(%rbp),%rax
 189:   48 89 10                mov    %rdx,(%rax)
 18c:   c9                      leaveq 
 18d:   c3                      retq   

000000000000018e <tag2__rDA>:
tag2__rDA():
 18e:   55                      push   %rbp
 18f:   48 89 e5                mov    %rsp,%rbp
 192:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
 196:   89 f0                   mov    %esi,%eax
 198:   88 45 f4                mov    %al,-0xc(%rbp)
 19b:   5d                      pop    %rbp
 19c:   c3                      retq   
 19d:   90                      nop

000000000000019e <tag2__rDF>:
tag2__rDF():
 19e:   55                      push   %rbp
 19f:   48 89 e5                mov    %rsp,%rbp
 1a2:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
 1a6:   89 f0                   mov    %esi,%eax
 1a8:   88 45 f4                mov    %al,-0xc(%rbp)
 1ab:   5d                      pop    %rbp
 1ac:   c3                      retq   
 1ad:   90                      nop

00000000000001ae <tag2__TrCFD>:
tag2__TrCFD():
 1ae:   55                      push   %rbp
 1af:   48 89 e5                mov    %rsp,%rbp
 1b2:   48 83 ec 30             sub    $0x30,%rsp
 1b6:   48 89 7d d8             mov    %rdi,-0x28(%rbp)
 1ba:   48 8b 45 d8             mov    -0x28(%rbp),%rax
 1be:   48 8b 00                mov    (%rax),%rax
 1c1:   48 89 45 f8             mov    %rax,-0x8(%rbp)
 1c5:   48 8b 45 f8             mov    -0x8(%rbp),%rax
 1c9:   48 89 45 f0             mov    %rax,-0x10(%rbp)
 1cd:   48 8b 45 f0             mov    -0x10(%rbp),%rax
 1d1:   48 c7 45 e8 18 00 00    movq   $0x18,-0x18(%rbp)
 1d8:   00 
 1d9:   48 8b 55 e8             mov    -0x18(%rbp),%rdx
 1dd:   48 29 d0                sub    %rdx,%rax
 1e0:   48 8b 00                mov    (%rax),%rax
 1e3:   48 8b 40 40             mov    0x40(%rax),%rax
 1e7:   48 8b 55 d8             mov    -0x28(%rbp),%rdx
 1eb:   48 89 c1                mov    %rax,%rcx
 1ee:   83 e1 01                and    $0x1,%ecx
 1f1:   48 85 c9                test   %rcx,%rcx
 1f4:   74 08                   je     1fe <tag2__TrCFD+0x50>
 1f6:   4c 8b 50 ff             mov    -0x1(%rax),%r10
 1fa:   48 8b 40 07             mov    0x7(%rax),%rax
 1fe:   be 01 00 00 00          mov    $0x1,%esi
 203:   48 89 d7                mov    %rdx,%rdi
 206:   ff d0                   callq  *%rax
 208:   c9                      leaveq 
 209:   c3                      retq   

000000000000020a <tag2__finalize_spec>:
tag2__finalize_spec():
 20a:   55                      push   %rbp
 20b:   48 89 e5                mov    %rsp,%rbp
 20e:   b8 00 00 00 00          mov    $0x0,%eax
 213:   48 89 c7                mov    %rax,%rdi
 216:   e8 00 00 00 00          callq  21b <tag2__finalize_spec+0x11>
 21b:   5d                      pop    %rbp
 21c:   c3                      retq   
 21d:   90                      nop

000000000000021e <tag2___elabs>:
tag2___elabs():
/.../tag2.ads:1
package Tag2
 21e:   55                      push   %rbp
 21f:   48 89 e5                mov    %rsp,%rbp
/.../tag2.ads:4
is

   type R is tagged null record;
 222:   bf 00 00 00 00          mov    $0x0,%edi
 227:   e8 00 00 00 00          callq  22c <tag2___elabs+0xe>
 22c:   b8 00 00 00 00          mov    $0x0,%eax
 231:   48 89 c7                mov    %rax,%rdi
 234:   e8 00 00 00 00          callq  239 <tag2___elabs+0x1b>
/.../tag2.ads:6

end Tag2;
 239:   5d                      pop    %rbp
 23a:   c3                      retq

This also includes elaboration code (probably to build the vtable?). Note that this has been compiled without optimizations, so the optimized code might be a lot smaller.

senier commented 2 years ago

This also includes elaboration code (probably to build the vtable?).

Can you please check (with the experts) whether tagged types always require elaboration code? My point was: we just got rid of it for a reason and having to introduce elaboration code again would be somewhat annoying...

jklmnn commented 2 years ago

pragma Restrictions (No_Elaboration_Code) can be combined with pragma Restrictions (No_Tagged_Type_Registration). This seems to require a fairly new GNAT (I tested it with a wavefront, you may at least need 22.0).

treiher commented 2 years ago

The No_Tagged_Type_Registration restriction is neither supported by 22.0 nor by the RC of 22.1. We currently do not support 23.0w. @senier Should #905 be prioritized to be able to use that restriction?

senier commented 2 years ago

@senier Should #905 be prioritized to be able to use that restriction?

Yes, we'll likely also need it for #908...

treiher commented 2 years ago

The SPARK wavefront leads to a few unproven VCs that are unrelated to the introduced tagged types. For example:

rflx-rflx_generic_types.adb:279:25: medium: assertion might fail, in instantiation at rflx-custom_types_tests.adb:75
  279 |         pragma Assert (LME_Size = Value_Size - RME_Size - Natural (RME_Index - LME_Index - 1) * Byte'Size);
      |                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There is also a bug box in multiple tests:

+===========================GNAT BUG DETECTED==============================+
| Pro 23.0w (20220128) (spark)
                                                                           |
| No source file position information available                            |
| Compiling /home/runner/work/RecordFlux/RecordFlux/tests/spark/generated/rflx-ipv4-packet.adb|

A full list of issues can be found here: https://github.com/Componolit/RecordFlux/actions/runs/1773673788.

@kanigsson Could you please take care of that? That is the highest priority at the moment.

kanigsson commented 2 years ago

For the bugbox, I think it's related to memory consumption. I put a simple fix in review which should fix at least the in_ethernet test, on ticket #920 (branch issue_920).

kanigsson commented 2 years ago

For the failed proof in the bit manipulation code, I pushed a change on issue #921 (branch issue_921).

kanigsson commented 2 years ago

As I said, the bug boxes are likely due to running out of memory (I saw processes consuming up to 15GB ...). I traced this back to a change we made in november, where SPARK inlines predicates more aggressively. This doesn't work at all with the large predicates that RecordFlux generates. I will discuss this with the SPARK team.