GaloisInc / saw-script

The SAW scripting language.
BSD 3-Clause "New" or "Revised" License
442 stars 63 forks source link

Bump crucible to get the Const_RefRoots fix. #2107

Closed sauclovian-g closed 2 months ago

sauclovian-g commented 2 months ago

Includes the test from the issue.

Fixes #2064.

RyanGlScott commented 2 months ago

Looking good!

Note that the MIR-related test cases in this repo have checked in their MIR JSON files to version control so that the test suite can be ran without needing to install mir-json beforehand. The downside to this approach is that these JSON files are still using the old mir-json schema (prior to the changes in https://github.com/GaloisInc/mir-json/pull/73 and https://github.com/GaloisInc/crucible/pull/1243), so they will no longer parse with the new version of crucible-mir. This can be seen in:

  1. The saw integration tests (failing CI log here)
  2. The saw-remote-api tests (failing CI log here)

(I'm a bit undecided about whether we want to check in all of the MIR JSON files to version control going forward, but that's the way it's currently done.)

In both cases, the most direct solution would be to generate the MIR JSON files. They are located in the following places:

  1. In the saw integration tests: All of the directories that are shown when running the find intTests/ -iname "*.linked-mir.json" command
  2. In the saw-remote-api tests: In this directory

In each location, there should be a Makefile that can be used to regenerate the JSON files. Please let me know if you have any difficulties.

sauclovian-g commented 2 months ago

Bleah, I should have thought of that. After all, I noticed the json files were checked in...

(btw there's a third set in the rust tutorial)

It occurs to me, also, that being able to diff them after a change like this might be worth the space overhead of including newlines in the output...

RyanGlScott commented 2 months ago

Ah, one last thing: you'll have to update these lines to reflect the new disambiguator values:

 // Test using fully disambiguated names
-mir_verify m "test/775505e0::id_u8" [] false (id_spec mir_u8) z3;
-mir_verify m "test/775505e0::id_u8[0]" [] false (id_spec mir_u8) z3;
+mir_verify m "test/ff7472fa::id_u8" [] false (id_spec mir_u8) z3;
+mir_verify m "test/ff7472fa::id_u8[0]" [] false (id_spec mir_u8) z3;
sauclovian-g commented 2 months ago

I have done so. I wouldn't have thought this would have changed much of anything here, though. So I have investigated, which may or may not have been worthwhile; the change from 775505e0 to ff7472fa covers the whole compilation and isn't particularly interesting.

What's different though is that the original has two extra copies of id_array, specialized to take no args and return the array size (one each for the argument type and return type sizes, even though they're the same); these have disappeared. Is this because of a different compiler version (or just because it's Olde), which probably doesn't matter, or reflect something about compiler options or modes, which may?

RyanGlScott commented 2 months ago

tl;dr These changes are benign.

What's different though is that the original has two extra copies of id_array, specialized to take no args and return the array size (one each for the argument type and return type sizes, even though they're the same); these have disappeared.

In more concrete terms: before this PR, id_array was compiled to the following MIR:

fn test/775505e0::id_array[0](_1 : [u32; 5]) -> [u32; 5] {
   let mut _0 : [u32; 5];
   bb0: {
      _0 = use(_1);
      return;
   }
}
fn test/775505e0::id_array[0]::{constant#0}[0]() -> usize {
   let mut _0 : usize;
   bb0: {
      _0 = use(5);
      return;
   }
}
fn test/775505e0::id_array[0]::{constant#1}[0]() -> usize {
   let mut _0 : usize;
   bb0: {
      _0 = use(5);
      return;
   }
}

I believe id_array[0]::{constant#0} and id_array[0]::{constant#1} represent the lengths of the first and second array types, respectively, in id_array's type signature. In this small program, they are not referenced anywhere else in the generated MIR, although it is conceivable that one could cook up a more complicated program where they are needed.

After this PR, however, neither id_array[0]::{constant#0} and id_array[0]::{constant#1} appear at all in the generated MIR. This is expected behavior, and these changes are benign. These changes aren't due to https://github.com/GaloisInc/mir-json/pull/73, but rather due to a different commit (https://github.com/GaloisInc/mir-json/pull/56) that was brought in by recompiling this program with a more recent version of mir-json. (That commit didn't change the JSON schema in any way, so we didn't bother to recompile SAW's test cases at the time.)

As for why that commit changes the MIR JSON output: https://github.com/GaloisInc/mir-json/pull/56 causes saw-rustc only to mark functions as roots when performing dead-code elimination. Since id_array[0]::{constant#0} and id_array[0]::{constant#1} are considered top-level constants rather than functions at the MIR level, they are no longer marked as roots. Moreover, since no other definition references them, they are now discarded during mir-json's dead-code elimination pass.

sauclovian-g commented 2 months ago

Good to know, thanks. So I think we can go ahead here...