Open fvrmatteo opened 1 year ago
I think the idea of the 64 bit loads/stores was that I didn't have a reliable uint18_t type years ago, and I didn't want to risk it turning into an intrinsic function that took the address of a 128-bit value, and thus took the address of a state structure register, indirectly leading to state escape that could hinder optimization.
You could prototype an actual int18_t (or beyond) memory access intrinsic and see if you get better code.
Similarly, a very early decision in remill was that I wanted to make the code as easy to run in klee or custom tools, so I didn't want to have to support the myriad vector types, and hence used arrays for everything. This might be worth revisiting, e.g. using __attribute__((vector_size(N)))
.
I think the 64 bits load/store instructions are actually fine for now because they match the logic used by the rest of the semantics (e.g. LDP SIMD semantics) and are easier to work with KLEE/custom tools.
Although the AArch64 manual is specifically showing that this instruction would be executing two 128 bits load/store instructions, hence why I had a doubt. This would need to be a long process thats makes sure all the instructions (that work on SIMD registers and memory) use the properly sized memory accesses.
@fvrmatteo Can you investigate re-enabling this test (and possibly see if it's sane)?
https://github.com/lifting-bits/remill/blob/master/tests/AArch64/DATAXFER/STP_n_LDSTPAIR_OFF.S#L67
The test itself looks fine and I re-enabled it, although I cannot really run it on M1.
@fvrmatteo can you investigate the hacks that @tetsuo-cpp tried, and see if it's possible to get M1 support by removing the call frame information dwarf things, i.e. .cfi_startproc
et al., from the test macros?
@pgoodman I did some changes to the AArch64 tests, but I'm a bit wandering in the dark because I don't know how GTest works. The changes I did in the last commit enable the compilation of the AArch64 tests on MacOS ARM64, although when I run them manually I get the following error:
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from GoogleTestVerification
[ RUN ] GoogleTestVerification.UninstantiatedParameterizedTestSuite<InstrTest>
/Users/matteo/Documents/remill/tests/AArch64/Run.cpp:762: Failure
Parameterized test suite InstrTest is defined via TEST_P, but never instantiated. None of the test cases will run. Either no INSTANTIATE_TEST_SUITE_P is provided or the only ones provided expand to nothing.
Ideally, TEST_P definitions should only ever be included as part of binaries that intend to use them. (As opposed to, for example, being placed in a library that may be linked in to get other utilities.)
To suppress this error for this test suite, insert the following line (in a non-header) in the namespace it is defined in:
GTEST_ALLOW_UNINSTANTIATED_PARAMETERIZED_TEST(InstrTest);
[ FAILED ] GoogleTestVerification.UninstantiatedParameterizedTestSuite<InstrTest> (0 ms)
[----------] 1 test from GoogleTestVerification (0 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (0 ms total)
[ PASSED ] 0 tests.
[ FAILED ] 1 test, listed below:
[ FAILED ] GoogleTestVerification.UninstantiatedParameterizedTestSuite<InstrTest>
It is unexpected because in the Run.cpp
file I can see the call to INSTANTIATE_TEST_SUITE_P
for InstrTest
, so I'm not sure why it isn't detected.
I also turned -O1
into -O0
temporarily because compiling Remill with LLVM-14 is going to fail the compilation due to the LoopLoadElimination
pass (and possibly others) not properly handling the opaque pointers; these issues are solved in LLVM-15, so we'll be able to use -O1
again.
@fvrmatteo Does Clang's assembler support the adrl
pseudo-op? Or maybe the @pageoff
macro can always include the :lo12:
in it?
Otherwise, I am not sure why that comes up with the tests. I will try to look into it.
@pgoodman As far as I see and from my tests, the adrl
pseudo operation is not supported by Clang and some migration guides are suggesting to define a macro for it when moving from GAS syntax for example. I included the :lo12:
into the SYMBOL_PAGEOFF
macro as requested.
Please let me know if you figure out the issue with GTest, in the meantime I'll try figuring out the same.
Please let me know if you figure out the issue with GTest, in the meantime I'll try figuring out the same.
Hey @fvrmatteo, I'm taking a look. I'll let you know what I find.
Ok, I have an idea of what's going on.
The reason this is happening is not because there is no INSTANTIATE_TEST_SUITE_P
macro (there is), it's because gTests
is empty. This data ultimately comes from __aarch64_test_table
defined in Tests.S
.
Curiously, compilation succeeds without an issue however, if you dump the contents of the Tests.S.o
object file, it shows that the __arch64_test_table
section is empty and that __aarch64_test_table_begin
is pointing at the same spot as __aarch64_test_table_end
.
I believe this is because the pre-processor macros that we use to populate the test table aren't portable across assemblers: more specifically, the GAS syntax uses ;
as a statement delimiter whereas the Clang ARM assembler uses it as the beginning of a comment. So effectively, the assembler on my M1 was treating everything after the first statement of this block (just a .section
directive) as part of a comment. That's why there's no error raised by the assembler but the test table is unpopulated. If I dump the pre-processed assembly file to disk and then manually replace the semicolons with newlines, I start to see the test table getting populated.
At the moment, I'm trying to figure out whether there is a character that isn't a newline that can be used as a statement delimiter in the Clang ARM assembler syntax or whether there is some .syntax
directive I can use to get around this.
@tetsuo-cpp In the past, I've copied what DynamoRIO does and used @N@
as a delimiter, then ran a post-processing script to fix them up 🤮 https://github.com/Granary/granary/blob/master/scripts/post_process_asm.py#L12
@tetsuo-cpp In the past, I've copied what DynamoRIO does and used
@N@
as a delimiter, then ran a post-processing script to fix them up 🤮 https://github.com/Granary/granary/blob/master/scripts/post_process_asm.py#L12
Haha it's ugly... but it works. Done in ae8f8c9.
Now I'm seeing a few test failures but at least the tests are running. Weird thing is that the number of tests failing is different each time so there's some flakiness going on. I don't have time to look at tonight though.
Had a quick look at this today but still not 100% sure. From what I can see, there are a two different types of mismatches that may or may not be related:
The first is a mismatch on system registers. Flags like IXC, OFC and UFC are seeing mismatches. This happens across multiple test cases.
[ RUN ] GeneralInstrTest/InstrTest.SemanticsMatchNative/fadd_s_floatdp2_2
E20221120 18:27:52.705013 8271463 Run.cpp:686] States did not match for fadd_s_floatdp2_2 with X0=3fffffff, X1=40000000 and N=0, Z=0, C=0, V=0
/Users/tetsuo/Code/remill/tests/AArch64/Run.cpp:687: Failure
Value of: !"Lifted and native states did not match."
Actual: false
Expected: true
/Users/tetsuo/Code/remill/tests/AArch64/Run.cpp:723: Failure
Expected equality of these values:
lifted_state->sr.ixc
Which is: '\0'
native_state->sr.ixc
Which is: '\x1' (1)
/Users/tetsuo/Code/remill/tests/AArch64/Run.cpp:725: Failure
Expected equality of these values:
lifted_state->sr.ufc
Which is: '\x1' (1)
native_state->sr.ufc
Which is: '\0'
E20221120 18:27:52.705075 8271463 Run.cpp:738] Bytes at offset 1145 are different
E20221120 18:27:52.705098 8271463 Run.cpp:738] Bytes at offset 1149 are different
E20221120 18:27:52.705149 8271463 Run.cpp:686] States did not match for fadd_s_floatdp2_2 with X0=3fffffff, X1=40000000 and N=0, Z=0, C=0, V=1
The other is just an inequality when comparing states byte for byte. So it's not clear what part of the state is different unless I sit down and count the size of each struct member.
[ RUN ] GeneralInstrTest/InstrTest.SemanticsMatchNative/add_w9_w0_fff_1
E20221120 18:27:51.766563 8271463 Run.cpp:686] States did not match for add_w9_w0_fff_1 with X0=0 and N=0, Z=0, C=0, V=0
/Users/tetsuo/Code/remill/tests/AArch64/Run.cpp:687: Failure
Value of: !"Lifted and native states did not match."
Actual: false
Expected: true
E20221120 18:27:51.766611 8271463 Run.cpp:738] Bytes at offset 1112 are different
[ FAILED ] GeneralInstrTest/InstrTest.SemanticsMatchNative/add_w9_w0_fff_1, where GetParam() = 0x104d54080 (3 ms)
The other part that's stumping me is the fact that the number of test failures isn't consistent. As far as I can tell, these tests should be deterministic but I haven't looked at every LOC so I don't know for sure. @pgoodman, do you know of any reason why the tests could be failing non-deterministically?
@pgoodman I'll take a look at this again when I get the chance. Did you have any ideas on why we might have non-determinism in these tests?
These look like floating point stuff? FPU stuff is typically a bit sketchy and might behave differently in different environments / rounding mode / etc. These one-byte differences look like fpu conditions, and so it's probably that the way we're computing the conditions is not quite right, or it appears right, but the actual non-determinism is due to the bitcode/machine code not being exactly like we need it to be. The way to track flags of fpu operations ends up being pretty brittle: it's something like:
1) reset flags
2) operation
3) read flags
And we rely on these being in-order, with no other stuff sneaking in. But maybe in the code there is:
1) reset flags (inst 1)
2) operation (inst 1)
3) read flags (inst 1)
4) reset flags (inst 2)
5) operation (inst 2)
6) read flags (inst 2)
And then the compiler (un)helpfully reorders as:
1) reset flags (inst 1)
2) operation (inst 1)
5) operation (inst 2)
3) read flags (inst 1)
4) reset flags (inst 2)
6) read flags (inst 2)
Hi!
I wanted to take a stab at adding support for the AArch64 STP_Q_LDSTPAIR_OFF semantics.
I prepared a preliminary patch, although there are some things that are not clear to me at the moment: