s5z / zsim

A fast and scalable x86-64 multicore simulator
GNU General Public License v2.0

zsim alters the program execution behavior. #273

Open dkadiyala3 opened 1 month ago

Hi,

I am getting different program output for a SPEC benchmark depending on whether I run it natively or under zsim. My understanding is that zsim should not change a program's functional behavior, so I don't understand why the output on the native machine differs from the output inside zsim. Any suggestions or comments on how to work around this would be highly appreciated!

I have been using zsim to simulate the SPEC CPU 2017 workloads. Here is my setup:

Compiler: gcc/g++ 9.4.0
OS: Ubuntu 18.04.6

I am launching the 508.namd_r benchmark with the following process configuration:

// OOO-Core-0;
process0 = {
    startFastForwarded = true;
    syncedFastForward = "Never";
    mask = "3";
    ffiPoints = "200000000000 200000000";
    command = "./namd_r_base.spec_ref_single_thread-m64 --input apoa1.input --output apoa1.ref.output --iterations 65";
}

This is the output I get from the zsim simulation:

................
[S 0] WARN: Unhandled case: emitBasicMove | vmovapd zmm27, zmm4 | loads=0 stores=0 inRegs=0  outRegs=0 
[S 0] WARN: Unhandled case: emitBasicMove | vmovapd zmm26, zmm3 | loads=0 stores=0 inRegs=0  outRegs=0 
[S 0] WARN: Unhandled case: emitBasicMove | vmovsd xmm20, qword ptr [rax+0x70] | loads=1 stores=0 inRegs=0  outRegs=0 
[S 0] WARN: Unhandled case: emitBasicMove | vmovapd zmm27, zmm4 | loads=0 stores=0 inRegs=0  outRegs=0 
[S 0] WARN: Unhandled case: emitBasicMove | vmovapd zmm26, zmm3 | loads=0 stores=0 inRegs=0  outRegs=0 
[S 0] WARN: Unhandled case: emitBasicMove | vmovapd zmm20, zmm6 | loads=0 stores=0 inRegs=0  outRegs=0 
[S 0] WARN: Unhandled case: emitBasicMove | vmovapd zmm21, zmm3 | loads=0 stores=0 inRegs=0  outRegs=0 
[S 0] WARN: Unhandled case: emitBasicMove | vmovsd xmm20, qword ptr [rsi+0x8] | loads=1 stores=0 inRegs=0  outRegs=0 
[S 0] WARN: Unhandled case: emitBasicMove | vmovsd xmm21, qword ptr [rsi+0x10] | loads=1 stores=0 inRegs=0  outRegs=0 
[S 0] WARN: Unhandled case: emitBasicMove | vmovsd xmm20, qword ptr [rsi+0x8] | loads=1 stores=0 inRegs=0  outRegs=0 
[S 0] WARN: Unhandled case: emitBasicMove | vmovsd xmm21, qword ptr [rsi+0x10] | loads=1 stores=0 inRegs=0  outRegs=0 
[S 0] WARN: Unhandled case: emitBasicMove | vmovsd xmm22, qword ptr [r13+0x50] | loads=1 stores=0 inRegs=0  outRegs=0 
[S 0] WARN: Unhandled case: emitBasicMove | vmovsd xmm22, qword ptr [r13+0x50] | loads=1 stores=0 inRegs=0  outRegs=0 
*** TEST RUN - 65 ITERATIONS ***
writing to output file apoa1.ref.output
iteration 0: 1 0 0 1
iteration 0: 1 1 0 0
iteration 0: 1 1 1 0
iteration 0: 0 0 0 0
iteration 0: 0 1 0 0
iteration 0: 0 1 1 0
error: numeric test failed! (error = 21.4827)
[S 0] Shadow/NOP thread 0 finished
[S 0] Finished, code -10
[S 0] Dumping termination stats
[S 0] Finished scheduler watchdog thread
[H] Child 79449 done
[H]  [executable/opt/zsim_harness.cpp] Waking up, secs elapsed 146
[H] DEBUG in main() at executable/opt/zsim_harness.cpp:748:  nicInfo->nic_egress_proc_on: 0
[H] DEBUG in main() at executable/opt/zsim_harness.cpp:749:  nicInfo->nic_ingress_proc_on: 0
[H] DEBUG in main() at executable/opt/zsim_harness.cpp:750:  nicInfo->nic_init_done: 0
[H] DEBUG in main() at executable/opt/zsim_harness.cpp:751:  #reg_net_core 0, #exp_net_core 0, #reg_non_net_core 0, #exp_non_net_core 0
[H] Active procs running: 0
[H] ----------------------------------------------------------------------
sim elapsed time: 1.72938e+09s
Dumping IR_SR and memory stats here
[H] sampling phase count: 0
[H] All children done, exiting
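
Since the log above is dominated by repeated `Unhandled case` warnings, it may help to summarize which instructions trigger them. The following is a minimal sketch, assuming the zsim output has been saved to a text file; the function name `tally_unhandled` and the file name `zsim.log` are hypothetical and not part of zsim.

```python
import re
from collections import Counter

# Matches warning lines of the form seen in the log above, e.g.
# [S 0] WARN: Unhandled case: emitBasicMove | vmovapd zmm27, zmm4 | loads=0 ...
WARN_RE = re.compile(r"WARN: Unhandled case: (\w+) \| ([^|]+?) \|")


def tally_unhandled(lines):
    """Count warnings per (handler, mnemonic) pair; other lines are ignored."""
    counts = Counter()
    for line in lines:
        match = WARN_RE.search(line)
        if match is None:
            continue
        handler = match.group(1)
        instruction = match.group(2).strip()
        mnemonic = instruction.split()[0]  # first token, e.g. "vmovapd"
        counts[(handler, mnemonic)] += 1
    return counts
```

For example, `tally_unhandled(open("zsim.log"))` would return a `Counter` keyed by pairs such as `("emitBasicMove", "vmovapd")`, which is shorter to attach to a report than the raw log.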

However, when I run the exact same application on a real machine, I get the following output:

...........
iteration 61: 0 1 1 1
iteration 62: 1 0 0 0
iteration 62: 1 1 0 0
iteration 62: 1 1 1 0
iteration 62: 0 0 0 0
iteration 62: 0 1 0 0
iteration 62: 0 1 1 0
iteration 63: 1 0 0 1
iteration 63: 1 1 0 0
iteration 63: 1 1 1 0
iteration 63: 0 0 0 0
iteration 63: 0 1 0 0
iteration 63: 0 1 1 0
iteration 64: 1 0 0 0
iteration 64: 1 1 0 1
iteration 64: 1 1 1 0
iteration 64: 0 0 0 0
iteration 64: 0 1 0 0
iteration 64: 0 1 1 0
*** TEST RUN - 65 ITERATIONS ***
SUCCESSFUL COMPLETION
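
To narrow down where the two runs first disagree, the `iteration` lines of both outputs can be compared directly. This is a rough sketch, assuming each run's output was captured to a file and read as a list of lines; `first_divergence` is a hypothetical helper name, and only lines starting with `iteration ` are considered so that simulator messages are skipped.

```python
def iteration_lines(lines):
    """Keep only the benchmark's per-iteration lines, without trailing whitespace."""
    return [ln.strip() for ln in lines if ln.startswith("iteration ")]


def first_divergence(native_lines, zsim_lines):
    """Return (index, native_line, zsim_line) at the first mismatch, or None.

    If one run simply stops early, the missing side is reported as None.
    """
    native = iteration_lines(native_lines)
    zsim = iteration_lines(zsim_lines)
    for i, (a, b) in enumerate(zip(native, zsim)):
        if a != b:
            return i, a, b
    if len(native) != len(zsim):
        i = min(len(native), len(zsim))
        a = native[i] if i < len(native) else None
        b = zsim[i] if i < len(zsim) else None
        return i, a, b
    return None
```

A result with the zsim side equal to `None` would indicate that the simulated run printed the same values and then stopped early, whereas a mismatch in the flags themselves would point at differing computed results.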