OpenXiangShan / Deterload

Xiangshan deterministic workloads generator
https://openxiangshan.github.io/Deterload/
Mulan Permissive Software License, Version 2

nemu is not deterministic #8

Open xieby1 opened 1 week ago

xieby1 commented 1 week ago

The checkpoints generated with NEMU are not deterministic. In other words, if the same checkpoints are generated twice, the md5sums of the first and second sets differ.
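
A minimal sketch of how the mismatch could be checked, assuming each generation run writes its checkpoints into its own directory with the same relative file names. The helper names `md5sum` and `diff_checkpoint_dirs` and the directory layout are hypothetical and not part of Deterload or NEMU.

```python
import hashlib
from pathlib import Path


def md5sum(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def diff_checkpoint_dirs(first, second):
    """Return the sorted relative paths whose md5 differs between two runs.

    A file that exists in only one of the two directories is also reported.
    Assumption: both runs use the same relative file names.
    """
    sums = []
    for root in (Path(first), Path(second)):
        sums.append({
            p.relative_to(root).as_posix(): md5sum(p)
            for p in root.rglob("*")
            if p.is_file()
        })
    a, b = sums
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty list from `diff_checkpoint_dirs("run1", "run2")` would mean the two runs are byte-identical; any entry in the list is a checkpoint that was not reproduced.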

eastonman commented 3 days ago

Recently I encountered a serious problem with NEMU. Running the same workload twice on the same NEMU results in different instruction counts, which causes some workloads to finish before the last checkpoint starts.

[src/checkpoint/serializer.cpp:328,instrsCouldTakeCpt] First cpt @ 2427740000000, now: 2384600000000
[src/checkpoint/serializer.cpp:328,instrsCouldTakeCpt] First cpt @ 2427740000000, now: 2384660000000
[src/checkpoint/serializer.cpp:328,instrsCouldTakeCpt] First cpt @ 2427740000000, now: 2384680000000
[src/checkpoint/serializer.cpp:328,instrsCouldTakeCpt] First cpt @ 2427740000000, now: 2384700000000
[src/checkpoint/serializer.cpp:328,instrsCouldTakeCpt] First cpt @ 2427740000000, now: 2384720000000
[src/checkpoint/serializer.cpp:328,instrsCouldTakeCpt] First cpt @ 2427740000000, now: 2384740000000
[/build/source/src/isa/riscv64/include/../instr/special.h:38,execute] nemu_trap case 0
[src/cpu/cpu-exec.c:734,cpu_exec] nemu: HIT GOOD TRAP at pc = 0x0000000000010522
[src/cpu/cpu-exec.c:740,cpu_exec] trap code:0
[src/cpu/cpu-exec.c:94,monitor_statistic] host time spent = 25768154605 us
[src/cpu/cpu-exec.c:96,monitor_statistic] total guest instructions = 2384825716646
[src/cpu/cpu-exec.c:97,monitor_statistic] vst count = 407850, vst unit count = 407850, vst unit optimized count = 0
[src/cpu/cpu-exec.c:100,monitor_statistic] simulation frequency = 92549340 instr/s
[src/utils/state.c:30,is_exit_status_bad] NEMU exit with good state: 2, halt ret: 0

This is VERY SEVERE. SimPoint identifies its intervals by instruction count, so if the instruction count is not deterministic, the intervals provided by SimPoint are INVALID.
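
A rough sketch for spotting this condition in a NEMU log like the one above. It only matches the two message formats that appear in that log, and it assumes both numbers count guest instructions. The function name `summarize_log` is hypothetical.

```python
import re

# Patterns taken from the log lines quoted in this thread.
FIRST_CPT_RE = re.compile(r"First cpt @ (\d+)")
TOTAL_RE = re.compile(r"total guest instructions = (\d+)")


def summarize_log(log_text):
    """Compare the first checkpoint position with the total instruction count.

    Assumption: both values are guest instruction counts.
    """
    first = FIRST_CPT_RE.search(log_text)
    total = TOTAL_RE.search(log_text)
    if first is None or total is None:
        raise ValueError("log does not contain both expected lines")
    first_cpt = int(first.group(1))
    total_instrs = int(total.group(1))
    return {
        "first_cpt": first_cpt,
        "total_instrs": total_instrs,
        # True when the workload exited before reaching the first checkpoint.
        "finished_early": total_instrs < first_cpt,
        "shortfall": max(0, first_cpt - total_instrs),
    }
```

For the log above, the run ends at 2384825716646 instructions while the first checkpoint is expected at 2427740000000, so under these assumptions the workload exits 42914283354 instructions early.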