gpgpu-sim / gpgpu-sim_distribution

GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as well as a performance visualization tool, AerialVision, and an integrated energy model, GPUWattch.

microarchitecture model bug, ldst pop m_accessq banking error #268

Open Auyuir opened 2 years ago

Auyuir commented 2 years ago

The problem is in shader.cc, around line 1875, in ldst_unit::process_memory_access_queue_l1cache.

When an access is popped from inst.m_accessq, m_config->m_L1D_config.set_bank is called to compute its bank.

Assuming a 32 B sector size and 4 sectors per line, I expect bank_id to be taken from bits 6-7 of the 32-bit address (counting from the right, starting at 1), as described by the function that set_bank calls, at gpu-cache.cc line 133:

set_index = (addr >> m_line_sz_log2) & (m_nset - 1);

So m_line_sz_log2 should be log2(32) = 5, but in an actual run its value is 32. The right shift therefore always makes set_index 0, which in turn makes the bank conflict detection always report BK_CONF.

Here is my GDB execution result:

(gdb) n
1875          unsigned bank_id = m_config->m_L1D_config.set_bank(mf->get_addr());
(gdb) s
mem_fetch::get_addr (this=0x7ffff08b0750) at mem_fetch.h:89
89        new_addr_type get_addr() const { return m_access.get_addr(); }
(gdb) fin
Run till exit from #0  mem_fetch::get_addr (this=0x7ffff08b0750) at mem_fetch.h:89
0x00007ffff7c0053c in ldst_unit::process_memory_access_queue_l1cache (this=0x55555644e380, cache=0x5555564eb380, inst=...) at shader.cc:1875
1875          unsigned bank_id = m_config->m_L1D_config.set_bank(mf->get_addr());
Value returned is $8 = 3221225920
(gdb) s
l1d_cache_config::set_bank (this=0x555555575860, addr=3221225920) at gpu-cache.cc:66
66        return cache_config::hash_function(addr, l1_banks, l1_banks_byte_interleaving,// the interleaving value is currently always 32B
(gdb) n
68                                           l1_banks_hashing_function);
(gdb) s
67                                           m_l1_banks_log2,
(gdb) s
66        return cache_config::hash_function(addr, l1_banks, l1_banks_byte_interleaving,// the interleaving value is currently always 32B
(gdb) s
68                                           l1_banks_hashing_function);
(gdb) s
cache_config::hash_function (this=0x555555575860, addr=3221225920, m_nset=4, m_line_sz_log2=32, m_nset_log2=2, m_index_function=0) at gpu-cache.cc:80
80        unsigned set_index = 0;
(gdb) n
82        switch (m_index_function) {
(gdb) n
133           set_index = (addr >> m_line_sz_log2) & (m_nset - 1);
(gdb) n
134           break;
(gdb) p set_index
$9 = 0

The address 3221225920 is 11000000000000000000000111100000 in binary. Its bits 6-7 are 11, so the expected set_index is 3, not the 0 shown above.