CMU-SAFARI / ramulator2

Ramulator 2.0 is a modern, modular, extensible, and fast cycle-accurate DRAM simulator. It provides support for agile implementation and evaluation of new memory system designs (e.g., new DRAM standards, emerging RowHammer mitigation techniques). Described in our paper https://people.inf.ethz.ch/omutlu/pub/Ramulator2_arxiv23.pdf
https://arxiv.org/abs/2308.11030
MIT License

Inquiries about HBM2/3 configuration #42

Open Mo-guan opened 7 months ago

Mo-guan commented 7 months ago

Hi! I am using ramulator2 for DRAM simulation. When I tested HBM, the bandwidth seemed lower than expected. For the 5,000,000 requests generated in perf_comparison, the HBM memory system takes 5,390,135 cycles to process them. Taking 64 bytes per request and 1 ns per memory cycle, that translates to a bandwidth of 5e6 * 64 / 2**30 / (5390135 / 1e9) ≈ 55 GiB/s. I'm wondering whether there is a problem with the HBM configuration in my yaml file, or whether I am misunderstanding something elsewhere. Thank you very much for your help.
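For reference, the arithmetic above can be reproduced with a short Python sketch. The 64-byte request size and the 1 GHz memory clock are the assumptions built into the formula, not values read from the simulator, and the helper name `bandwidth_gib_per_s` is made up for illustration.

```python
def bandwidth_gib_per_s(num_requests, bytes_per_request, cycles, clock_hz):
    """Achieved bandwidth in GiB/s for a run of `cycles` memory cycles."""
    total_bytes = num_requests * bytes_per_request
    seconds = cycles / clock_hz          # simulated wall time
    return total_bytes / 2**30 / seconds


# Values from the post; 64 B per request and 1 GHz are assumptions.
bw = bandwidth_gib_per_s(
    num_requests=5_000_000,
    bytes_per_request=64,
    cycles=5_390_135,
    clock_hz=1e9,
)
print(f"{bw:.1f} GiB/s")
```

Note that dividing by 2**30 gives GiB/s; dividing by 1e9 instead gives a somewhat larger figure in decimal GB/s, which matters when comparing against vendor peak numbers that are usually quoted in GB/s.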

ramulatorv2.yaml:

Frontend:
  impl: LoadStoreTrace
  path: ./traces/stream_5M_R8W2_ramulatorv2.trace
  clock_ratio: 10

  Translation:
    impl: NoTranslation
    max_addr: 2147483648

MemorySystem:
  impl: GenericDRAM
  clock_ratio: 1
  DRAM:
    impl: HBM3
    org:
      preset: HBM3_4Gb
      channel: 8
    timing:
      preset: HBM3_2Gbps

  Controller:
    impl: Generic
    Scheduler:
      impl: FRFCFS
    RefreshManager:
      impl: AllBank
    plugins:

  AddrMapper:
    impl: RoBaRaCoCh

output:

[Ramulator::LoadStoreTrace] [info] Loaded 5000000 lines.
Frontend:
  impl: LoadStoreTrace

MemorySystem:
  impl: GenericDRAM
  total_num_other_requests: 0
  total_num_write_requests: 1000170
  total_num_read_requests: 3999830
  memory_system_cycles: 5390135
  DRAM:
    impl: HBM3
  AddrMapper:
    impl: RoBaRaCoCh
  Controller:
    impl: Generic
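
As a quick sanity check on the statistics above, the following Python sketch confirms that the request counters add up to the trace length and derives the average number of requests served per memory cycle. The numbers are copied from the output; the variable names are illustrative only.

```python
# Counters copied from the simulator output above.
reads = 3_999_830
writes = 1_000_170
others = 0
cycles = 5_390_135

total = reads + writes + others          # should match the 5,000,000 loaded lines
read_share = reads / total               # the trace name suggests roughly 80% reads
requests_per_cycle = total / cycles      # average throughput across all channels

print(f"total requests:     {total}")
print(f"read share:         {read_share:.3f}")
print(f"requests per cycle: {requests_per_cycle:.3f}")
```

Under these numbers the whole memory system completes a little under one request per memory cycle on average.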