mec-UMN / HISIM

MIT License

PPA.csv Questions #4

Closed jzhou1318 closed 1 month ago

jzhou1318 commented 1 month ago

Opening a new issue to further discuss some issues in https://github.com/mec-UMN/HISIM/issues/3#issue-2398492680

  1. I will probably change the PPA logging system from a list to a dictionary for ease of use. (edit: I implemented this in a fork https://github.com/jzhou1318/HISIM)
  2. I plotted some preliminary results from toy examples I've been playing with. I swept the number of PEs from 1 to 512 for each chip architecture (M3D, M2D, H2_5D).
    1. Why is computing latency flat? (bottom left, Computing_latency (ns) in header)
    2. Why is a large chunk of the 2D architecture's network latency 3D NoC latency? (bottom right, 2d NoC latency (ns), 3d NoC latency (ns), 2.5d NoC latency (ns) in header)
Screenshot 2024-07-10 at 12 30 41 PM

I've also uploaded the combinations of PE count and chip architecture I'm running: extracted_PPA.csv

The other configuration settings are shown below:

hisim = HiSimModel(
    chip_architect = "M3D",
    xbar_size = 1024,
    N_tile = 100,
    N_pe = 9,
    N_tier = 3,
    freq_computing = 1,
    fclk_noc = 1,
    placement_method = 5,
    router_times_scale = 1,
    percent_router = 1,
    tsv_pitch = 5,
    W2d = 32,
    ai_model = 'vit',
    thermal = False
)

pragnyan948 commented 1 month ago
  1. A dictionary is okay.
  2. i. Compute latency is the end-to-end cumulative latency of the tiles, and it does not change between 2D and 3D architectures. You will observe that network latency is affected, though.
     ii. For a 2D configuration, you also need to set N_tier to 1 and the placement method to 1. Ideally the code should take care of this redundancy; we have fixed this in the upcoming release. Additionally, you will notice that even with these settings Ntier_real is 2 in PPA.csv instead of 1. This is because the mapping method can't map the model onto 2D for PE size < 512, so it maps it as 3D. Increase the PE size further until you obtain Ntier_real = 1.

In an upcoming release, we will make sure that cases where this happens are reported as errors.
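To make the sweep discussed above concrete, here is a minimal sketch that only builds the keyword-argument dictionaries for each run, applying the 2D rule from the reply (N_tier = 1 and placement method 1 for M2D). The parameter names come from the configuration posted in this thread; the exact PE counts are an assumption (powers of two from 1 to 512), since the thread does not list them, and `build_sweep` is a hypothetical helper, not part of HISIM.

```python
from itertools import product

# Fixed settings copied from the configuration posted in this thread.
BASE_CONFIG = {
    "xbar_size": 1024, "N_tile": 100, "N_tier": 3,
    "freq_computing": 1, "fclk_noc": 1, "placement_method": 5,
    "router_times_scale": 1, "percent_router": 1, "tsv_pitch": 5,
    "W2d": 32, "ai_model": "vit", "thermal": False,
}

ARCHITECTURES = ["M3D", "M2D", "H2_5D"]
# Assumption: powers of two from 1 to 512; the thread gives only the range.
PE_COUNTS = [2 ** k for k in range(10)]


def build_sweep():
    """Return one keyword-argument dict per (architecture, PE count) pair."""
    configs = []
    for arch, n_pe in product(ARCHITECTURES, PE_COUNTS):
        cfg = dict(BASE_CONFIG, chip_architect=arch, N_pe=n_pe)
        if arch == "M2D":
            # Rule from the reply above: a 2D run needs a single tier
            # and placement method 1.
            cfg["N_tier"] = 1
            cfg["placement_method"] = 1
        configs.append(cfg)
    return configs


sweep = build_sweep()
```

Each dictionary could then be unpacked into the model constructor shown earlier (for example `HiSimModel(**cfg)`), assuming the constructor accepts exactly these keyword names.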

PPA.csv confusion from #3

  1. Comments on units of the outputs: This will be added soon in the release planned next week.
  2. We will incorporate NaN as suggested
  3. Leakage power is currently only logged in COMPUTE_VALIDATE, which is used to validate the tool. If any outputs need to be logged to integrate HISIM with ArchGym, please let me know. I will log them in the release planned for next week.
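The two logging changes discussed in this thread (dictionary-based rows and NaN for values that do not apply) could look roughly like the sketch below. The latency column names are the headers quoted earlier in the thread; `chip_architect` and `N_pe` as columns, the helper names, and the sample value are assumptions for illustration only, not HISIM's actual logging code.

```python
import csv
import io
import math

# Latency headers as quoted in this thread; the first two columns are assumed.
FIELDS = [
    "chip_architect", "N_pe", "Computing_latency (ns)",
    "2d NoC latency (ns)", "3d NoC latency (ns)", "2.5d NoC latency (ns)",
]


def make_row(values):
    """Build a full row keyed by column name, with NaN for anything not supplied."""
    unknown = set(values) - set(FIELDS)
    if unknown:
        raise KeyError(f"unknown PPA fields: {sorted(unknown)}")
    row = {name: math.nan for name in FIELDS}
    row.update(values)
    return row


def write_rows(rows, stream):
    """Write rows as CSV; column order is fixed by FIELDS, not by insertion order."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Illustrative value only, not a simulator result.
example = make_row({"chip_architect": "M2D", "N_pe": 9, "2d NoC latency (ns)": 12.5})
buffer = io.StringIO()
write_rows([example], buffer)
```

Keying rows by name means a new output such as leakage power can be added by extending `FIELDS`, without shifting the positions of existing columns.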
jzhou1318 commented 1 month ago
  1. i. Why does changing the number of PEs not affect end-to-end cumulative latency of tiles?

PPA.csv Confusion

  1. There aren't any specific outputs that ArchGym would require. Rather, I was just wondering why certain outputs weren't logged, as someone using the simulator may want that information. ArchGym simply takes the inputs and outputs made available by HiSim and orchestrates targeted design space exploration through them.
pragnyan948 commented 1 month ago

Thank you for your reply! Here is a general explanation of why the number of PEs inside a tile doesn't change the compute latency in HISIM:

I hope I was able to answer your question. Also, it would be better to state your hypothesis for a behavior next time so that I can answer your question more precisely. Leakage power will be added as an output in the next release.

jzhou1318 commented 1 month ago

Regarding chip_architecture, N_tier, and N_tier_real, is the logic below correct?

if chip_architect == "H2_5D":
    self.placement_method = 1
elif chip_architect == "M2D":
    self.placement_method = 1
    self.N_tier = 1
else:
    self.placement_method = placement_method
    self.N_tier = N_tier

if self.chip_architect == 'M2D' and self.N_tier_real != 1:
    return "error message"
pragnyan948 commented 1 month ago

I would suggest instead printing N_tier, N_tile, and N_stack, comparing them with N_tier_real, N_tile_real, and N_stack_real in post-processing, and throwing an error accordingly with respect to the chip architecture.
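A minimal sketch of that post-processing check, assuming each PPA.csv row is available as a dictionary (for example from `csv.DictReader`) and that the columns are named as written in this thread; the real header may differ. Only the M2D rule (N_tier_real must be 1) is taken from the discussion above. The other comparisons are reported as mismatches for the user to interpret, since the valid combinations per architecture are not spelled out here.

```python
# Hypothetical column names; adjust to the actual PPA.csv header.
PAIRS = [
    ("N_tier", "N_tier_real"),
    ("N_tile", "N_tile_real"),
    ("N_stack", "N_stack_real"),
]


def as_int(value):
    """CSV values arrive as strings and may be written as '2' or '2.0'."""
    return int(float(value))


def check_row(row):
    """Return a list of human-readable mismatches found in one PPA row."""
    problems = []
    for requested, real in PAIRS:
        if as_int(row[requested]) != as_int(row[real]):
            problems.append(f"{requested}={row[requested]} but {real}={row[real]}")
    # Rule stated in this thread: a 2D run must end up on a single tier.
    if row["chip_architect"] == "M2D" and as_int(row["N_tier_real"]) != 1:
        problems.append("M2D run was mapped onto more than one tier")
    return problems


bad = {"chip_architect": "M2D", "N_tier": "1", "N_tier_real": "2",
       "N_tile": "100", "N_tile_real": "100", "N_stack": "1", "N_stack_real": "1"}
good = dict(bad, N_tier_real="1.0")
```

Here `check_row(bad)` flags both the tier mismatch and the M2D rule, while `check_row(good)` returns an empty list. A caller could raise an exception whenever the list is non-empty.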

jzhou1318 commented 1 month ago

For each architecture, could you elaborate on the valid configurations of placement, N_tier_real, N_tile_real, and N_stack_real?

pragnyan948 commented 1 month ago
image