accel-sim / accel-sim-framework

This is the top-level repository for the Accel-Sim framework.
https://accel-sim.github.io

use pipe instead of cin and c++ version detection fix #340

Closed JRPan closed 1 month ago

JRPan commented 1 month ago

Because a process has only one std::cin, a trace written to cin must be fully drained before the next kernel's trace can be written to it; in other words, access to cin is serialized.

Currently, the trace parser first reads the leading lines of the trace to extract the kernel info. The remaining traces are then loaded one threadblock (TB) at a time by get_next_threadblock_traces.

This works fine when only one kernel is running. The contents of cin look like:

kernel_1 name
kernel_1 id
.... <-----------------------------break here after parsing kernel info
#TB1 traces <----------------------`get_next_threadblock_traces` continues here
...
#TB2 traces
...more

But when there are two or more kernels, for example with multi-stream concurrent execution, the kernel 2 info sits after the kernel 1 traces in the cin buffer. The parser cannot reach the kernel 2 info until the kernel 1 traces have been drained, yet kernel 2 must be parseable at any time.
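The ordering problem can be reproduced with any sequential stream. The sketch below is a toy illustration only: it uses a std::istringstream as a stand-in for cin, the line contents mirror the layout shown above, and the names lines_before and two_kernel_stream are hypothetical and do not come from the Accel-Sim code.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: count how many lines must be read and discarded from a
// sequential stream before the line `wanted` is reached. Returns -1 if the
// line never appears.
long lines_before(std::istream& in, const std::string& wanted) {
    std::string line;
    long skipped = 0;
    while (std::getline(in, line)) {
        if (line == wanted) return skipped;
        ++skipped;
    }
    return -1;
}

// Toy stand-in for a single shared stream that carries two kernels back to
// back (assumed layout, modeled on the example in this thread).
std::string two_kernel_stream() {
    return "kernel_1 name\n"
           "kernel_1 id\n"
           "#TB1 traces\n"
           "#TB2 traces\n"
           "kernel_2 name\n"
           "kernel_2 id\n"
           "#TB1 traces\n";
}

// Usage example: the kernel 2 header is only reachable after all four
// kernel 1 lines have been consumed.
void demo_shared_stream() {
    std::istringstream in(two_kernel_stream());
    assert(lines_before(in, "kernel_2 name") == 4);
}
```

With a single stream, every kernel 1 line sits in front of the kernel 2 header, so the header cannot be parsed on demand.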

To fix this, we removed the use of cin and fstream and now load the traces directly from a pipe. Each kernel has its own pipe.
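A minimal sketch of the one-pipe-per-kernel idea, assuming POSIX popen is used to start one producer process per kernel. The helper names open_trace_pipe and read_trace_line are hypothetical, and this is not the code from the PR; it only shows that two independent pipes can be read in any order.

```cpp
#include <cassert>
#include <stdio.h>   // popen, pclose, fgets (popen/pclose are POSIX)
#include <string>

// Hypothetical helper: run `cmd` through the shell and return a read-only
// pipe connected to its standard output (nullptr on failure).
FILE* open_trace_pipe(const std::string& cmd) {
    return popen(cmd.c_str(), "r");
}

// Hypothetical helper: read one line from the pipe, without the trailing
// newline. Returns false once the stream is exhausted.
bool read_trace_line(FILE* pipe, std::string& line) {
    assert(pipe != nullptr);
    line.clear();
    char buf[256];
    while (fgets(buf, sizeof buf, pipe) != nullptr) {
        line += buf;
        if (line.back() == '\n') {
            line.pop_back();
            return true;
        }
    }
    return !line.empty();  // last line without a newline, or end of stream
}
```

Because each kernel owns a separate pipe, the header of kernel 2 can be read while kernel 1 still has unread threadblock traces. The caller is responsible for calling pclose on each pipe.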

Also, Ubuntu 22.04 ships with GCC (g++) major version 11, and the old version detection only matched a single-digit version number. This is fixed as well.
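The single-digit pitfall can be illustrated with two regular expressions. This is only a sketch under assumptions: the detection in the repository is not shown in this thread and may well live in a build script rather than in C++, and the function name major_version, the two patterns, and the version string "11.4.0" are all hypothetical examples.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical patterns: the first captures exactly one digit before a dot,
// the second captures one or more digits before a dot.
const std::string kSingleDigit = R"(([0-9])\.)";
const std::string kMultiDigit = R"(([0-9]+)\.)";

// Hypothetical helper: return the first number captured by `pattern` in
// `version`, or -1 if the pattern does not match.
int major_version(const std::string& version, const std::string& pattern) {
    std::smatch m;
    const std::regex re(pattern);
    if (!std::regex_search(version, m, re)) return -1;
    return std::stoi(m[1].str());
}

// Usage example: the single-digit pattern misreads a two-digit major version.
void demo_version_detection() {
    assert(major_version("11.4.0", kSingleDigit) == 1);   // wrong: sees "1."
    assert(major_version("11.4.0", kMultiDigit) == 11);   // correct
    assert(major_version("9.4.0", kMultiDigit) == 9);     // still correct
}
```

Matching one or more digits keeps older single-digit versions working while handling versions 10 and above.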

JRPan commented 1 month ago

I have no idea why the CI is failing. It runs perfectly fine on the raid.

57690       tgrogers-bigram-01              backprop-rodinia-2.0    4096___data_result_4    accelsim-commit-d595    QV100-SASS  UNKNOWN         UNKNOWN COMPLETE_NO_OTHER_INFO          SIMRATE_IPS=282 K   SIM_TIME=33 sec (33 sec)    TOT_IPC=626 TOT_INSN=9 M    TOT_CYCLE=15 K  
57691       tgrogers-bigram-01              bfs-rodinia-2.0-ft      __data_graph4096_txt    accelsim-commit-d595    QV100-SASS  UNKNOWN         UNKNOWN COMPLETE_NO_OTHER_INFO          SIMRATE_IPS=22 K    SIM_TIME=54 sec (54 sec)    TOT_IPC=10  TOT_INSN=1 M    TOT_CYCLE=122 K 
57692       tgrogers-bigram-01              hotspot-rodinia-2.0-    30_6_40___data_resul    accelsim-commit-d595    QV100-SASS  UNKNOWN         UNKNOWN COMPLETE_NO_OTHER_INFO          SIMRATE_IPS=607 K   SIM_TIME=51 sec (51 sec)    TOT_IPC=569 TOT_INSN=31 M   TOT_CYCLE=54 K  
57693       tgrogers-bigram-01              heartwall-rodinia-2.    __data_test_avi_1___    accelsim-commit-d595    QV100-SASS  UNKNOWN         UNKNOWN COMPLETE_NO_OTHER_INFO          SIMRATE_IPS=428 K   SIM_TIME=16 sec (16 sec)    TOT_IPC=734 TOT_INSN=7 M    TOT_CYCLE=9 K   
57694       tgrogers-bigram-01              lud-rodinia-2.0-ft      _v__b__i___data_64_d    accelsim-commit-d595    QV100-SASS  UNKNOWN         UNKNOWN COMPLETE_NO_OTHER_INFO          SIMRATE_IPS=7 K SIM_TIME=1 min, 17 sec (77 sec) TOT_IPC=3   TOT_INSN=554 K  TOT_CYCLE=169 K 
57695       tgrogers-bigram-01              nw-rodinia-2.0-ft       128_10___data_result    accelsim-commit-d595    QV100-SASS  UNKNOWN         UNKNOWN COMPLETE_NO_OTHER_INFO          SIMRATE_IPS=8 K SIM_TIME=1 min, 10 sec (70 sec) TOT_IPC=4   TOT_INSN=572 K  TOT_CYCLE=138 K 
57696       tgrogers-bigram-01              nn-rodinia-2.0-ft       __data_filelist_4_3_    accelsim-commit-d595    QV100-SASS  UNKNOWN         UNKNOWN COMPLETE_NO_OTHER_INFO          SIMRATE_IPS=131 K   SIM_TIME=50 sec (50 sec)    TOT_IPC=225 TOT_INSN=7 M    TOT_CYCLE=29 K  
57697       tgrogers-bigram-01              pathfinder-rodinia-2    1000_20_5___data_res    accelsim-commit-d595    QV100-SASS  UNKNOWN         UNKNOWN COMPLETE_NO_OTHER_INFO          SIMRATE_IPS=56 K    SIM_TIME=16 sec (16 sec)    TOT_IPC=29  TOT_INSN=888 K  TOT_CYCLE=31 K  
57698       tgrogers-bigram-01              srad_v2-rodinia-2.0-    __data_matrix128x128    accelsim-commit-d595    QV100-SASS  UNKNOWN         UNKNOWN COMPLETE_NO_OTHER_INFO          SIMRATE_IPS=255 K   SIM_TIME=37 sec (37 sec)    TOT_IPC=317 TOT_INSN=9 M    TOT_CYCLE=30 K  
57699       tgrogers-bigram-01              streamcluster-rodini    3_6_16_1024_1024_100    accelsim-commit-d595    QV100-SASS  UNKNOWN         UNKNOWN COMPLETE_NO_OTHER_INFO          SIMRATE_IPS=52 K    SIM_TIME=5 min, 51 sec (351 sec)    TOT_IPC=16  TOT_INSN=18 M   TOT_CYCLE=1 M

JRPan commented 1 month ago

I tested with compressed .xz traces as well.

FJShen commented 1 month ago

I just realized that, now that this PR adds the PipeReader class, we should take extra care when copying or moving PipeReader or kernel_trace_t objects. In its current form, the code has no guardrails to prevent someone from copying a PipeReader and thereby duplicating the pipe handle. I can open an issue and work on a PR if necessary.
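One possible guardrail is to make the class move-only, so that exactly one object ever owns the pipe handle. The sketch below is an assumption about how that could look, not the PipeReader from the PR: only the class name comes from this thread, while the popen-based constructor and the members is_open and read_line are hypothetical.

```cpp
#include <cassert>
#include <stdio.h>      // popen, pclose, fgets (popen/pclose are POSIX)
#include <string>
#include <type_traits>
#include <utility>      // std::exchange

// Sketch of a move-only pipe owner (members are assumptions).
class PipeReader {
public:
    explicit PipeReader(const std::string& cmd)
        : pipe_(popen(cmd.c_str(), "r")) {}
    ~PipeReader() { close(); }

    // Copying would duplicate the handle and close it twice, so forbid it.
    PipeReader(const PipeReader&) = delete;
    PipeReader& operator=(const PipeReader&) = delete;

    // Moving transfers ownership and leaves the source empty.
    PipeReader(PipeReader&& other) noexcept
        : pipe_(std::exchange(other.pipe_, nullptr)) {}
    PipeReader& operator=(PipeReader&& other) noexcept {
        if (this != &other) {
            close();
            pipe_ = std::exchange(other.pipe_, nullptr);
        }
        return *this;
    }

    bool is_open() const { return pipe_ != nullptr; }

    // Read one line without the trailing newline; false at end of stream.
    bool read_line(std::string& line) {
        assert(is_open() && "read_line called on an empty PipeReader");
        line.clear();
        char buf[256];
        while (fgets(buf, sizeof buf, pipe_) != nullptr) {
            line += buf;
            if (line.back() == '\n') {
                line.pop_back();
                return true;
            }
        }
        return !line.empty();
    }

private:
    void close() {
        if (pipe_ != nullptr) {
            pclose(pipe_);
            pipe_ = nullptr;
        }
    }
    FILE* pipe_ = nullptr;
};

static_assert(!std::is_copy_constructible<PipeReader>::value,
              "PipeReader must not be copyable");
static_assert(std::is_move_constructible<PipeReader>::value,
              "PipeReader must be movable");
```

With the copy operations deleted, an accidental copy becomes a compile error instead of a double pclose at run time. A type such as kernel_trace_t that holds a PipeReader member would then become move-only as well, unless it declares its own copy operations.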