ucb-bar / gemmini

Berkeley's Spatial Array Generator

Transformer test hangs in Q * K when running on FireSim #344

Closed shirohasuki closed 7 months ago

shirohasuki commented 7 months ago

I am having trouble running the Transformer test on FireSim. The simulation hangs midway: the VTrace stops updating (possibly because it reached the recording limit), although the heartbeat continues to update. I inserted printf statements before and after the for loop. The one before the loop prints, but the one after the loop never does.

I am running the test on the au280 platform. After leaving it running for roughly a night, I suspect that execution has stopped. Any guidance or references that would help me diagnose and resolve this issue would be greatly appreciated.

    gemmini_fence();
    printf("here3\n");

    // attn = Q * K
    // attn = softmax(attn)
    for (int head = 0; head < num_heads; head++) {
        const elem_t * A = Q_buf + head * hidden_dim_per_head;
        const elem_t * B = K_buf + head * hidden_dim_per_head;
        elem_t * C = attn_buf + head * seq_len * seq_len;

        tiled_matmul_auto(seq_len, seq_len, hidden_dim_per_head,
            /*A=*/ A, /*B=*/ B,
            /*D=*/ NULL, /*C=*/ C,
            /*stride_A=*/hidden_dim, /*stride_B=*/hidden_dim, /*stride_D=*/0, /*stride_C=*/seq_len,
            MVIN_SCALE_IDENTITY, MVIN_SCALE_IDENTITY, MVIN_SCALE_IDENTITY,
            SOFTMAX, /*scale=*/ ACC_SCALE_IDENTITY, /*bert_scale=*/ 0,
            /*repeating_bias=*/ false,
            false, /*transpose_B=*/ true,
            false, false,
            0,
            WS);
    }

    gemmini_fence();
    printf("here4\n");

"here3" is printed, but "here4" never is.
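For reference, with transpose_B set to true and the strides shown above, each iteration of the loop computes the raw scores C[i][j] = sum over k of A[i][k] * B[j][k] for one head, before the softmax is applied. The following is a minimal CPU-only sketch of that indexing, useful for checking the buffer layout independently of the accelerator. It is not Gemmini code: the function name qk_scores_ref is hypothetical, the int8_t input and int32_t output types are assumptions standing in for elem_t, hidden_dim_per_head is assumed to equal hidden_dim / num_heads, and the SOFTMAX step and all scaling are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU reference for the per-head Q * K^T scores.
 * Q and K are row-major [seq_len x hidden_dim]; head h occupies columns
 * [h*d, (h+1)*d), where d = hidden_dim / num_heads (an assumption).
 * out is [num_heads x seq_len x seq_len]. No softmax, no scaling. */
static void qk_scores_ref(const int8_t *Q, const int8_t *K, int32_t *out,
                          size_t seq_len, size_t hidden_dim, size_t num_heads)
{
    assert(num_heads > 0 && hidden_dim % num_heads == 0);
    const size_t d = hidden_dim / num_heads;

    for (size_t head = 0; head < num_heads; head++) {
        /* Same pointer arithmetic as the loop in the report. */
        const int8_t *A = Q + head * d;
        const int8_t *B = K + head * d;
        int32_t *C = out + head * seq_len * seq_len;

        for (size_t i = 0; i < seq_len; i++) {
            for (size_t j = 0; j < seq_len; j++) {
                int32_t acc = 0;
                /* Row i of A times row j of B, i.e. B is used transposed.
                 * Both rows are hidden_dim elements apart (stride_A, stride_B). */
                for (size_t k = 0; k < d; k++)
                    acc += (int32_t)A[i * hidden_dim + k] *
                           (int32_t)B[j * hidden_dim + k];
                C[i * seq_len + j] = acc; /* stride_C = seq_len */
            }
        }
    }
}
```

Running this on small inputs confirms that each head reads d columns of Q and K and writes one seq_len x seq_len block, so attn_buf must hold num_heads * seq_len * seq_len elements.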

shirohasuki commented 7 months ago

I think this may be because I did not modify the configuration in TargetConfigs.scala and RoCCAcceleratorConfigs.scala to select the custom config (which sets customConfiguration=ibertInferenceConfig). I am recompiling the bitstream to check.