changqi1 closed this 2 months ago
```
$ ./ut/layers_attention_test
[==========] Running 2 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 2 tests from AttentionLLaMA
[ RUN ] AttentionLLaMA.bfloat16_t
>> create context: 4096 128
>> create llama_attention_key: 0x7f8fe6804040_0x7f8fe6808040_0x7f8fe6808040_0x7f8fe2803040_1_128_32_32
[ RUNTIME ] XFT::invokeAttentionLLaMA 0.153258 sec
[ RUNTIME ] XFT::invokeAttentionLLaMA 0.004665 sec
[ RUNTIME ] XFT::invokeAttentionLLaMA 0.001864 sec
[ OK ] AttentionLLaMA.bfloat16_t (755 ms)
[ RUN ] AttentionLLaMA.float16_t
>> create context: 4096 128
>> create llama_attention_key: 0x7f8fe6804040_0x7f8fe6808040_0x7f8fe6808040_0x7f8fe2803040_2_128_32_32
[ RUNTIME ] XFT::invokeAttentionLLaMA 0.119574 sec
[ RUNTIME ] XFT::invokeAttentionLLaMA 0.046373 sec
[ RUNTIME ] XFT::invokeAttentionLLaMA 0.039210 sec
[ OK ] AttentionLLaMA.float16_t (1601 ms)
[----------] 2 tests from AttentionLLaMA (2356 ms total)
[----------] Global test environment tear-down
[==========] 2 tests from 1 test case ran. (2357 ms total)
[ PASSED ] 2 tests.
```
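For reference, the three `[ RUNTIME ]` lines per test come from repeated timed invocations, where the first iteration typically absorbs one-off setup cost (hence the larger first number). Below is a minimal sketch of how such a timed gtest case could be structured; `invokeAttention` and its arguments are hypothetical stand-ins, not the real `XFT::invokeAttentionLLaMA` signature.

```cpp
#include <chrono>
#include <cstdio>
#include <vector>
#include <gtest/gtest.h>

// Hypothetical stand-in for the attention kernel under test.
static void invokeAttention(std::vector<float> &out, const std::vector<float> &in) {
    for (size_t i = 0; i < in.size(); ++i) out[i] = in[i]; // placeholder work
}

TEST(AttentionLLaMA, timing_sketch) {
    std::vector<float> input(4096 * 128, 1.0f), output(input.size());
    // Invoke several times: the first iteration usually includes
    // one-off initialization, so later timings are the steady-state cost.
    for (int iter = 0; iter < 3; ++iter) {
        auto t0 = std::chrono::steady_clock::now();
        invokeAttention(output, input);
        auto t1 = std::chrono::steady_clock::now();
        double sec = std::chrono::duration<double>(t1 - t0).count();
        printf("[ RUNTIME ] invokeAttention %f sec\n", sec);
    }
    EXPECT_EQ(output.size(), input.size());
}
```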
@pujiang2018 @Duyi-Wang Done. The Attention APIs don't include the LayerNorm step; LayerNorm will be added in the Decoder Layer API.
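To illustrate that split, here is a minimal sketch (not the actual xFasterTransformer code; all function names besides `XFT::invokeAttentionLLaMA` are hypothetical) of a decoder layer that owns the normalization step and hands already-normalized activations to a norm-free attention call:

```cpp
#include <cmath>
#include <vector>

// Hypothetical RMSNorm, the LayerNorm variant LLaMA-style models use.
static void rmsNorm(std::vector<float> &x, const std::vector<float> &gamma, float eps = 1e-6f) {
    float ss = 0.0f;
    for (float v : x) ss += v * v;
    float scale = 1.0f / std::sqrt(ss / x.size() + eps);
    for (size_t i = 0; i < x.size(); ++i) x[i] = x[i] * scale * gamma[i];
}

// Stand-in for the norm-free attention API (e.g. XFT::invokeAttentionLLaMA).
static void attention(std::vector<float> &hidden) { /* QKV, softmax, output proj... */ }

// Hypothetical decoder-layer entry point: normalization happens here,
// then the attention kernel runs on the normalized activations.
static void decoderLayerForward(std::vector<float> &hidden, const std::vector<float> &normWeight) {
    std::vector<float> residual = hidden; // keep the residual branch
    rmsNorm(hidden, normWeight);          // LayerNorm step lives in the decoder layer
    attention(hidden);                    // attention API sees normalized input
    for (size_t i = 0; i < hidden.size(); ++i) hidden[i] += residual[i];
}
```

Keeping normalization out of the attention API lets the decoder layer choose the norm variant and fuse it with the residual add, while the attention kernel stays a pure compute primitive.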