PaddlePaddle / PaddleCustomDevice

PaddlePaddle custom device implementation. (Custom hardware integration for PaddlePaddle)
Apache License 2.0

[GCU] Support LLM #1234

Closed: EnflameGCU closed this 5 months ago

EnflameGCU commented 5 months ago

Support LLM for GCU

paddle-bot[bot] commented 5 months ago

Thanks for your contribution!

EnflameGCU commented 5 months ago

Test project /home/***/PaddleCustomDevice/backends/gcu/build
      Start  1: test_arange
 1/48 Test  #1: test_arange ......................   Passed    1.73 sec
      Start  2: test_argmax
 2/48 Test  #2: test_argmax ......................   Passed    1.72 sec
      Start  3: test_argsort
 3/48 Test  #3: test_argsort .....................   Passed    3.84 sec
      Start  4: test_assign
 4/48 Test  #4: test_assign ......................   Passed    1.69 sec
      Start  5: test_binary_ops
 5/48 Test  #5: test_binary_ops ..................   Passed    2.29 sec
      Start  6: test_cast
 6/48 Test  #6: test_cast ........................   Passed    1.77 sec
      Start  7: test_concat
 7/48 Test  #7: test_concat ......................   Passed    1.89 sec
      Start  8: test_cumsum
 8/48 Test  #8: test_cumsum ......................   Passed    1.74 sec
      Start  9: test_dropout
 9/48 Test  #9: test_dropout .....................   Passed    1.69 sec
      Start 10: test_einsum
10/48 Test #10: test_einsum ......................   Passed    1.77 sec
      Start 11: test_embedding
11/48 Test #11: test_embedding ...................   Passed    1.71 sec
      Start 12: test_expand
12/48 Test #12: test_expand ......................   Passed    1.75 sec
      Start 13: test_flatten
13/48 Test #13: test_flatten .....................   Passed    1.68 sec
      Start 14: test_full
14/48 Test #14: test_full ........................   Passed    1.77 sec
      Start 15: test_fused_add_rms_norm
15/48 Test #15: test_fused_add_rms_norm ..........   Passed    3.19 sec
      Start 16: test_fused_rotary_embedding
16/48 Test #16: test_fused_rotary_embedding ......   Passed    7.36 sec
      Start 17: test_fused_sdp_flash_attention
17/48 Test #17: test_fused_sdp_flash_attention ...   Passed    8.85 sec
      Start 18: test_gather_nd
18/48 Test #18: test_gather_nd ...................   Passed    3.74 sec
      Start 19: test_gather_op
19/48 Test #19: test_gather_op ...................   Passed    1.71 sec
      Start 20: test_gaussian_random
20/48 Test #20: test_gaussian_random .............   Passed    1.68 sec
      Start 21: test_index_put
21/48 Test #21: test_index_put ...................   Passed    1.75 sec
      Start 22: test_index_sample
22/48 Test #22: test_index_sample ................   Passed    1.75 sec
      Start 23: test_interpolate
23/48 Test #23: test_interpolate .................   Passed    1.70 sec
      Start 24: test_log_softmax
24/48 Test #24: test_log_softmax .................   Passed    1.72 sec
      Start 25: test_matmul
25/48 Test #25: test_matmul ......................   Passed    8.61 sec
      Start 26: test_multinomial
26/48 Test #26: test_multinomial .................   Passed    1.80 sec
      Start 27: test_reduce_ops
27/48 Test #27: test_reduce_ops ..................   Passed    1.84 sec
      Start 28: test_rms_norm
28/48 Test #28: test_rms_norm ....................   Passed    2.49 sec
      Start 29: test_scale
29/48 Test #29: test_scale .......................   Passed    1.83 sec
      Start 30: test_scatter
30/48 Test #30: test_scatter .....................   Passed    3.76 sec
      Start 31: test_set_value
31/48 Test #31: test_set_value ...................   Passed    4.74 sec
      Start 32: test_slice
32/48 Test #32: test_slice .......................   Passed    1.74 sec
      Start 33: test_softmax
33/48 Test #33: test_softmax .....................   Passed    1.71 sec
      Start 34: test_split
34/48 Test #34: test_split .......................   Passed    7.91 sec
      Start 35: test_squeeze
35/48 Test #35: test_squeeze .....................   Passed    1.68 sec
      Start 36: test_stack
36/48 Test #36: test_stack .......................   Passed    1.82 sec
      Start 37: test_swiglu
37/48 Test #37: test_swiglu ......................   Passed    6.78 sec
      Start 38: test_tile
38/48 Test #38: test_tile ........................   Passed    1.73 sec
      Start 39: test_topk
39/48 Test #39: test_topk ........................   Passed    2.42 sec
      Start 40: test_transpose
40/48 Test #40: test_transpose ...................   Passed    3.69 sec
      Start 41: test_unary_ops
41/48 Test #41: test_unary_ops ...................   Passed    1.87 sec
      Start 42: test_uniform_random
42/48 Test #42: test_uniform_random ..............   Passed    1.69 sec
      Start 43: test_unsqueeze
43/48 Test #43: test_unsqueeze ...................   Passed    1.69 sec
      Start 44: test_where
44/48 Test #44: test_where .......................   Passed    2.22 sec
      Start 45: test_conv_bn_hard_swish_pass
45/48 Test #45: test_conv_bn_hard_swish_pass .....   Passed   10.14 sec
      Start 46: test_conv_bn_pass
46/48 Test #46: test_conv_bn_pass ................   Passed    8.84 sec
      Start 47: test_conv_bn_relu_pass
47/48 Test #47: test_conv_bn_relu_pass ...........   Passed   10.04 sec
      Start 48: test_custom_pass_gcu
48/48 Test #48: test_custom_pass_gcu .............   Passed    4.86 sec

100% tests passed, 0 tests failed out of 48

Total Test time (real) = 157.19 sec

qili93 commented 5 months ago

Thanks for your contribution!