taichi-dev / taichi

Productive, portable, and performant GPU programming in Python.
https://taichi-lang.org
Apache License 2.0

What does runtime_retrieve_and_reset_error_code mean? #8558

Open Yihao-Shi opened 4 months ago

Yihao-Shi commented 4 months ago

When I print the kernel profile after running my code, I see an unfamiliar kernel named runtime_retrieve_and_reset_error_code in the output:

[ 84.93%   0.069 s      1x |   68.741    68.741    68.741 ms] kernel_place_particles__c604_0_kernel_0_serial
[  3.28%   0.003 s     20x |    0.115     0.133     0.319 ms] grid_reset_c332_0_kernel_0_range_for
[  1.89%   0.002 s      1x |    1.529     1.529     1.529 ms] kernel_add_body__c608_0_kernel_1_range_for
[  1.53%   0.001 s      1x |    1.241     1.241     1.241 ms] runtime_initialize
[  1.14%   0.001 s      3x |    0.280     0.306     0.358 ms] matrix_to_ext_arr_c28_6_kernel_0_range_for

[  0.87%   0.001 s     97x |    0.003     0.007     0.205 ms] runtime_retrieve_and_reset_error_code

[  0.77%   0.001 s      3x |    0.178     0.208     0.268 ms] matrix_to_ext_arr_c28_3_kernel_0_range_for
[  0.66%   0.001 s      3x |    0.177     0.178     0.179 ms] matrix_to_ext_arr_c28_4_kernel_0_range_for
[  0.47%   0.000 s      1x |    0.383     0.383     0.383 ms] compute_elastic_stiffness_matrix_c232_0_kernel_1_range_for
[  0.42%   0.000 s     11x |    0.005     0.031     0.095 ms] runtime_memory_allocate_aligned
[  0.34%   0.000 s      3x |    0.090     0.091     0.093 ms] matrix_to_ext_arr_c28_5_kernel_0_range_for
[  0.34%   0.000 s      3x |    0.090     0.091     0.091 ms] matrix_to_ext_arr_c28_2_kernel_0_range_for
[  0.34%   0.000 s      3x |    0.089     0.090     0.091 ms] matrix_to_ext_arr_c28_0_kernel_0_range_for
[  0.33%   0.000 s      3x |    0.090     0.090     0.091 ms] matrix_to_ext_arr_c28_1_kernel_0_range_for
[  0.32%   0.000 s      1x |    0.261     0.261     0.261 ms] global_update_c158_0_kernel_1_range_for
[  0.27%   0.000 s      3x |    0.071     0.073     0.076 ms] matrix_to_ext_arr_c28_7_kernel_0_range_for
[  0.20%   0.000 s      1x |    0.165     0.165     0.165 ms] kernel_mass_p2g_c350_0_kernel_1_range_for
[  0.18%   0.000 s      1x |    0.147     0.147     0.147 ms] runtime_initialize_rand_states_cuda
[  0.17%   0.000 s      1x |    0.139     0.139     0.139 ms] grid_mass_reset_c336_0_kernel_0_range_for
[  0.16%   0.000 s      1x |    0.130     0.130     0.130 ms] estimate_active_dofs_c178_0_kernel_1_range_for
[  0.15%   0.000 s      1x |    0.119     0.119     0.119 ms] kernel_apply_vigot_stress__c594_0_kernel_1_range_for
[  0.13%   0.000 s      3x |    0.032     0.034     0.037 ms] tensor_to_ext_arr_c6_3_kernel_0_range_for
[  0.12%   0.000 s      3x |    0.030     0.033     0.035 ms] tensor_to_ext_arr_c6_4_kernel_0_range_for
[  0.11%   0.000 s      3x |    0.028     0.031     0.036 ms] tensor_to_ext_arr_c6_0_kernel_0_range_for
[  0.11%   0.000 s      3x |    0.028     0.031     0.036 ms] tensor_to_ext_arr_c6_2_kernel_0_range_for
[  0.11%   0.000 s      3x |    0.029     0.031     0.033 ms] tensor_to_ext_arr_c6_1_kernel_0_range_for
[  0.11%   0.000 s      1x |    0.091     0.091     0.091 ms] snode_writer_2_kernel_0_serial
[  0.07%   0.000 s      3x |    0.019     0.020     0.022 ms] tensor_to_ext_arr_c6_5_kernel_0_range_for
[  0.06%   0.000 s      6x |    0.005     0.008     0.011 ms] snode_reader_68_kernel_0_serial
[  0.05%   0.000 s      1x |    0.044     0.044     0.044 ms] kernel_initial_state_variables_c228_0_kernel_1_range_for
[  0.03%   0.000 s      6x |    0.003     0.004     0.006 ms] runtime_initialize_snodes
[  0.03%   0.000 s      3x |    0.006     0.007     0.009 ms] snode_reader_69_kernel_0_serial
[  0.03%   0.000 s      1x |    0.020     0.020     0.020 ms] set_displacement_contraint_c292_0_kernel_1_range_for
[  0.02%   0.000 s      3x |    0.005     0.007     0.009 ms] snode_reader_70_kernel_0_serial
[  0.02%   0.000 s      3x |    0.004     0.006     0.008 ms] snode_reader_153_kernel_0_serial
[  0.02%   0.000 s      1x |    0.017     0.017     0.017 ms] kernel_initialize_boundary_c270_0_kernel_0_range_for
[  0.02%   0.000 s      1x |    0.013     0.013     0.013 ms] global_update_c158_0_kernel_0_serial
[  0.02%   0.000 s      1x |    0.013     0.013     0.013 ms] snode_reader_124_kernel_0_serial
[  0.01%   0.000 s      1x |    0.010     0.010     0.010 ms] kernel_mass_p2g_c350_0_kernel_0_serial
[  0.01%   0.000 s      1x |    0.010     0.010     0.010 ms] snode_writer_69_kernel_0_serial
[  0.01%   0.000 s      1x |    0.010     0.010     0.010 ms] kernel_initial_state_variables_c228_0_kernel_0_serial
[  0.01%   0.000 s      1x |    0.009     0.009     0.009 ms] kernel_apply_vigot_stress__c594_0_kernel_0_serial
[  0.01%   0.000 s      1x |    0.009     0.009     0.009 ms] field_fill_python_scope_c40_0_kernel_0_range_for
[  0.01%   0.000 s      1x |    0.009     0.009     0.009 ms] snode_writer_70_kernel_0_serial
[  0.01%   0.000 s      1x |    0.009     0.009     0.009 ms] snode_writer_68_kernel_0_serial
[  0.01%   0.000 s      1x |    0.009     0.009     0.009 ms] snode_reader_2_kernel_0_serial
[  0.01%   0.000 s      1x |    0.009     0.009     0.009 ms] snode_writer_71_kernel_0_serial
[  0.01%   0.000 s      1x |    0.009     0.009     0.009 ms] snode_writer_72_kernel_0_serial
[  0.01%   0.000 s      1x |    0.009     0.009     0.009 ms] kernel_add_body__c608_0_kernel_0_serial
[  0.01%   0.000 s      1x |    0.009     0.009     0.009 ms] compute_elastic_stiffness_matrix_c232_0_kernel_0_serial
[  0.01%   0.000 s      1x |    0.008     0.008     0.008 ms] runtime_initialize_runtime_context_buffer
[  0.01%   0.000 s      1x |    0.008     0.008     0.008 ms] estimate_active_dofs_c178_0_kernel_0_serial
[  0.01%   0.000 s      1x |    0.005     0.005     0.005 ms] set_displacement_contraint_c292_0_kernel_0_serial
[  0.01%   0.000 s      1x |    0.004     0.004     0.004 ms] set_displacement_contraint_c292_0_kernel_2_serial
[  0.01%   0.000 s      1x |    0.004     0.004     0.004 ms] estimate_active_dofs_c178_0_kernel_2_serial

Does that mean there are errors in my kernel? If so, how can I fix them? Thanks!
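To put that row in perspective, here is a minimal sketch that parses rows of the profile and estimates how much time each kernel accounts for. It assumes the columns are, in order: share of profiled time, total time in seconds, call count, then min / avg / max time per call in milliseconds, followed by the kernel name. That reading is consistent with the numbers above, but it is an assumption, not something stated in the output itself. `ROW` and `parse_row` are hypothetical helper names for illustration, not part of Taichi.

```python
import re

# One profiler row. Assumed column order (not stated in the output itself):
# [ percent  total_s  count | min_ms  avg_ms  max_ms ] kernel_name
ROW = re.compile(
    r"\[\s*(?P<pct>[\d.]+)%\s+(?P<total_s>[\d.]+) s\s+(?P<count>\d+)x \|"
    r"\s+(?P<min_ms>[\d.]+)\s+(?P<avg_ms>[\d.]+)\s+(?P<max_ms>[\d.]+) ms\]"
    r"\s+(?P<name>\S+)"
)


def parse_row(line):
    """Return a dict for one profiler row, or None if the line is not a row."""
    m = ROW.match(line.strip())
    if m is None:
        return None
    return {
        "name": m["name"],
        "pct": float(m["pct"]),
        "count": int(m["count"]),
        "min_ms": float(m["min_ms"]),
        "avg_ms": float(m["avg_ms"]),
        "max_ms": float(m["max_ms"]),
    }


# Two rows copied verbatim from the profile above.
sample = [
    "[ 84.93%   0.069 s      1x |   68.741    68.741    68.741 ms] kernel_place_particles__c604_0_kernel_0_serial",
    "[  0.87%   0.001 s     97x |    0.003     0.007     0.205 ms] runtime_retrieve_and_reset_error_code",
]

rows = [r for r in map(parse_row, sample) if r is not None]

for r in rows:
    # Estimated total = call count * average time per call.
    est_total_ms = r["count"] * r["avg_ms"]
    print(f'{r["name"]}: {r["count"]} calls, ~{est_total_ms:.3f} ms, {r["pct"]:.2f}% of profiled time')
```

Under that reading, runtime_retrieve_and_reset_error_code was called 97 times at about 0.007 ms each, roughly 0.68 ms in total, which matches its 0.87% share. The profile only shows that the routine ran and how long it took; it does not show whether any error code was actually set.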