Open Yihao-Shi opened 4 months ago
When I print the kernel profile after running my code, I notice a strange kernel named runtime_retrieve_and_reset_error_code in my profile.
[ 84.93% 0.069 s 1x | 68.741 68.741 68.741 ms] kernel_place_particles__c604_0_kernel_0_serial
[ 3.28% 0.003 s 20x | 0.115 0.133 0.319 ms] grid_reset_c332_0_kernel_0_range_for
[ 1.89% 0.002 s 1x | 1.529 1.529 1.529 ms] kernel_add_body__c608_0_kernel_1_range_for
[ 1.53% 0.001 s 1x | 1.241 1.241 1.241 ms] runtime_initialize
[ 1.14% 0.001 s 3x | 0.280 0.306 0.358 ms] matrix_to_ext_arr_c28_6_kernel_0_range_for
[ 0.87% 0.001 s 97x | 0.003 0.007 0.205 ms] runtime_retrieve_and_reset_error_code
[ 0.77% 0.001 s 3x | 0.178 0.208 0.268 ms] matrix_to_ext_arr_c28_3_kernel_0_range_for
[ 0.66% 0.001 s 3x | 0.177 0.178 0.179 ms] matrix_to_ext_arr_c28_4_kernel_0_range_for
[ 0.47% 0.000 s 1x | 0.383 0.383 0.383 ms] compute_elastic_stiffness_matrix_c232_0_kernel_1_range_for
[ 0.42% 0.000 s 11x | 0.005 0.031 0.095 ms] runtime_memory_allocate_aligned
[ 0.34% 0.000 s 3x | 0.090 0.091 0.093 ms] matrix_to_ext_arr_c28_5_kernel_0_range_for
[ 0.34% 0.000 s 3x | 0.090 0.091 0.091 ms] matrix_to_ext_arr_c28_2_kernel_0_range_for
[ 0.34% 0.000 s 3x | 0.089 0.090 0.091 ms] matrix_to_ext_arr_c28_0_kernel_0_range_for
[ 0.33% 0.000 s 3x | 0.090 0.090 0.091 ms] matrix_to_ext_arr_c28_1_kernel_0_range_for
[ 0.32% 0.000 s 1x | 0.261 0.261 0.261 ms] global_update_c158_0_kernel_1_range_for
[ 0.27% 0.000 s 3x | 0.071 0.073 0.076 ms] matrix_to_ext_arr_c28_7_kernel_0_range_for
[ 0.20% 0.000 s 1x | 0.165 0.165 0.165 ms] kernel_mass_p2g_c350_0_kernel_1_range_for
[ 0.18% 0.000 s 1x | 0.147 0.147 0.147 ms] runtime_initialize_rand_states_cuda
[ 0.17% 0.000 s 1x | 0.139 0.139 0.139 ms] grid_mass_reset_c336_0_kernel_0_range_for
[ 0.16% 0.000 s 1x | 0.130 0.130 0.130 ms] estimate_active_dofs_c178_0_kernel_1_range_for
[ 0.15% 0.000 s 1x | 0.119 0.119 0.119 ms] kernel_apply_vigot_stress__c594_0_kernel_1_range_for
[ 0.13% 0.000 s 3x | 0.032 0.034 0.037 ms] tensor_to_ext_arr_c6_3_kernel_0_range_for
[ 0.12% 0.000 s 3x | 0.030 0.033 0.035 ms] tensor_to_ext_arr_c6_4_kernel_0_range_for
[ 0.11% 0.000 s 3x | 0.028 0.031 0.036 ms] tensor_to_ext_arr_c6_0_kernel_0_range_for
[ 0.11% 0.000 s 3x | 0.028 0.031 0.036 ms] tensor_to_ext_arr_c6_2_kernel_0_range_for
[ 0.11% 0.000 s 3x | 0.029 0.031 0.033 ms] tensor_to_ext_arr_c6_1_kernel_0_range_for
[ 0.11% 0.000 s 1x | 0.091 0.091 0.091 ms] snode_writer_2_kernel_0_serial
[ 0.07% 0.000 s 3x | 0.019 0.020 0.022 ms] tensor_to_ext_arr_c6_5_kernel_0_range_for
[ 0.06% 0.000 s 6x | 0.005 0.008 0.011 ms] snode_reader_68_kernel_0_serial
[ 0.05% 0.000 s 1x | 0.044 0.044 0.044 ms] kernel_initial_state_variables_c228_0_kernel_1_range_for
[ 0.03% 0.000 s 6x | 0.003 0.004 0.006 ms] runtime_initialize_snodes
[ 0.03% 0.000 s 3x | 0.006 0.007 0.009 ms] snode_reader_69_kernel_0_serial
[ 0.03% 0.000 s 1x | 0.020 0.020 0.020 ms] set_displacement_contraint_c292_0_kernel_1_range_for
[ 0.02% 0.000 s 3x | 0.005 0.007 0.009 ms] snode_reader_70_kernel_0_serial
[ 0.02% 0.000 s 3x | 0.004 0.006 0.008 ms] snode_reader_153_kernel_0_serial
[ 0.02% 0.000 s 1x | 0.017 0.017 0.017 ms] kernel_initialize_boundary_c270_0_kernel_0_range_for
[ 0.02% 0.000 s 1x | 0.013 0.013 0.013 ms] global_update_c158_0_kernel_0_serial
[ 0.02% 0.000 s 1x | 0.013 0.013 0.013 ms] snode_reader_124_kernel_0_serial
[ 0.01% 0.000 s 1x | 0.010 0.010 0.010 ms] kernel_mass_p2g_c350_0_kernel_0_serial
[ 0.01% 0.000 s 1x | 0.010 0.010 0.010 ms] snode_writer_69_kernel_0_serial
[ 0.01% 0.000 s 1x | 0.010 0.010 0.010 ms] kernel_initial_state_variables_c228_0_kernel_0_serial
[ 0.01% 0.000 s 1x | 0.009 0.009 0.009 ms] kernel_apply_vigot_stress__c594_0_kernel_0_serial
[ 0.01% 0.000 s 1x | 0.009 0.009 0.009 ms] field_fill_python_scope_c40_0_kernel_0_range_for
[ 0.01% 0.000 s 1x | 0.009 0.009 0.009 ms] snode_writer_70_kernel_0_serial
[ 0.01% 0.000 s 1x | 0.009 0.009 0.009 ms] snode_writer_68_kernel_0_serial
[ 0.01% 0.000 s 1x | 0.009 0.009 0.009 ms] snode_reader_2_kernel_0_serial
[ 0.01% 0.000 s 1x | 0.009 0.009 0.009 ms] snode_writer_71_kernel_0_serial
[ 0.01% 0.000 s 1x | 0.009 0.009 0.009 ms] snode_writer_72_kernel_0_serial
[ 0.01% 0.000 s 1x | 0.009 0.009 0.009 ms] kernel_add_body__c608_0_kernel_0_serial
[ 0.01% 0.000 s 1x | 0.008 0.008 0.008 ms] compute_elastic_stiffness_matrix_c232_0_kernel_0_serial
[ 0.01% 0.000 s 1x | 0.008 0.008 0.008 ms] runtime_initialize_runtime_context_buffer
[ 0.01% 0.000 s 1x | 0.005 0.005 0.005 ms] estimate_active_dofs_c178_0_kernel_0_serial
[ 0.01% 0.000 s 1x | 0.004 0.004 0.004 ms] set_displacement_contraint_c292_0_kernel_0_serial
[ 0.01% 0.000 s 1x | 0.004 0.004 0.004 ms] set_displacement_contraint_c292_0_kernel_2_serial
[ 0.01% 0.000 s 1x | 0.004 0.004 0.004 ms] estimate_active_dofs_c178_0_kernel_2_serial
Does that mean there are errors in my kernel? If so, how can I fix them? Thanks!
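For anyone who wants to put a number on how much of the run these entries account for, here is a minimal sketch that parses profile text like the output above and totals the entries whose names start with `runtime_`. The field layout (percent, total seconds, call count, then min/avg/max in ms, then the kernel name) is inferred from the pasted output only, and grouping by the `runtime_` prefix is just a naming pattern taken from that output, not a statement about what those entries do. `parse_profile` and `LINE_RE` are hypothetical helper names, not part of any library.

```python
import re

# Layout assumed from the pasted profile:
# [ <percent>% <total> s <count>x | <min> <avg> <max> ms] <kernel name>
LINE_RE = re.compile(
    r"\[\s*(?P<pct>[\d.]+)%\s+(?P<total>[\d.]+) s\s+(?P<count>\d+)x\s*\|"
    r"\s*(?P<min>[\d.]+)\s+(?P<avg>[\d.]+)\s+(?P<max>[\d.]+) ms\]\s+(?P<name>\S+)"
)


def parse_profile(text):
    """Return one dict per profile entry found in text (hypothetical helper)."""
    entries = []
    for m in LINE_RE.finditer(text):
        entries.append({
            "name": m.group("name"),
            "pct": float(m.group("pct")),        # share of total time, percent
            "total_s": float(m.group("total")),  # total time, seconds
            "count": int(m.group("count")),      # number of launches
            "avg_ms": float(m.group("avg")),     # average time per launch, ms
        })
    return entries


# Three entries copied from the profile in this post.
sample = """
[ 84.93% 0.069 s 1x | 68.741 68.741 68.741 ms] kernel_place_particles__c604_0_kernel_0_serial
[ 0.87% 0.001 s 97x | 0.003 0.007 0.205 ms] runtime_retrieve_and_reset_error_code
[ 1.53% 0.001 s 1x | 1.241 1.241 1.241 ms] runtime_initialize
"""

entries = parse_profile(sample)

# Entries grouped purely by the "runtime_" name prefix (an assumption).
runtime_entries = [e for e in entries if e["name"].startswith("runtime_")]
runtime_pct = sum(e["pct"] for e in runtime_entries)

for e in runtime_entries:
    print(f'{e["name"]}: {e["count"]} launches, {e["pct"]:.2f}% of total time')
print(f"runtime_* share in this sample: {runtime_pct:.2f}%")
```

Because the pattern is applied with `finditer`, it should also work if the profile is pasted as one long line instead of one entry per line.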