Open ShangWeize opened 2 years ago
That line only logs the results for the current scene inside the loop, so it does not correspond to any of the reported results. The final results are printed after the loop finishes, at L789 in nuscenes_test.py. Because the static classes are not one-hot (a pixel can be both a crosswalk and a drivable area), we report them separately. The object classes are one-hot, so we also report a confusion matrix for them.
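To illustrate the distinction between the two kinds of labels, here is a minimal NumPy sketch. It is not the code from nuscenes_test.py: the function names `per_class_iou` and `confusion_matrix`, the choice of IoU as the per-class score, and the tiny arrays are all assumptions made for illustration. It scores each class on its own for multi-label masks, and builds a confusion matrix when every pixel carries exactly one label.

```python
import numpy as np


def per_class_iou(pred, gt):
    """IoU per class for multi-label masks of shape (H, W, C).

    Each channel is scored independently, so a pixel may be positive
    in several channels at once (e.g. crosswalk AND drivable area).
    """
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    inter = np.logical_and(pred, gt).sum(axis=(0, 1))
    union = np.logical_or(pred, gt).sum(axis=(0, 1))
    # Classes absent from both prediction and ground truth get NaN.
    return np.where(union > 0, inter / np.maximum(union, 1), np.nan)


def confusion_matrix(pred_labels, gt_labels, num_classes):
    """Rows index the ground-truth class, columns the predicted class.

    Only meaningful when every pixel has exactly one label (one-hot).
    """
    idx = gt_labels.ravel() * num_classes + pred_labels.ravel()
    counts = np.bincount(idx, minlength=num_classes * num_classes)
    return counts.reshape(num_classes, num_classes)


# Hypothetical 2x2 example with two overlapping static classes.
gt_static = np.stack([np.array([[1, 1], [1, 0]]),    # class 0
                      np.array([[1, 0], [0, 0]])],   # class 1
                     axis=-1)
pred_static = np.stack([np.array([[1, 1], [0, 0]]),
                        np.array([[1, 0], [0, 1]])],
                       axis=-1)
static_iou = per_class_iou(pred_static, gt_static)

# Hypothetical one-hot object labels, flattened to 4 pixels.
gt_obj = np.array([0, 1, 2, 1])
pred_obj = np.array([0, 2, 2, 1])
obj_cm = confusion_matrix(pred_obj, gt_obj, num_classes=3)

print(static_iou)
print(obj_cm)
```

A confusion matrix assumes that each pixel belongs to exactly one row and one column, which is why it applies only to the one-hot case.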
Thank you so much for clearing up my confusion!
Hello, I have the following three questions about the test:
1. The following two lines of code both assign a value to bev_total_relative_endpoints. What is the difference between them?
bev_total_relative_endpoints = [combined_end]
bev_total_relative_endpoints = [tf.concat([combined_end, bigger_resized_combined_projected_estimates], axis=-1)]
2. I don't quite understand the difference between total_input and bev_total_relative_endpoints in mem_net.my_bev_object_decoder. I also don't understand what role endpoints play in the network architecture as a whole. I searched extensively for this and found no relevant answers.
1) The version we used in the paper is bev_total_relative_endpoints = [combined_end], which is the version used in nuscenes_test.py and is compatible with the provided checkpoint. The other one, bev_total_relative_endpoints = [tf.concat([combined_end, bigger_resized_combined_projected_estimates], axis=-1)], was an experimental version. You can train with it and change test.py accordingly, or use the version in nuscenes_test.py to reproduce the results.
2) Endpoints are the intermediate representations of the encoder (backbone) that are passed to the decoder to provide low-level but high-resolution information. I recommend reading the original U-Net paper.
3) If you remove the image, what is the method going to use as input? Setting the batch size to 0 might have been interpreted by TensorFlow as any batch size.
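The role of endpoints, and the difference between the two assignments, can be sketched with plain NumPy arrays. This is only an illustration under stated assumptions: `np.concatenate` stands in for `tf.concat`, the helpers `upsample2x` and `decoder_step` are hypothetical, and all tensor shapes are made up rather than taken from the repository.

```python
import numpy as np


def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (batch, H, W, C) array."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def decoder_step(decoder_feat, endpoint):
    """U-Net style skip connection (hypothetical helper).

    The coarse decoder features are upsampled, then the encoder
    endpoint is concatenated on the channel axis so the decoder also
    sees low-level, high-resolution information.
    """
    up = upsample2x(decoder_feat)
    assert up.shape[:3] == endpoint.shape[:3], "spatial sizes must match"
    return np.concatenate([up, endpoint], axis=-1)


# Made-up shapes: coarse decoder features and a finer encoder endpoint.
coarse = np.zeros((1, 25, 25, 64), dtype=np.float32)
combined_end = np.ones((1, 50, 50, 32), dtype=np.float32)

fused = decoder_step(coarse, combined_end)
print(fused.shape)  # channels add up: 64 + 32

# The two assignments from the question, with an assumed 8-channel
# tensor standing in for bigger_resized_combined_projected_estimates.
extra = np.zeros((1, 50, 50, 8), dtype=np.float32)
variant_paper = [combined_end]
variant_experimental = [np.concatenate([combined_end, extra], axis=-1)]

print(variant_paper[0].shape[-1], variant_experimental[0].shape[-1])
```

In this sketch the experimental variant hands the decoder more input channels than the paper variant. A decoder whose first layer was trained on one channel count would generally not load weights trained on the other, which is presumably why test.py has to be changed to match whichever variant was used for training.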
Hi, I used the model you provided: bev-stitch-nusc. When running test.py, the output comes from this line: temp_string = "Iteration : " + str(iteration) + " : Scene " + str(my_scene_token)+ " - j1: " + str(np.mean(temp_res,axis=0)) Here j1 represents the accuracy for the 4 static classes. Which entry of j1 corresponds to which result in the paper?
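For reference, here is a small sketch of what np.mean(temp_res, axis=0) produces in a log line like the one above. It assumes that temp_res has shape (num_frames, 4), with one column per static class; the scene tokens, the scores, and the final aggregation over all frames are invented for illustration and are not taken from test.py.

```python
import numpy as np

# Hypothetical per-frame scores for two scenes, shape (num_frames, 4).
scenes = {
    "scene-a": np.array([[0.8, 0.4, 0.6, 0.2],
                         [0.6, 0.2, 0.4, 0.4]]),
    "scene-b": np.array([[0.4, 0.6, 0.2, 0.8]]),
}

per_scene = {}
all_frames = []
for iteration, (my_scene_token, temp_res) in enumerate(scenes.items()):
    # axis=0 averages over frames, leaving one value per class.
    j1 = np.mean(temp_res, axis=0)
    per_scene[my_scene_token] = j1
    print("Iteration : " + str(iteration) + " : Scene " + str(my_scene_token)
          + " - j1: " + str(j1))
    all_frames.append(temp_res)

# One possible aggregate after the loop (an assumption, not the repo's
# exact procedure): average every frame from every scene per class.
final = np.mean(np.concatenate(all_frames, axis=0), axis=0)
print("Final per-class mean:", final)
```

Under these assumptions the per-scene j1 values differ from scene to scene, and only the aggregate computed after the loop is a single set of four numbers.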