Adding NDT reinitialization statement and reinitialization while moving

ataparlar commented 2 years ago

Checklist

[X] I've read the contribution guidelines.
[X] I've searched other issues and no duplicate issues were found.
[X] I've agreed with the maintainers that I can plan this task.

Description

NDT initializes the position only once, at the beginning, while the vehicle is motionless. If a topic used by autoware_launch stops streaming while the vehicle is moving and then starts streaming again, NDT reinitialization never starts. Eventually, the vehicle starts to fly or turn around. There is no reinitialization statement for NDT in the code. Localization needs a recovery mechanism.

Here is a video of the issue. https://youtu.be/MffXnseMeg0

Purpose

Make NDT reinitialize correctly while the vehicle is moving.

Possible approaches

Writing a reinitialization statement to NDT node.

Definition of done

This is done when NDT reinitializes itself correctly without flying or turning.

YamatoAndo commented 2 years ago

I think the reinitialization is necessary. However, I think it would be more versatile to put this function in a pose_initializer node or similar, rather than in the NDT node, because it could then be used even when the NDT node is not used.

I think that this feature will be one of the most important features for Autoware, so we need to discuss the architecture in detail. @mitsudome-r @yukkysaito

cyn-liu commented 2 years ago

I have encountered a similar problem: how to reinitialize NDT when matching fails and localization is lost, so that the vehicle starts turning around. The following figure is a screenshot from a simulation of the localization module in which the car passes an intersection and point cloud matching fails; the blue line is the trajectory of the car.

We believe that a reasonable solution in this case would be to first bring the vehicle to a slow stop, and then reinitialize the NDT. However, we have the following questions:

1. What flags are used to monitor localization failure? (My idea is to decide whether localization has failed based on whether the NDT matching score is greater than the threshold value of 2.3.)
2. How can the car be made to stop slowly when localization fails? (The current localization error monitoring module does not notify other modules to take emergency action after a localization failure.)

isamu-takagi commented 2 years ago

How to make the car stop slowly when the localization fails.(the current localization error monitoring module does not notify other modules to take emergency action after the localization failure.)

We are now designing a mechanism to stop the vehicle in an emergency. Related information can be found here (note that this link will be invalid once the PR is merged). In this mechanism, a module that manages emergencies (fail-safe module in the figure on the above link) monitors the score from localization and activates MRM when an abnormality is detected.

YamatoAndo commented 2 years ago

what flags are used to monitor localization failure. (my idea is to determine whether the localization fails based on whether score of the NDT match is greater than the threshold value of 2.3 )

We have the localization_error_monitor node, which detects localization failure using the variance estimated by the EKF. How about using this node?

cyn-liu commented 2 years ago

what flags are used to monitor localization failure. (my idea is to determine whether the localization fails based on whether score of the NDT match is greater than the threshold value of 2.3 )

We have the localization_error_monitor node, which detects localization failure using the variance estimated by the EKF. How about using this node?

We believe that the pose_with_covariance estimated by the EKF, as used in the localization_error_monitor node, is not an accurate way to monitor whether localization has failed, because this variable is not timely. When the NDT score drops below 2.3, we consider localization to have failed at that point, yet the EKF covariance is still small (close to 0); pose_with_covariance only changes significantly once the car is far from the correct position.

The following figure is a screenshot from a simulation of the localization module in which point cloud matching fails; the blue line is the track of the car. We recorded two sets of data during the simulation (the NDT matching score and the pose_with_covariance estimated by the EKF) and plotted them to observe the trends (the red horizontal dashed line in the figure is y=2.3). The x-x and y-y covariances do not change when the matching score first drops below 2.3, and they increase sharply only when the car is far from the correct location.

So we believe that the current EKF-estimated variance (pose_with_covariance) is not effective for monitoring localization failure. The current localization_error_monitor node needs to be modified.

YamatoAndo commented 2 years ago

@cyn-liu I think you didn't use the odometry in your test. I think if you put the odometry in the EKF, the estimated variance would be more reliable.

cyn-liu commented 2 years ago

@cyn-liu I think you didn't use the odometry in your test. I think if you put the odometry in the EKF, the estimated variance would be more reliable.

Thanks for your reply. We will add odometry information to the test to verify whether the variance estimated by the EKF is reliable.

cyn-liu commented 2 years ago

@cyn-liu I think you didn't use the odometry in your test. I think if you put the odometry in the EKF, the estimated variance would be more reliable.

We added odometry information (wheel speed + IMU) and re-ran the experiment. Although the pose_with_covariance output by the EKF now changes smoothly within a small range, this does not solve the timeliness problem. (Perhaps this is a characteristic of the algorithm itself, since the EKF covariance calculation depends on the NDT results.)

We ran several experiments and plotted the NDT score and EKF covariance recorded during localization. The red vertical line in the figures below marks the first time the NDT score is <2.3, and the blue vertical line marks the first time the x-x or y-y covariance of the EKF is >0.1. We treat an x-x or y-y covariance greater than 0.1 (a localization error of about 0.3 m) as a localization failure. In Figure 1, the two vertical lines are about 3 s apart, indicating that the EKF detected the localization failure about 3 s later than NDT. In Figure 2, they are about 4 s apart, indicating a delay of about 4 s. Across several experiments, the EKF covariance detected the failure later than the NDT score, generally by no more than 5 s.

Using the EKF output as the failure flag therefore cannot detect lost localization very promptly, but if the delay is considered acceptable, the variable does correctly reflect whether localization has failed. Using the NDT matching score should be more timely.
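The delay comparison described above can be reproduced on logged data with a short script. This is only a minimal sketch, assuming the NDT score and one EKF covariance term were recorded as (timestamp, value) pairs; the function names and the sample values are hypothetical, and only the thresholds 2.3 and 0.1 come from this thread. It also shows where the roughly 0.3 m figure comes from: a variance of 0.1 m² corresponds to a standard deviation of about 0.32 m.

```python
import math

NDT_SCORE_THRESHOLD = 2.3     # from this thread: a score below this is treated as failure
EKF_VARIANCE_THRESHOLD = 0.1  # from this thread: x-x or y-y variance above this is treated as failure


def first_crossing(samples, predicate):
    """Return the timestamp of the first sample whose value satisfies predicate, or None."""
    for stamp, value in samples:
        if predicate(value):
            return stamp
    return None


def detection_delay(ndt_scores, ekf_variances):
    """Seconds by which the EKF variance flags the failure later than the NDT score."""
    t_ndt = first_crossing(ndt_scores, lambda s: s < NDT_SCORE_THRESHOLD)
    t_ekf = first_crossing(ekf_variances, lambda v: v > EKF_VARIANCE_THRESHOLD)
    if t_ndt is None or t_ekf is None:
        return None  # one of the signals never crossed its threshold
    return t_ekf - t_ndt


# Made-up samples for illustration only: (time in seconds, value).
ndt_scores = [(0.0, 3.1), (1.0, 2.9), (2.0, 2.2), (3.0, 1.8), (4.0, 1.5), (5.0, 1.4)]
ekf_variances = [(0.0, 0.01), (1.0, 0.01), (2.0, 0.02), (3.0, 0.04), (4.0, 0.08), (5.0, 0.15)]

print(detection_delay(ndt_scores, ekf_variances))   # delay in seconds for the sample data
print(round(math.sqrt(EKF_VARIANCE_THRESHOLD), 2))  # standard deviation in metres
```

With real recordings, the two lists would be filled from the logged NDT score and EKF covariance topics instead of the made-up values.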

cyn-liu commented 2 years ago

How to make the car stop slowly when the localization fails.(the current localization error monitoring module does not notify other modules to take emergency action after the localization failure.)

We are now designing a mechanism to stop the vehicle in an emergency. Related information can be found here (note that this link will be invalid once the PR is merged). In this mechanism, a module that manages emergencies (fail-safe module in the figure on the above link) monitors the score from localization and activates MRM when an abnormality is detected.

Thanks for your work. I would like to ask about the progress of this PR so that I can improve the localization module based on it. Looking forward to your reply!

meliketanrikulu commented 2 years ago

How to make the car stop slowly when the localization fails.(the current localization error monitoring module does not notify other modules to take emergency action after the localization failure.)

We are now designing a mechanism to stop the vehicle in an emergency. Related information can be found here (note that this link will be invalid once the PR is merged). In this mechanism, a module that manages emergencies (fail-safe module in the figure on the above link) monitors the score from localization and activates MRM when an abnormality is detected.

Hello. Thanks for your work. I think this is your merged PR about documentation. Could you please share whether there has been any progress in development regarding the case where localization fails?

I think we agree that the NDT score can be used to detect when NDT breaks down. I want to start working on reinitialization based on the score. Do you have any other suggestions about this? Thanks for your suggestions.

meliketanrikulu commented 2 years ago

When we test the NDT score, as you said, the score generally falls below 2.3 when NDT deteriorates. However, we noticed that it does not always fall below this threshold immediately. Here I shared an example video of this. Even when NDT fails, you can see that the score does not fall below the threshold for a long time. Therefore, we doubt that this score can be used as a reliable indicator of localization degradation.

mitsudome-r commented 2 years ago

In yesterday's ASWG, we reviewed the scope of this issue. I have confirmed with LeoDrive that their current demand is to detect localization failure as quickly as possible. We can discuss automatic reinitialization as a mid/long-term goal.

I'm currently asking the localization team whether they have any tools to help you find the best threshold, or any ideas that might help you detect localization failure.

cyn-liu commented 2 years ago

When we test the ndt score, as you said, in general the score falls below 2.3 when ndt deteriorates. However, we noticed that it does not immediately fall below this threshold value every time. Here I shared an example video of this. Even if the NDT fails, you can see here that it does not fall below the NDT score for a long time. Therefore, we have doubts that this score can be used as a reliable source for localization degradation.

To be honest, I don't think the NDT matching score is the best choice for determining whether localization has failed. I only found the NDT matching output to be more reliable than the EKF output for judging whether localization has failed.

In the test video you shared, the NDT matching score stays above the threshold for a while after localization is lost. Did you check the iteration_num variable when this happened? It may already have been greater than the threshold of 32. Usually the NDT score or iteration_num crosses its threshold after localization is lost.
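The combined check on score and iteration count can be sketched as below. This is an illustrative sketch only: the `NdtResult` container and the `is_converged` function are hypothetical names, not the NDT node's actual code, and how a value exactly equal to a threshold is treated is an assumption. Only the two threshold values (2.3 and 32) are taken from this thread.

```python
from dataclasses import dataclass

SCORE_THRESHOLD = 2.3  # threshold mentioned in this thread
MAX_ITERATIONS = 32    # threshold mentioned in this thread


@dataclass
class NdtResult:
    """Hypothetical container for the result of one NDT alignment."""
    score: float
    iteration_num: int


def is_converged(result: NdtResult) -> bool:
    """Treat the match as converged only if both the score and iteration checks pass."""
    score_ok = result.score >= SCORE_THRESHOLD               # assumption: equal counts as OK
    iterations_ok = result.iteration_num <= MAX_ITERATIONS   # assumption: equal counts as OK
    return score_ok and iterations_ok


print(is_converged(NdtResult(score=3.0, iteration_num=10)))
print(is_converged(NdtResult(score=3.0, iteration_num=50)))
```

In the second call the score alone looks healthy, but the iteration count exceeds the limit, so the result is still flagged as not converged.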

meliketanrikulu commented 2 years ago

Thanks for your reply. Yes, our iteration_number is 50. Maybe that is the reason for our results; I will check. But I think it takes much, much longer to react than it should. I also honestly think that the NDT score is not a reliable way to detect localization errors. What do you think about comparing the GNSS position with the localization output? Maybe we can get a faster result.
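A comparison between the GNSS position and the localization output could look roughly like this. It is a minimal sketch under two assumptions: both positions are already expressed as (x, y) in the same map frame, and the `max_distance_m` tolerance is a hypothetical placeholder that would have to be chosen above the expected GNSS error for the test area. The function names are illustrative, not part of Autoware.

```python
import math


def horizontal_distance(pose_a, pose_b):
    """Euclidean distance in the x-y plane between two (x, y) positions in the same frame."""
    return math.hypot(pose_a[0] - pose_b[0], pose_a[1] - pose_b[1])


def gnss_disagrees(gnss_xy, localization_xy, max_distance_m=2.0):
    """Return True if the GNSS position and the localization output differ by more
    than max_distance_m (a hypothetical tolerance, not a value from this thread)."""
    return horizontal_distance(gnss_xy, localization_xy) > max_distance_m


print(gnss_disagrees((10.0, 5.0), (10.5, 5.2)))  # small offset, within tolerance
print(gnss_disagrees((10.0, 5.0), (13.0, 9.0)))  # 5 m offset, outside tolerance
```

Such a check is only as good as the GNSS fix, so the tolerance would need to be relaxed wherever the GNSS accuracy degrades.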

cyn-liu commented 2 years ago

GNSS usually requires RTK correction to achieve centimeter-level accuracy, but due to occlusion (mountains, shrubs, tall buildings, tunnels, etc.) and poor 4G signal coverage, RTK is always in not-fixed status and the accuracy will drop to meter-level, so it is important to consider a good signal throughout the test road when using GNSS.

As my test below shows, using the integrated GNSS position for initialization (the blue arrows in the figure indicate the GNSS position and heading), I found that even when NDT matching localizes successfully, the GNSS position may be inaccurate or even deviate by more than 5 m. Therefore, NDT matching localization is more reliable while the vehicle is running at high speed (in scenarios with rich point cloud features). Video of one bag's test results

meliketanrikulu commented 2 years ago

Hello @cyn-liu . You are not using RTK during this test, are you, or is your RTK connection dropping frequently? Your errors seem a little more than expected. Sometimes incorrect gnss-ins configurations can also cause this error. When we drive using RTK, our error values go up to a maximum of 40-50 cm (we work in an area with a lot of trees). During the general driving, our errors are 2 cm or less. Even when we tested it without using RTK, our error is around 1.5 m maximum. Therefore, we consider it a reliable source.

cyn-liu commented 2 years ago

We used RTK. The result of this bag test is only an extreme case; in general, the error is not large.

mitsudome-r commented 2 years ago

@cyn-liu Are you still investigating this issue?

cyn-liu commented 2 years ago

@cyn-liu Are you still investigating this issue?

I don't have a better idea about which flag to use for faster and more accurate detection of lost localization. The current localization module can already initialize automatically when the vehicle is stopped, but if the vehicle loses localization while it is moving, automatic initialization is not possible because the current localization module does not notify other modules to stop the vehicle.

The remaining problem is to achieve automatic initialization of localization while the vehicle is running. I think the following two things need to be done.

1. Select a localization loss flag.
2. Notify other modules that localization has been lost and that the vehicle needs to stop.

For 1, my previous view was to use the NDT matching convergence flag (NDT matching score less than the threshold and NDT iterations greater than the threshold) as the flag for lost localization, because it detects the loss faster than the pose covariance of the EKF module. After the discussion, however, that flag does not seem to be the best choice either, so I don't have a better solution yet.

For 2, to make the vehicle stop smoothly, the localization loss information needs to be sent to the planning module or the control module. This needs to be discussed and implemented together with the developers of those modules, so I have not made progress.

xmfcx commented 2 years ago

In the current state of the Autoware,

The localization nodes report performance,

Using https://github.com/autowarefoundation/autoware.universe/blob/main/system/system_error_monitor/config/diagnostic_aggregator/localization.param.yaml#L27 the system_error_monitor publishes HazardStatus

Then this message is fed to the emergency_handler to perform the emergency stop.

wjxjmj commented 2 years ago

When we test the ndt score, as you said, in general the score falls below 2.3 when ndt deteriorates. However, we noticed that it does not immediately fall below this threshold value every time. Here I shared an example video of this. Even if the NDT fails, you can see here that it does not fall below the NDT score for a long time. Therefore, we have doubts that this score can be used as a reliable source for localization degradation.

I encountered the same problem in my tests. Theoretically, the NDT score measures how well the point cloud matches the map, so the score should differ significantly between successful and failed localization. But, as in the previous discussion, this was not reflected in the test.

It was observed that the NDT score remained at a certain level when localization failed, implying that most of the locations in the map could be partially matched with the point cloud information. This leads me to guess that the problem is due to the ground information.

Since both map building and vehicle positioning are done on the road, the shape of the vehicle vicinity is similar: they are both flat. If the ground information is removed during localization, it may be possible to increase the difference in NDT scores between successful and unsuccessful localization.

I have been working with @cyn-liu to verify this idea. Since an algorithm for removing the ground from the point cloud is already included in Autoware, we give a detailed comparative study in this new issue.
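To illustrate the idea of discarding ground points before matching, here is a deliberately naive sketch. It assumes a flat ground plane at a known height with the z axis pointing up, and it is not the ground-removal algorithm that ships with Autoware; the function name and the `ground_z` and `margin` parameters are hypothetical.

```python
def remove_ground(points, ground_z=0.0, margin=0.2):
    """Keep only points more than `margin` metres above an assumed flat ground plane.

    points: iterable of (x, y, z) tuples with z pointing up.
    ground_z and margin are hypothetical values chosen for illustration.
    """
    return [p for p in points if p[2] > ground_z + margin]


# Made-up points: two near the ground, two on nearby structures.
cloud = [(1.0, 0.0, 0.02), (2.0, 1.0, 0.10), (3.0, -1.0, 1.50), (4.0, 2.0, 2.30)]
print(remove_ground(cloud))
```

Only the two elevated points remain, so a matcher fed with the filtered cloud would score against structures rather than the flat road surface that looks alike almost everywhere.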

cyn-liu commented 2 years ago

In the current state of the Autoware,

The localization nodes report performance,

Using https://github.com/autowarefoundation/autoware.universe/blob/main/system/system_error_monitor/config/diagnostic_aggregator/localization.param.yaml#L27 the system_error_monitor publishes HazardStatus

Then this message is fed to the emergency_handler to perform the emergency stop.

Thanks for your answer, but I am not familiar with the Diagnostic Aggregators module and it will take me some time to understand and use it.

mitsudome-r commented 2 years ago

Relevant issue: https://github.com/autowarefoundation/autoware.universe/issues/2044

stale[bot] commented 1 year ago

This pull request has been automatically marked as stale because it has not had recent activity.

stale[bot] commented 11 months ago

This pull request has been automatically marked as stale because it has not had recent activity.
