Insufficient time sync / pose accuracy of depth data

CJuette commented 2 years ago

Hello!

I recently started working with benchbot and, while investigating the data it provides, I was surprised to find that the depth data is occasionally quite far off.

Here is an example from the miniroom sequence, where I plotted the points that I got from the observations-dict, transformed to map frame, using a custom agent. I plotted the points in black with an opacity of 10%, so completely black points show good registration.

Laser scanner: laser_topdown_crop

Slice of depth-channel: pcl_topdown_crop

Obviously the scanner image is "darker" overall because the scan always covers 360°, while the camera only points in certain directions. However, you can also see that the registration for the depth camera is sometimes a bit off (on the right and on the top), and occasionally really off (on the left).
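For anyone who wants to reproduce plots like these, the map-frame transform can be sketched roughly as follows. This is a minimal sketch, not the code used for the images above: the function names, the `(x, y, yaw)` pose layout, and the scan parameters are assumptions for illustration, and the actual layout of the observations dict should be checked against the BenchBot documentation.

```python
import math


def scan_to_points(ranges, angle_min, angle_increment):
    """Convert polar laser ranges into (x, y) points in the sensor frame.

    Hypothetical helper: assumes ranges in metres and angles in radians.
    Non-finite ranges (no return) are skipped.
    """
    points = []
    for i, r in enumerate(ranges):
        if math.isfinite(r):
            a = angle_min + i * angle_increment
            points.append((r * math.cos(a), r * math.sin(a)))
    return points


def to_map_frame(points_robot, pose):
    """Transform 2D points from the robot frame into the map frame.

    points_robot: iterable of (x, y) tuples in the robot frame.
    pose: assumed (x, y, yaw) of the robot in the map frame, yaw in radians.
    """
    px, py, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    # Rotate by yaw, then translate by the robot position.
    return [(px + c * x - s * y, py + s * x + c * y) for x, y in points_robot]
```

Plotting the transformed points from every frame on one top-down scatter plot with a low opacity makes well-registered surfaces appear dark, while frames with a stale pose or stale depth image show up as faint, offset copies of the walls.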

The benchbot_run command was the following: ./bin/benchbot_run -f --robot carter_omni --env miniroom:1 --task semantic_slam:passive:ground_truth

I was also able to reproduce this with the house environment; I did not investigate other environments yet. I wonder if this might also be related to #44, since the robot moves quite erratically at times.

Anyway, I hope you can help with this. Thank you in advance! :)

btalb commented 2 years ago

Thanks for a great bug report, @CJuette; your debugging process is very helpful.

We'll try and see if we can reproduce this locally. Am I correct in guessing that it appears to be a single frame of depth that was off by an unacceptable amount?

In the meantime, can you check whether you can reproduce it in active mode for us, please? Something as simple as rotating the robot 30 degrees every action at the starting pose, and letting it run, should be enough to see it.

Bugs like these are unfortunately hard to track down until we can reliably reproduce them.

@david2611 do you remember seeing anything like this in your testing?

CJuette commented 2 years ago

Hi @btalb, thanks for your reply!

In the miniroom sequence, there was one frame that was off by a significant amount, but also others that were off by a small amount, which I guess might be tolerable, though not ideal. In the house sequence there were a couple of frames that were off significantly. I will try to also provide you with an example from that scene.

I also did a couple of experiments in active mode. I'm omitting the laser scan examples here since they always seem to look good.

24x Angular Rotation by 30°: pcl_topdown_move30_crop

24x Angular Rotation by 50°: pcl_topdown_move50_crop

As you can see, it doesn't look as bad as the passive trajectory, but the alignment is still occasionally poor. After trying this, I additionally tried waiting 2 seconds after every action (by sleeping 2s, then submitting a rotation of 0°, and only then using the input).

24x Angular Rotation by 30° with 2s waiting after an action: pcl_topdown_move30_wait2_crop

24x Angular Rotation by 50° with 2s waiting after an action: pcl_topdown_move50_wait2_crop

As you can see, this already looks way better. So it might be a quick workaround, if not a solution, just to wait a bit for the robot to settle before calling the callback with the observations.
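The waiting approach described above can be sketched roughly like this. It is a minimal, self-contained sketch rather than the agent actually used: the class name and its methods are hypothetical, it does not subclass anything from the BenchBot API, and the `'move_angle'` action name with an `'angle'` argument is an assumption that should be checked against the available action list.

```python
import time


class SettlingRotationAgent:
    """Hypothetical agent: rotate in fixed steps, letting the robot settle first."""

    def __init__(self, step_deg=30.0, n_steps=24, settle_s=2.0, sleep=time.sleep):
        self.step_deg = step_deg
        self.n_steps = n_steps
        self.settle_s = settle_s
        self._sleep = sleep  # injectable so the logic can be tested without waiting
        self._steps_done = 0
        # The first observations arrive while the robot is at rest.
        self._settled = True
        self.used_observations = []

    def is_done(self):
        return self._steps_done >= self.n_steps

    def pick_action(self, observations):
        if not self._settled:
            # Observations taken right after a real move may be poorly
            # synchronized: discard them, wait, and request a 0° rotation
            # purely to obtain a fresh set of observations.
            self._sleep(self.settle_s)
            self._settled = True
            return 'move_angle', {'angle': 0.0}
        # The robot has settled, so these observations are used.
        self.used_observations.append(observations)
        self._settled = False
        self._steps_done += 1
        return 'move_angle', {'angle': self.step_deg}
```

Only every second set of observations is used, so a run takes roughly twice as many actions plus the sleep time, which is the cost of this workaround.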

btalb commented 2 years ago

Thanks @CJuette, that information helps a lot.

We'll have a look internally and see if there's an appropriate fix. Tracking down the cause might be difficult, especially if it's system hardware dependent.

So, as much as it's not my preference, adding a sleep might end up being the fix.

@david2611 will have a look into this next week.

david2611 commented 2 years ago

Hi @CJuette. Sorry for being so delayed in responding to your issue. As far as we can tell, this is directly linked to the Omniverse simulator, which seems to not always provide synchronized data. We had hoped that this would be sorted out in the port to Omniverse (the bug was originally in Unreal, as shown in #12).

As suggested, to "fix" this we have added a very short delay between when a move action has been completed and when its observations can be sent through to the user. The delay is currently set to 0.6 seconds and the change is currently available on the develop branch (to be pushed to master shortly). If you want to try it out before the master push, just run benchbot_install -b develop to sync your benchbot system with the current develop branch for all repos.

If this is still insufficient on your hardware, just let us know :+1:

qcr / benchbot

Insufficient time sync / pose accuracy of depth data #62