IntelRealSense / librealsense

Intel® RealSense™ SDK
https://www.intelrealsense.com/
Apache License 2.0

T265 pose data nan #4518

Closed mickeyouyou closed 4 years ago

mickeyouyou commented 5 years ago

Required Info
Camera Model { T265 }
Firmware Version (Open RealSense Viewer --> Click info)
Operating System & Version { Linux (Ubuntu 18.04) }
Kernel Version (Linux Only) (e.g. 4.14.13)
Platform NVIDIA Jetson Nano
SDK Version { legacy / 2.<?>.<?> }
Language {C++ }
Segment {Little Automobile }

Issue Description

I use RealSense integrated with Apollo Cyber RT (https://github.com/mickeyouyou/realsenseOnCyber), with librealsense SDK version 2.21.0.

Recently we have been hitting this problem of unstable pose data: 4 times in 2 days. Is it fixed in a newer SDK such as 2.24?

image

kylesaltmarsh commented 5 years ago

We are also experiencing this issue with the linear acceleration/angular velocity giving nan values. This is then breaking our broader SLAM solution which is using the T265 as a VIO sensor.
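Until the root cause is fixed, one way to keep NaN samples from breaking a downstream SLAM or ROS consumer is to gate every pose sample before forwarding it. The following is a minimal sketch, not part of librealsense: `pose_is_finite` and `PoseGate` are hypothetical names, and it assumes the pose fields (translation, velocity, acceleration, and so on) have already been flattened into a sequence of floats.

```python
import math
from typing import Iterable, Optional, Sequence, Tuple


def pose_is_finite(values: Iterable[float]) -> bool:
    """Return True only if every component is a finite number (no NaN, no inf)."""
    return all(math.isfinite(v) for v in values)


class PoseGate:
    """Hypothetical helper: accepts finite pose samples, rejects non-finite ones."""

    def __init__(self) -> None:
        self.last_good: Optional[Tuple[float, ...]] = None  # last accepted sample
        self.rejected = 0  # number of samples withheld so far

    def accept(self, values: Sequence[float]) -> bool:
        """Record and accept a finite sample; count and reject anything else."""
        if pose_is_finite(values):
            self.last_good = tuple(values)
            return True
        self.rejected += 1
        return False
```

A consumer would call `gate.accept(sample)` for each incoming sample and forward it only when the result is `True`. What to do on rejection (hold `last_good`, stop the vehicle, restart the tracker) is an application decision.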

mickeyouyou commented 5 years ago

Are there any updates on this issue?

harville commented 5 years ago

@kylesaltmarsh and @mickeyouyou: Can you describe the most common scenario(s) (i.e. what sequence of motions, what kind of environment, any motion happening in scene, etc.) that cause the NaN values to occur? Also, the new 2.26.0 version of librealsense fixes some issues with recording T265 raw data, so we might consider the option of trying to have you make a recording of usage leading up to a failure like this, to help us debug.

mickeyouyou commented 5 years ago

We use Apollo Cyber RT as our framework; there is a lite version here: https://github.com/mickeyouyou/apollo_lite

Scenario: we launch the RealSense driver for Cyber RT and the data is normal at first, but the problem appears after it has run for some minutes.

rgreid commented 5 years ago

@harville replying for @kylesaltmarsh: in our case, in the ~10 times we've had the pose estimate turn into nans, the T265 has been in a feature-rich scene and not moving. An odd situation for the filter to diverge! A couple of times the same sensor has diverged while in motion (we've observed the pose estimate drifting rapidly away to infinity). We'll aim to test 2.26.0 with a few different sensors.

djkayip commented 5 years ago

I have a T265 attached to a marathon runner to correct the positional data from GPS. After running on the track for 2-3 minutes, the T265 starts giving us NaN. Is that because of a limitation of VSLAM? Do I need to avoid rapid movement?

mattcenoo commented 5 years ago

Hi, I'm also experiencing the issue with FW version 0.1.0.279 and SDK version v2.27.0. If I switch the option RS2_OPTION_ENABLE_MAPPING off, the pose data turns to NaN within a few minutes; if this option is on, the issue seldom appears. Any suggestions?

RealSenseCustomerSupport commented 5 years ago

@mickeyouyou, @mattcenoo, @djkayip, @rgreid

Hi, can you also add still images of the environment or scene that the T265 is seeing when you're hitting this NaN issue? It's a bit tough to get actual recordings from when you're hitting the NaN issue.

There is an outline in this thread that says the device is on a runner; how about some of the other scenarios? @rgreid - You mention the device just sitting there; was there any motion before having the device just sit there? You also mention failures while in motion; what specific motion or movement was done?

Thanks

djkayip commented 5 years ago
KakaoTalk_Photo_2019-10-05-08-15-34

This is the result when the device is on a runner, in any circumstance. After roughly 3-4 minutes, it starts to give NaN. The Y axis value plummets, and then NaN appears.

I just wonder whether VSLAM can manage this or not.

PedroHRPBS commented 4 years ago

Hello,

I am also experiencing this issue while using the T265 as a vision positioning system on my drone. The camera points down, with the USB cable toward the back. Sometimes when I leave the drone on the table, I start to receive these NaN values. While flying inside the lab, the issue never happened; the floor is made of different-colored tiles, so there are a lot of features. My test procedure is to fly a square-shaped waypoint mission that repeats itself, and check whether the drone is able to maintain the previous positions. In the lab the results were great: it keeps a pretty consistent square shape and it returns to the correct position on landing.

Yesterday I went for the first outdoor trial and the drone crashed. Unfortunately, I didn't save the data of the flight, but I'll try to explain the procedure. I tried first the same square-shape waypoint mission. But the drone started to lose altitude when it reached a less-feature ground area. The behavior repeated itself after 2 or 3 trials. And every time, when I looked at the PC, the NaN values were there, crashing my ROS nodes.

So I decided to fly it on Position Hold and check the behavior manually. https://drive.google.com/file/d/0B-NC33645NBsQ3daOVRXNnJGbjRKdEtIYUVOY2trQmF2WlBV/view?usp=sharing

In the video you can see that while the drone is kept over a more featured area (the center with red+green and basketball lines) the position is pretty fixed. When I move to an area with fewer features, the drone loses altitude again. I tried coming back to the center to regain control, and at first everything looked fine, but after a few seconds I completely lost control of it, and it crashed when I switched to manual mode to try to keep it from hitting the cage around the area.

After the crash, I looked at the PC and it was all red texts with NaN values. I should have saved the logs, but I was too worried with the drone itself, tbh. But I'd like to offer my setup for tests to try to solve this issue.

I'm using a NUC, Intel T265 and Intel D435i. My launching procedure on ROS is: Cameras -> Mavros -> node that makes them communicate (Vision-to-mavros). It always worked fine in the lab, both the positioning and the mapping (with RTabmap and octomap).

Please let me know how I can help more on this issue.

[ INFO] [1570687929.773274521]: RealSense ROS v2.2.5
[ INFO] [1570687929.773314368]: Running with LibRealSense v2.24.0

[ INFO] [1570687938.837467059]: Device Name: Intel RealSense T265
[ INFO] [1570687938.837495249]: Device Serial No: 908412111251
[ INFO] [1570687938.837518696]: Device FW version: 0.0.18.5715

RealSenseCustomerSupport commented 4 years ago

Thanks @PedroHRPBS for the video and the information. Really appreciate it. We will review it and if we have any further questions we will let you know.

neilyoung commented 4 years ago

I can also confirm receiving NaN, mostly out of the blue, after varying amounts of time, with the device standing still on the table.

manomitbal commented 4 years ago

Hi! I have this issue opened for the realsense ROS wrapper as well : https://github.com/IntelRealSense/realsense-ros/issues/955

neilyoung commented 4 years ago

@manomitbal Thanks for cross linking the issues. This is really an annoying problem, since it comes out of the blue, even if the scene is feature-rich. Maybe it is because the device is not moving? Anyway, right now I have no idea how to cope with this. Restarting the script does not help. Unplug/replug USB helps. Nice for robots, which can do that with their left arms, of course :))

neilyoung commented 4 years ago

Disregard, wrong thread :)

RealSenseCustomerSupport commented 4 years ago

Hi @PedroHRPBS @manomitbal @mickeyouyou, @mattcenoo, @djkayip, @rgreid

An updated LibRealSense release has been posted: https://github.com/IntelRealSense/librealsense/releases/tag/v2.30.0

There are updates which can have a positive effect on the NaN issues that you are seeing.

Please give this new release a test in your environments and let us know what you are seeing.

Thanks

rgreid commented 4 years ago

@RealSenseCustomerSupport sorry: it seems much worse today. Mounted on a skid-steer robot in a visually feature rich environment, both with and without odom input.

The nans occur consistently a short time after we finish driving around a small ~15 meter loop.

Previously we were getting nans several times a day, now we're getting them reliably every time we drive a test loop. Thanks.

Device Name: Intel RealSense T265
Device FW version: 0.2.0.857

BriceRenaudeau commented 4 years ago

I could reproduce the NaNs by moving my robot up against a white wall (docked position). The problem is that the camera cannot recover from this NaN state and I have to relaunch the driver.

harville commented 4 years ago

@rgreid Thanks for this feedback. In our testing, the updates in release 2.30.0 completely fixed some scenarios, but did not solve others. We did not observe it making any scenario worse, but we cannot try every possibility. It sounds like you have a scenario that reliably reproduces the problem, which is very not good for you, but could be helpful for our debugging and understanding. Can you please give some more detail about exactly what the scenario is? For example:

1) How long does the device take to go around the 15m loop, and does it sit static for a while (how long?) before nan poses occur?
2) How much vibration do you think the device is undergoing? And/or, any quick accelerations that could cause it to briefly shake in a strong way?

@BriceRenaudeau Also, note that tracker restart, after NaN poses are encountered, can be accomplished without unplugging/re-plugging the USB connection, by calling pipe.stop() followed by pipe.start(). We do not consider this a proper fix for the NaN issue discussed here, because this will start a new tracking session, which is not desirable in most application contexts. But it's better than having to physically unplug and replug the device.
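The stop/start workaround described above can be sketched as a small check that runs on each pose sample. This is only an illustration under stated assumptions: `restart_if_nan` is a hypothetical helper, `StubPipeline` is a stand-in that merely records calls so the sketch runs without a camera, and the only behaviour assumed of the real pipeline object is that it exposes `stop()` and `start()` as mentioned in the comment.

```python
import math
from typing import Iterable, List


class StubPipeline:
    """Stand-in for a pipeline object (not an SDK class); it only records calls."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def start(self) -> None:
        self.calls.append("start")

    def stop(self) -> None:
        self.calls.append("stop")


def restart_if_nan(pipe, pose_values: Iterable[float]) -> bool:
    """Restart the tracker when any pose component is NaN or infinite.

    Returns True if a restart was triggered. A restart begins a new tracking
    session, so the caller is responsible for re-anchoring the pose.
    """
    if all(math.isfinite(v) for v in pose_values):
        return False
    pipe.stop()
    pipe.start()
    return True
```

With a real device, `pipe` would be the application's existing pipeline object, and `start()` would likely need the same configuration that was used for the first start. Because the pose origin resets after a restart, the application probably needs to store the last valid pose and apply it as an offset.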

BriceRenaudeau commented 4 years ago

Thanks @harville, I am using the camera through ROS. I am not unplugging the camera; I just kill and restart the ROS node.

ArkadiuszNiemiec commented 4 years ago

I have a RealSense facing down, around 10-15 cm from the ground, looking at the carpet. After 15-20 minutes without any movement, the position starts to drift and eventually gives NaNs. No vibrations etc.

partlygloudy commented 4 years ago

@PedroHRPBS @RealSenseCustomerSupport I'm wondering if you've been able to solve this issue? We've been experiencing almost the exact same issue in the past few days. We're using a T265 on a drone - it's about 1kg, Pixhawk flight controller, pretty low vibrations. We've been able to get the drone to hover in Loiter mode in the lab and can move it around from side to side. When we moved to a large fieldhouse, we started getting nans immediately after takeoff. This causes the drone to rocket into the air at full throttle. This happened several times in a row and eventually led to a crash.

Any ideas on how to resolve this? Our first thought was that it's probably an issue with excessive vibration (we've seen the videos where people shake the Realsense and the pose drifts rapidly in one direction), however this wouldn't explain why it worked in the lab but not the fieldhouse.

We're happy to provide any additional details, data, videos, etc. that may be helpful for resolving this

radfordi commented 4 years ago

Hi @partlygloudy, if you are using the latest release, then the cause is likely vibrations. We suggest mounting with a dampening material.

partlygloudy commented 4 years ago

So we've now tried a variety of dampening strategies including rubber mounts, various types of foam, and a wire-rope isolator. None of them have seemed to have any significant impact on the accelerations reported by the Realsense. We're now starting to suspect that some combination of prop wash and vibration coming through the USB cable may be thwarting all of our damping attempts.

We're at the point where things go well about 50-75% of the time (both in the lab and in the larger fieldhouse) but we still have many flights where the pose data is way off from the beginning. Interestingly, we've never had a flight where takeoff was successful and the pose started drifting mid-flight (granted we haven't been particularly aggressive on the successful flights.) Starting to run thin on ideas for how to proceed :/

msadowski commented 4 years ago

I'm testing the T265 (again) now and I'm running into the same issue with the camera being static. The way I reproduced it:

Here is the fisheye view while the camera is static:

image

I'm currently on the latest official release of the package on ROS Melodic (installed through apt-get).

It's a long shot but maybe it will somehow help someone from Intel in troubleshooting the issue (@RealSense-Customer-Engineering ?) but I have run into something similar about 6 years back: we were working on some drone autopilots back then and we have noticed that if we've left the autopilot static for long period of time (~6-24 hours) the EKF would start reporting NaN values. I don't exactly remember what the issue was back then or what was the fix but maybe it's a lead someone will find useful. I will also see if I can reproduce this again and grab a bag file.

msadowski commented 4 years ago

An update to my testing: It took quite a while to reproduce it but I managed to trigger the failure (and I have a bag file with the raw data minus the camera feed).

The way I reproduce it is exactly the same as I described in the previous comment.

Here are some highlights from the bagfile (shout out to PlotJuggler):

Accelerations (to give you an idea when I moved the camera):

image

As you can see most of the time the camera was rather static.

Here is the plot with the odometry (pose only):

image

Zoom in on Odometry and accelerations close to the time it breaks:

image image

You can see that even though the camera is static, the pose suddenly starts drifting and then reaches NaN around 3600 (that's a surprisingly round number). Looking at the odom covariance you can see how it explodes when the odom goes to NaN:

image
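The pattern in these plots (pose moving while the accelerometer reads static, shortly before NaN) suggests a coarse early-warning check that an application could run on its own side. The sketch below is a heuristic only: the function names and both thresholds (`tol`, `max_speed`) are hypothetical and would need tuning, and an accelerometer magnitude close to gravity cannot distinguish a resting device from one moving at constant velocity.

```python
import math
from typing import Sequence

GRAVITY = 9.81  # m/s^2, approximate magnitude of a resting accelerometer reading


def looks_static(accel: Sequence[float], tol: float = 0.3) -> bool:
    """True if the accelerometer magnitude is within tol of gravity."""
    magnitude = math.sqrt(sum(a * a for a in accel))
    return abs(magnitude - GRAVITY) < tol


def drifting_while_static(prev_pos: Sequence[float], curr_pos: Sequence[float],
                          dt: float, accel: Sequence[float],
                          max_speed: float = 0.05) -> bool:
    """Flag a suspicious sample: non-finite data, or pose motion without IMU motion."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    values = (*prev_pos, *curr_pos, *accel)
    if not all(math.isfinite(v) for v in values):
        return True  # NaN or inf already present
    speed = math.dist(prev_pos, curr_pos) / dt  # apparent speed in m/s
    return looks_static(accel) and speed > max_speed
```

A `True` result could be used to stop trusting the pose (for example, switch a vehicle to a fallback mode) before the values actually become NaN.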

Looking at the odom orientation quaternion it also goes loco around the time the odom sample goes into NaN (note that camera was not moving based on accelerometer values):

image

The view from the camera fisheye stream (didn't move the camera but I relaunched the driver to capture it): image

Does any of this information help in any way? If the bagfile would be helpful for troubleshooting this further then let me know within the next 5 days and I can upload it.


I made yet another bag file, this time keeping the camera fully static. Here are the results: Acceleration:

image

Position:

image

Orientation in degrees:

image

Zoom in on yaw_deg:

image

Zoom in on angular twist around z axis (yaw rate):

image

Some observations:

msadowski commented 4 years ago

Here is another observation: The issue doesn't seem to happen (from my limited testing) when inputting the wheel odometry. In my test I ran the T265 exactly like I did before, but in the ROS node I enabled odometry input and fed 0 position and 0 velocity at a fixed rate of 100Hz. I ran the test for 1 hour 45 minutes without any NaNs. Hopefully this means that when providing odom input the T265 shouldn't rain NaNs.

Unfortunately I don't think this will be helpful at all for anyone looking into using these units as handheld modules or with drones.

rgreid commented 4 years ago

@msadowski we've been debugging the NaN issue for a while now. For us, it occurs both with and without odom input. Note the sample images you provided above are quite lacking in visual features.

@RealSenseCustomerSupport are you still investigating the NaN issue? For us, recent firmware builds have been much more stable. We still get NaNs occasionally after the robot docks and other times when it stops moving for a few seconds.

RealSenseSupport commented 4 years ago

Thank you for highlighting the unfortunate continuation of the NaN issue while using T265. We have moved our focus to our next generation of products and consequently, we will not be addressing this issue in the T265.

wonwon0 commented 3 years ago

@RealSenseSupport i wish my credit card output a NaN when i paid for these sensors.

NJUSSJ commented 2 years ago

@partlygloudy Hi, Jake. I am wondering if your problem has been solved? I am now facing the same issue. My stack is PX4 + T265, and the T265 is mounted front-facing (the default). When the drone (a tailsitter) is placed on the ground and moved around by hand, the position data is normal and precise, but when I arm the drone and take off, it quickly loses control: its height rises suddenly (maybe this maps to your description of rocketing into the air) and it rolls to an angle. Simultaneously, I can see the NaN messages on my Raspberry Pi 3B+ crash my ROS node, and communication between the Raspberry Pi and PX4 stops. I am wondering if the vibration after takeoff is too large, since the data is normal when the drone is moved around by hand. Hoping for your reply, thanks! You can also contact me via email: fortune.shi@qq.com

harville commented 2 years ago

Hey Jake, Fortune, The Intel tracking camera team was working on fixes for the NaN problem in 2020, but then the project got shut down... damn shame. I think the T265 could have been a killer product, best seller in the RealSense product line (applications to robotics, AR/VR headsets, drones, and more) if properly supported by Intel RealSense. Unsurprisingly, all of Intel RealSense seems to be gone now. (I no longer work for Intel, don't know the details.) Anyway, a couple points that might help you:

  1. We observed that the NaN issue was often caused by long periods of no camera motion. This caused shrinking of some tracker state covariances to very small values, which resulted in instability in some of the state estimation compute operations. So if you can, try to not start your tracker too long before you actually start moving the drone or rocket.
  2. It often helps to do an "initialization motion" before using the tracking camera in some large or fast motion. The tracker needs to converge to a reasonable estimate of its state before it can be relied upon, or its estimate may "fly off" or do other bad things as you start to move it around more aggressively or extensively. I don't recall the exact recommendations for an initialization motion, but for example you could pick up the drone/rocket and put it back down again, not too rapidly, making sure to include a little rotation (e.g. 90 degrees and back again) in the process. Doing this prior to launch may result in better in-flight results.

These are not ideal solutions, of course, but maybe they can improve the issues you are seeing. I wonder what it would take to port the T265 over to this platform (which shares the same internal processor as T265), open source the code, and fix all the problems...
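The covariance shrinking described in point 1 can be illustrated with a textbook scalar Kalman filter. This is a toy model and not the T265 tracker: the function name, the noise values, and the `floor` parameter are assumptions chosen for illustration. With zero process noise and repeated measurements of a static state, the covariance follows p_n = 1 / (1/p0 + n/r), which keeps approaching zero; a small floor is one common way to stop that.

```python
def covariance_after_static_updates(steps: int, p0: float = 1.0, r: float = 1.0,
                                    q: float = 0.0, floor: float = 0.0) -> float:
    """Run `steps` predict/update cycles of a scalar Kalman filter on a static state.

    p0: initial covariance, r: measurement noise variance,
    q: process noise variance, floor: optional lower bound on the covariance.
    Returns the covariance after the last update.
    """
    p = p0
    for _ in range(steps):
        p = p + q            # predict: process noise inflates the covariance
        k = p / (p + r)      # Kalman gain for a direct measurement of the state
        p = (1.0 - k) * p    # update: each measurement shrinks the covariance
        p = max(p, floor)    # optional floor keeps p away from zero
    return p


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        print(n, covariance_after_static_updates(n))
    print("with floor:", covariance_after_static_updates(100_000, floor=1e-4))
```

In a full multi-state filter, covariances this small can make matrix operations ill-conditioned, which is consistent with the instability described above. Whether a floor or extra process noise would have been the right fix for the T265 firmware is not something this sketch can show.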
