IntelRealSense / librealsense

Intel® RealSense™ SDK
https://www.intelrealsense.com/
Apache License 2.0

T265 stops sending data #6362

Open dschnabel opened 4 years ago

dschnabel commented 4 years ago
Required Info
Camera Model                   T265 & D435
Firmware Version               T265: 0.2.0.926, D435: 05.12.03.00
Operating System & Version     Raspbian 10.3
Kernel Version (Linux Only)    4.19.97-v7l+ armv7l
Platform                       Raspberry Pi 4
SDK Version                    2.33.1
Language                       C++/ROS
Segment                        Robot

Issue Description

After upgrading from SDK 2.31.0 to 2.33.1 my T265 stops sending pose data when running for some time (1-5 min). This happens if the T265 is run together with the D435 and also if I run the T265 alone (less frequently).

This wasn't a problem in 2.31.0 but I was experiencing crashes in 2.31.0 so I decided to upgrade to the latest stable version (2.33.1).

In 2.33.1 I see no more crashes but I seem to be experiencing a very similar problem as described in https://github.com/IntelRealSense/librealsense/issues/5509#issuecomment-587422436.

I saw some discussion on missing serial numbers but in my case both cameras seem to get detected properly (see my ROS startup logs for T265 and D435).

When I monitor the odom topic from the T265, after a while I don't get any more pose (odom) data:

$ rostopic hz /odom

average rate: 200.380
    min: 0.000s max: 0.099s std dev: 0.01482s window: 50000
average rate: 200.380
    min: 0.000s max: 0.099s std dev: 0.01482s window: 50000
average rate: 200.380
    min: 0.000s max: 0.099s std dev: 0.01482s window: 50000
average rate: 200.315
    min: 0.000s max: 0.099s std dev: 0.01483s window: 50000
no new messages
no new messages
no new messages
no new messages
no new messages

For completeness here are my two ROS launch files:

Is there any fix for this?

dschnabel commented 4 years ago

@RealSenseCustomerSupport @RealSense-Customer-Engineering @MartyG-RealSense

Can you help us? Is this a known issue and if so, is there a workaround or proposed fix? Any additional information we can collect for you to look at?

This problem renders the T265 unusable for our project.

dschnabel commented 4 years ago

We did some more investigation and it looks like a firmware or a libusb issue.

This is how the pose/gyro/accel is being received from the T265 in librealsense:

  1. A new initial request is generated.
  2. This request gets forwarded to libusb.
  3. When a response arrives the interrupt_callback gets called.
  4. At the end of the callback function a new request is created.

Steps 2-4 are repeated. As long as the callback function _interrupt_callback gets called we receive data. But for some reason, occasionally the callback doesn't get called anymore, even if the request has been dispatched to libusb correctly without any errors. This is when the T265 stops sending pose/gyro/accel data.

We implemented a watchdog thread which monitors the callback and if the callback is not called anymore after some time we call stop_interrupt() followed by a start_interrupt(). We were hoping this would reinitialize the callback and cause the data to show up again. But this didn't help. Only a relaunch of the ROS node helped.
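A minimal sketch of the watchdog idea described above, using only the C++ standard library. `CallbackWatchdog`, `kick()` and `on_stall` are hypothetical names chosen for illustration; they are not part of librealsense, and the recovery action is left to the caller. It only shows how a stalled callback can be detected; as noted above, restarting the interrupt transfer did not bring the data back in our case.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical helper, not part of librealsense: a background thread checks
// how long ago the monitored callback last fired and runs `on_stall` when
// it has been silent for longer than `timeout`.
class CallbackWatchdog {
public:
    CallbackWatchdog(std::chrono::milliseconds timeout,
                     std::function<void()> on_stall)
        : timeout_(timeout),
          on_stall_(std::move(on_stall)),
          last_kick_ns_(now_ns()),
          running_(true),
          worker_([this] { run(); }) {}

    ~CallbackWatchdog() {
        running_ = false;
        worker_.join();
    }

    CallbackWatchdog(const CallbackWatchdog&) = delete;
    CallbackWatchdog& operator=(const CallbackWatchdog&) = delete;

    // Call this from inside the monitored callback every time it fires.
    void kick() { last_kick_ns_ = now_ns(); }

private:
    static long long now_ns() {
        return std::chrono::duration_cast<std::chrono::nanoseconds>(
                   std::chrono::steady_clock::now().time_since_epoch())
            .count();
    }

    void run() {
        const long long limit_ns =
            std::chrono::duration_cast<std::chrono::nanoseconds>(timeout_).count();
        while (running_) {
            // Poll a few times per timeout period.
            std::this_thread::sleep_for(timeout_ / 4);
            if (now_ns() - last_kick_ns_.load() > limit_ns) {
                on_stall_();               // placeholder for the recovery action
                last_kick_ns_ = now_ns();  // re-arm so it fires once per stall period
            }
        }
    }

    std::chrono::milliseconds timeout_;
    std::function<void()> on_stall_;
    std::atomic<long long> last_kick_ns_;
    std::atomic<bool> running_;
    std::thread worker_;  // declared last so it starts after the other members
};
```

In use, the callback would call `kick()` on every invocation, and `on_stall` would attempt whatever recovery is available. Choose a timeout well above the expected callback period (5 ms for a 200 Hz pose stream) so that ordinary scheduling jitter does not trigger it.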

Unfortunately we cannot go deeper since we don't have the firmware of the T265. But our guess is that the T265 just wouldn't respond anymore.

@RealSenseCustomerSupport @RealSense-Customer-Engineering @MartyG-RealSense could you please look into what could cause the callback function to stop being called? Are there logs I can enable in the T265 firmware?

dschnabel commented 4 years ago

We ran the ROS node in debug mode. These are the last few lines before T265 stops sending data:

[DEBUG] [1590375454.268119176]: Publish Pose stream
[DEBUG] [1590375454.272547826]: Frame arrived: stream: Gyro ; index: 0 ; Timestamp Domain: Global Time
[DEBUG] [1590375454.272984024]: Frame arrived: stream: Pose ; index: 0 ; Timestamp Domain: Global Time
[DEBUG] [1590375454.273135615]: Publish Pose stream
[DEBUG] [1590375454.273732108]: Frame arrived: stream: Accel ; index: 0 ; Timestamp Domain: Global Time
[DEBUG] [1590375454.278416105]: Frame arrived: stream: Gyro ; index: 0 ; Timestamp Domain: Global Time
[DEBUG] [1590375454.278897526]: Frame arrived: stream: Pose ; index: 0 ; Timestamp Domain: Global Time
[DEBUG] [1590375454.279104097]: Publish Pose stream
[DEBUG] [1590375454.286298009]: Frame arrived: stream: Gyro ; index: 0 ; Timestamp Domain: Global Time
[DEBUG] [1590375454.286856576]: Frame arrived: stream: Pose ; index: 0 ; Timestamp Domain: Global Time
[DEBUG] [1590375454.287164961]: Publish Pose stream
[DEBUG] [1590375454.287998580]: Frame arrived: stream: Gyro ; index: 0 ; Timestamp Domain: Global Time
[DEBUG] [1590375454.288404798]: Frame arrived: stream: Pose ; index: 0 ; Timestamp Domain: Global Time
[DEBUG] [1590375454.288710257]: Publish Pose stream
[DEBUG] [1590375454.289378008]: Frame arrived: stream: Accel ; index: 0 ; Timestamp Domain: Global Time
[DEBUG] [1590375454.292816706]: Frame arrived: stream: Gyro ; index: 0 ; Timestamp Domain: Global Time

After the last line there's no more output.

csr-kick commented 4 years ago

I have also been seeing this exact issue, with the same setup. RPI4 with same OS, SDK version, with T265 and D435i. Happens with both ROS and simple C++ test scripts.

Data comes in from the T265 (using poll_for_frames, try_wait_for_frames, or wait_for_frames) until suddenly the frames just stop. Relaunching whichever program will get data from the device(s) just fine for another short period of time (but I do call hardware_reset() on the T265 as part of my code). The only data stream I am getting from the T265 is pose via: enable_stream(RS2_STREAM_POSE, RS2_FORMAT_6DOF);

Other Observations:

Personally have not had any success with pipe.stop() and pipe.start(). I tried to see if there was a usb disconnection or such that would account for this issue, but the log file attached does not show any events to support that. It really feels like there is something that crashes or hangs on the T265.

This problem is currently a showstopper for me, which is unfortunate since during the windows when it works the sensor is pretty great. I might try rolling back to 2.31, but based on what I read here and the release notes that may just be trading one set of problems for another. Please help.

udev log.txt

csr-kick commented 4 years ago

I switched to the development branch (2.35). The D435i is not recognized at all; the T265 seemed to work better but still crashes. The D435i shows up properly as a USB device:

Bus 002 Device 004: ID 8086:0b3a Intel Corp.
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 012: ID 8087:0b37 Intel Corp.
Bus 001 Device 004: ID 2109:3431 VIA Labs, Inc. Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

but the SDK refuses to detect it. The rs-enumerate-devices example does not see it, nor do tools like the fw-logger/fw-update.

As a side note, both sensors have always worked when plugged into a windows computer.

csr-kick commented 4 years ago

Using latest development branch at commit 2a79d31 merged by @ev-mp

Compiled and ran the rs-pose example with the line rs2::log_to_console(RS2_LOG_SEVERITY_INFO);

Attached is the log, failure after about 60 seconds of running. The message received after failure is:

02/06 01:04:05,676 INFO [2949632528] (tm-device.cpp:1408) T265 FW message: 13042424847758289392: [0x/2:721] Host not reading - stopping

Pose_Info_log_output.txt

Anyone have any ideas?

edit: Using RS2_LOG_SEVERITY_DEBUG we get this after the data stops:

02/06 01:23:56,838 DEBUG [2941239824] (tm-device.cpp:1473) T265 time synced, host_ns: 1591055918197446912
 02/06 01:23:56,848 DEBUG [2949632528] (tm-device.cpp:2031) Sending message DEV_GET_AND_CLEAR_EVENT_LOG length 6
 02/06 01:23:56,849 DEBUG [2949632528] (tm-device.cpp:2047) Receiving message with max_response_size 32776
 02/06 01:23:56,849 DEBUG [2949632528] (tm-device.cpp:2062) Received DEV_GET_AND_CLEAR_EVENT_LOG with length 8
 02/06 01:23:56,949 DEBUG [2949632528] (tm-device.cpp:2031) Sending message DEV_GET_AND_CLEAR_EVENT_LOG length 6
 02/06 01:23:56,949 DEBUG [2949632528] (tm-device.cpp:2047) Receiving message with max_response_size 32776
 02/06 01:23:56,950 DEBUG [2949632528] (tm-device.cpp:2062) Received DEV_GET_AND_CLEAR_EVENT_LOG with length 8
 02/06 01:23:57,050 DEBUG [2949632528] (tm-device.cpp:2031) Sending message DEV_GET_AND_CLEAR_EVENT_LOG length 6
 02/06 01:23:57,050 DEBUG [2949632528] (tm-device.cpp:2047) Receiving message with max_response_size 32776
 02/06 01:23:57,050 DEBUG [2949632528] (tm-device.cpp:2062) Received DEV_GET_AND_CLEAR_EVENT_LOG with length 8
 02/06 01:23:57,151 DEBUG [2949632528] (tm-device.cpp:2031) Sending message DEV_GET_AND_CLEAR_EVENT_LOG length 6
 02/06 01:23:57,151 DEBUG [2949632528] (tm-device.cpp:2047) Receiving message with max_response_size 32776
 02/06 01:23:57,151 DEBUG [2949632528] (tm-device.cpp:2062) Received DEV_GET_AND_CLEAR_EVENT_LOG with length 8
 02/06 01:23:57,251 DEBUG [2949632528] (tm-device.cpp:2031) Sending message DEV_GET_AND_CLEAR_EVENT_LOG length 6
 02/06 01:23:57,252 DEBUG [2949632528] (tm-device.cpp:2047) Receiving message with max_response_size 32776
 02/06 01:23:57,252 DEBUG [2949632528] (tm-device.cpp:2062) Received DEV_GET_AND_CLEAR_EVENT_LOG with length 8
 02/06 01:23:57,339 DEBUG [2941239824] (tm-device.cpp:2031) Sending message DEV_GET_TIME length 6
 02/06 01:23:57,339 DEBUG [2941239824] (tm-device.cpp:2047) Receiving message with max_response_size 16
 02/06 01:23:57,339 DEBUG [2941239824] (tm-device.cpp:2062) Received DEV_GET_TIME with length 16
 02/06 01:23:57,340 DEBUG [2941239824] (tm-device.cpp:1473) T265 time synced, host_ns: 1591055918197446912

Also, probably not important but when running at this debug level properly we get a flood of: 02/06 01:16:48,408 DEBUG [2958025232] (frame-archive.h:155) Frame Callback [Pose#1563] overdue. (Duration: 5.015381ms, FPS: 200, Max Duration: 4ms)

Max duration is calculated as 1000 / (frame->get_stream()->get_framerate() + 1). Is this correct? It feels like it should be more like (1000.f / frame->get_stream()->get_framerate()) + 1, which gives a max of 6ms rather than 4ms on a message that comes in at an average of 5ms.
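To make the arithmetic concrete, here is a small sketch comparing the two formulas. It assumes the frame rate is an integer, so the first expression uses integer division; the function names are hypothetical and are not taken from frame-archive.h.

```cpp
// Threshold as quoted above: 1000 / (fps + 1), evaluated with integer
// division (assumption: the frame rate is an integer type).
int max_duration_current(int fps) { return 1000 / (fps + 1); }

// Proposed alternative: one frame period in milliseconds plus 1 ms of slack.
float max_duration_proposed(int fps) { return 1000.f / fps + 1; }

// A callback counts as overdue when it takes longer than the threshold.
bool is_overdue(double duration_ms, double max_ms) { return duration_ms > max_ms; }
```

For a 200 FPS pose stream the first formula gives 1000 / 201 = 4 (4.975 truncated), so a callback taking 5.015 ms is reported as overdue. The second gives 5 + 1 = 6, so the same callback would not be.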

There is also this:

CallbackFinished,Pose,0,DispatchedAt,1591070836901.750732
 02/06 05:07:16,901 DEBUG [3070188800] (frame-archive.h:155) Frame Callback [Pose#0] overdue. (Duration: 1591070836901.750732ms, FPS: 200, Max Duration: 4ms)

Pose #0 is always overdue, every single cycle, with duration being equal to the DispatchedAt time.

dschnabel commented 4 years ago

Glad to hear I'm not the only one with this problem, @csr-kick. When you tried with the latest dev branch and you got the Host not reading - stopping error, did you have the D435 run in parallel or only the T265?

And is it still true that a pipe.stop()/pipe.start() won't fix it once you get into that state?

csr-kick commented 4 years ago

I can replicate the failure without the D435i plugged in, but having it running does seem to make the failure happen faster. The data I am collecting now is as bare as possible: running the rs-pose example with only the T265 plugged into the RPi4.

edit:

When I do variations on pipe.stop() and pipe.start() I am not getting any success.

Device Position: 0.142 -0.010 0.104 (meters)
Device Position: 0.142 -0.010 0.104 (meters) 
02/06 01:59:31,648 INFO [2949632528] (tm-device.cpp:1408) T265 FW message: 13042428173785044114: [0x/2:721] Host not reading - stopping
!*****restarting pipe*****!
 02/06 01:59:32,885 INFO [3070188800] (tm-info.cpp:53) Picked 1/4 devices
 02/06 01:59:32,899 ERROR [3070188800] (handle-libusb.h:95) failed to claim usb interface: 0, error: RS2_USB_STATUS_BUSY
 02/06 01:59:32,899 ERROR [3070188800] (types.h:307) Unable to open device interface

csr-kick commented 4 years ago

@MartyG-RealSense @dorodnic Any ideas/help on this issue? The 'Host not reading' error makes me think the SDK stops processing messages, or maybe stops requesting them in the first place, so the sensor stops as well.

dschnabel commented 4 years ago

@csr-kick I was just thinking, what if you create a new pipe after stopping the old one? This should release any resource that the old pipe holds:

pipe->stop();
pipe = std::make_shared<rs2::pipeline>();
pipe->start();

I'm not home right now so I can't test this myself.


Edit: Ok so I tested this and it's working! I can recreate the pipe and am getting data again from the T265. This is not ideal but it's a workaround I guess until the root cause has been fixed.

Here's the code that worked for me

rs2::config cfg;
cfg.enable_stream(RS2_STREAM_POSE, RS2_FORMAT_6DOF);
auto pipe = std::make_shared<rs2::pipeline>();
pipe->start(cfg);

while (!ros::isShuttingDown()) {
    try {
        auto frames = pipe->wait_for_frames();
        // do something with the frames
    } catch (const rs2::error & e) {
        std::cout << "T265 crashed. Restarting ..." << std::endl;
        pipe->stop();
        pipe = std::make_shared<rs2::pipeline>();
        pipe->start(cfg);
        std::cout << "... restarted." << std::endl;
    }
}

@csr-kick can you give this a try?

csr-kick commented 4 years ago

Using your code snippet does recover, but I want to try it a few times with the D435i also running. However, this isn't a viable solution beyond basic prototyping.

  1. It takes 15 seconds for that error to be thrown when using pipe.wait_for_frames. Maybe one of the other frame wait methods will be able to fail faster, but any real delay will be a problem for a mobile robot.
  2. When it resets, you're back at 0,0,0, meaning you would maybe have to track your actual last pose and then offset after each reset?
  3. How will exporting and importing the localization maps play with this? If you are far from the origin will they actually be able to re-localize? Do you then drop your manual offset after?

It really seems like somewhere in the SDK it stops processing the pose data coming from the sensor. The sensor sees the messages backing up and it stops sending to avoid any issues. The bulk data transfers and time syncing continue to work, which tells me that it is specific to whatever protocol the pose message uses. We also know that none of this happens if you plug into a Windows computer and run the realsense viewer app, so it's probably low in the code stack.

dschnabel commented 4 years ago

I agree that this is not a fix and ultimately we need something better than restarting the pipe. I think the workaround can work fairly well though if configured properly but we need to do some more testing.

To your questions:

Re 1: pipe.wait_for_frames() takes a timeout in milliseconds as an optional parameter (see here). So you can override the default 15s timeout with whatever makes sense to you.

Re 2: I might be wrong on this, but the way I understand it, when you reset the pipe you are only reconnecting to the T265; you're not hard resetting the camera. In order to hard reset the T265 you'd have to call hardware_reset(). That means your camera will keep the last pose even after you've reset the pipe.

Re 3: Since the T265 didn't hard reset you shouldn't need to re-import the localization map.

Would be nice to have someone from Realsense chime in on this to make sure I'm not mistaken.

csr-kick commented 4 years ago

Thanks, forgot about the timeout. Make sure that you set it initially to a high value to allow the sensor to boot up or whatever before dropping the timeout down for normal operations. Then in the catch statement put it back up. My current testing code:

rs2::log_to_console(RS2_LOG_SEVERITY_DEBUG); //verbose, can make this _INFO for less

rs2::config cfg;
cfg.enable_stream(RS2_STREAM_POSE, RS2_FORMAT_6DOF);
auto pipe = std::make_shared<rs2::pipeline>();
pipe->start(cfg);
unsigned int to = 20000;
while (true)
{
    try {

        auto frames = pipe->wait_for_frames(to);
        to = 40;
        auto f = frames.first_or_default(RS2_STREAM_POSE);
        // Cast the frame to pose_frame and get its data
        auto pose_data = f.as<rs2::pose_frame>().get_pose_data();

        // Print the x, y, z values of the translation, relative to initial position
        std::cout << "\r" << "Device Position: " << std::setprecision(3) << std::fixed << pose_data.translation.x << " " <<
            pose_data.translation.y << " " << pose_data.translation.z << " (meters)" << std::endl;
    }
    catch (const rs2::error& e) {
        to = 20000;
        std::cout << "T265 crashed. Restarting ..." << std::endl;
        pipe->stop();
        pipe = std::make_shared<rs2::pipeline>();
        pipe->start(cfg);
        std::cout << "... restarted." << std::endl;
    }
}

My testing shows that (without loading any maps right now) localization does go back to 0,0,0. In extended testing this has thrown a segfault after quite a while, typically when coming back up. Seems like a pose message from before the pipe.stop comes back after we restart and it doesn't like that.

@RealSenseSupport can we get any sort of statement about this problem?

csr-kick commented 4 years ago

Still working to see if there is some way to actually make the T265 work with the RPI.

I have tried setting up ubuntu 18 on the rpi, and same problem, T265 will eventually stop responding, and it happens in a matter of seconds with the D435 sending depth at 640x360x30fps.

@MartyG-RealSense please can we get some sort of response from intel on this issue.

MoBaT commented 4 years ago

Just wanted to put out there that I'm having the exact same problem and am also restarting the pipeline on failure. I'm on v2.34.1. Would love a fix to this @MartyG-RealSense

m4xr1sk commented 4 years ago

Same problem here with SDK v2.35.2. I would also like to have a fix, because otherwise this product is not usable at all.

alanypf commented 3 years ago

Hi all, I have the same problem. I am using Ubuntu 18.04 server. Librealsense is built from the source with the LTS kernel patch. The data link broke after a minute or so. The problem only exists on raspberry pi4 for me since if I plug the same SD card into a raspberry pi 3b+, the data links remain healthy. And on raspberry pi 4, neither USB2 nor USB3 ports work. I hope this helps the developer team.

Thanks.

RealSenseSupport commented 3 years ago

Thanks for highlighting this. At this time we have moved our focus to our next generation of products and consequentially will not be addressing this T265 issue.

dschnabel commented 3 years ago

@RealSenseSupport are you dropping support for the T265? I don't see any mention on https://www.intelrealsense.com/tracking-camera-t265

ArkadiuszNiemiec commented 3 years ago

Same issue here. I did not find any information about the T265 not being supported.

Twentystudios commented 3 years ago

If T265 won’t be supported anymore (just bought one), what is the direct replacement for it?

yanboli commented 3 years ago

Hi guys,

I encountered a very similar issue. On Windows 10, the T265 stops responding inside the wait-for-frames call: the function never returns and the CPU usage drops to zero. Only a restart of the pipeline wakes the T265 up again. The problem appears at random, anywhere from 60 to 800 seconds in.

But I may have found the solution. The root cause is the "cout" or "printf" function. I cannot explain it, but it works: remove all of your "cout" and "printf" output to the console and the problem is gone. It solved my problem, and it may solve yours too.

My problem is due to the console window of Windows 10. Its default mode is Quick Edit, in which the console pauses the program if you select any text on the console screen.

Have fun,

csr-kick commented 3 years ago

Appreciate the thought, but it's a bit different for different hardware. From my experience the root issue with the T265 is down to timing. There is a mismatch between the computer's USB messaging and the device's where the computer polls from the device, but the device doesn't notice. Then, the computer sits around waiting for a message that will never arrive, until it times out. Further supporting evidence is that bulk messages will still pass between the device and computer while it waits for the interrupt to return. I would imagine that using cout or printf for you is actually changing those timings, but I can confirm the problem persists on the rpi regardless. There are a few things you can do knowing this to improve results, but I have not been able to eliminate its occurrence, since we only have access to one side of the code.

Sadly, this looks to be one of those problems that wouldn't show up in the limited testing they would have put the T265 through before putting it on sale, and if we had access to the firmware it could probably be solved by someone knowledgeable in the USB protocol. Let's hope this bug doesn't carry forward into future products.

RealSenseSupport commented 3 years ago

Hi @dschnabel

Curious if this pre-release build resolves your issue.

https://github.com/IntelRealSense/librealsense/releases/tag/v2.44.0

Particularly because it integrates this PR https://github.com/IntelRealSense/librealsense/pull/8561 that addresses a T265 specific race condition.

Thanks

dschnabel commented 3 years ago

Thanks @RealSenseSupport

As far as I can tell PR #8561 addresses cases where the pipeline stop procedure gets stuck. But the problem I have is that the T265 becomes unresponsive during normal operation even without pipeline.stop() being called.

So the PR seems unrelated to this issue. Or am I missing something?

Based on my investigation in https://github.com/IntelRealSense/librealsense/issues/6362#issuecomment-630577088 the problem seems to be deeper than the librealsense library, possibly a firmware or a libusb issue.

Twentystudios commented 3 years ago

@RealSenseSupport I would really appreciate some clarification regarding Intel dropping support for the T265. Is this really the case? It was released just two years ago and is still being sold. Will there be a direct replacement for it?

DevepNoName commented 3 years ago

https://github.com/IntelRealSense/librealsense/issues/8779

The RealSense team should make this information very noticeable and very clear (no support for ARM devices), so people don't lose their time and money ...

nixinator commented 3 years ago

I'm using a pi4 with the T265 and having the exact same problem. I'm reading the threads and multiple issues here... Intel is not doing fixes for or supporting the T265. I think this is a 'we cannot fix it'. If it's not a problem with the software/hardware stack on the pi4, or the actual driver in the RealSense SDK, then it will require a firmware fix from Intel.

We've got the T265 working fine on the Nvidia Xavier board, so I'm going to see what software differences there are between the two systems. It may turn out this is something more subtle with the Pi 4's USB?

But, come on intel, the PI4 for robot development, don't leave us high and dry for 'newer expensive products'.

Alas, many things that claim to be 'open source' are not really open source at all, because of large binary blobs which, without extensive reverse engineering, cannot be fixed without intervention from humans that have access to build firmware. I can see why the kernel reports these blobs and unfree drivers as 'tainted'.

silverjoda commented 2 years ago

Hello, I had the same problem with my Pi4 and T265. What fixed it for me was going back to a 26 December 2019 GitHub commit, 936a22bf1ca8231397dbedf2628cdd8b61c83a35, version: SDK 2.31. With this older SDK version the T265 doesn't stop after a few minutes. I found out about this after digging for hours and seeing a mention by one of the devs to roll back.

In any case, I don't suggest wasting time with this sensor. It drifts and fails quite often. This would be fine if you could access internal information and tune it to your use case, but the algorithm is entirely closed, you don't know why it fails and there is nothing you can do about it. It's more or less a brick at this stage.

nixinator commented 2 years ago

intel + closed!

Business as usual.

sigh.

Construkted-Reality commented 2 years ago

@silverjoda what alternatives to the T265 do you know of? We don't have much choice which is why we are still trying to figure this out.

silverjoda commented 2 years ago

To be honest I don't know. I want to try ORB-SLAM or some other monocular variant when I get the time.

zhouzhiwen2000 commented 2 years ago

That requires quite a bit of CPU power.

siddharthcb commented 2 years ago

@dschnabel same issue here. Sad to see this issue still floating around even after a year. @RealSenseSupport There is no point selling the products if you cannot take responsibility for fixing issues. At least putting in an effort to solve it would be deeply appreciated.

nixinator commented 2 years ago

Yep, open source can be a bit of a finger-pointing exercise: your firmware is incorrect! No, the RPi USB drivers are wrong! No, the Debian package for the USB subsystem is in error! No, a kernel regression caused the problem! No, the hardware has a bug! And everyone points at everyone else. 'Upgrade to the latest, that will fix it!' Sigh. So it's not as simple as that; I think if it was easy to fix, 'they' would. However, contacting 'they', i.e. not a department but a 'person' who is responsible, is probably impossible. I'm not sure if that's by design or on purpose.

I could probably get to the bottom of this, but I'm not going to do it for free, and if you factor in the cost of reverse engineering and getting into the weeds of all the subsystems involved, then it's probably just cheaper to upgrade to the latest and greatest piece of hardware, and take the T265 as a hobbyist prototype release that is restricted to a reduced number of hardware platforms and operating system versions.

we've decided to use other ways to localize the robot for now. If someone wants to offer a bounty to get this working on the rpi, then i am open to that and so will others here. Please DM me.

silverjoda commented 2 years ago

So what other way did you decide to localize the robot?:) On our big robots we use Ouster lidars with ALOAM and other mappers which works fine, but for small prototype robots I can't use any of those because there is no space or budget to put a full system.

yanboli commented 2 years ago

Intel Says It’s Shuttering RealSense Camera Business. I guess there is no hope for T265 any more. Find another way ASAP.

https://www.crn.com/news/components-peripherals/intel-says-it-s-shuttering-realsense-camera-business

Construkted-Reality commented 2 years ago

What are the alternatives to affordable computer vision and depth camera systems like the ones from Intel?

MartyG-RealSense commented 2 years ago

I am now able to provide information about the future of RealSense, as an 'End of Life' notice has been released by Intel. Although some RealSense models are being discontinued, the 400 Series stereo depth cameras are continuing. The official PDF of the End of Life notice explaining the changes is attached below.

PCN118463-00.pdf

dschnabel commented 2 years ago

Thank you for this update @MartyG-RealSense.

As the T265 will be de-supported, can at least the source code of the firmware be made public? Even if Intel isn't interested in fixing this bug, many others in the community will be eager and capable of tackling this problem if we just had access to the source code.

MartyG-RealSense commented 2 years ago

Hi @dschnabel Whilst the RealSense SDK is open-source and developers are free to 'fork' it to create their own custom versions that they host on their own GitHub, Intel do not open-source RealSense's firmwares or camera algorithms.

nixinator commented 2 years ago

I think you just highlighted the problem... maybe I can ask about the 'do not', as in the reason why?

MartyG-RealSense commented 2 years ago

@nixinator A reason given in the past has been that such resources are Intel's intellectual property.

nixinator commented 2 years ago

root cause analysis...complete!

zhouzhiwen2000 commented 2 years ago

If you can't release the source code, can you solve this issue before dropping support? Perhaps you could release part of the source code?

MartyG-RealSense commented 2 years ago

@zhouzhiwen2000 Development work for the T265 has ended so no further software or firmware updates can be provided for it.

csr-kick commented 2 years ago

This is why no one will ever trust the depth sensors that are still around. Companies cannot base their products on a device where the creator has no 'anchor' product or vested interest in its success and support. If you have an issue and wake up the next day to find that the supplier has decided to simply walk away rather than do the right thing for their customers, your existence is at risk while theirs is not.

HanDaSeul commented 2 years ago

I think I found the solution to the T265 suddenly stopping on an RPi running ROS (my RPi is the 8GB model). The problem is neither the T265 nor the software (SDK, ROS, whatever). It's the lack of power output on the RPi's USB 3.0 ports. I guess most of you plug at least two devices into the RPi USB ports, especially a USB wifi adapter. In theory the power consumption should not be a problem, but it is. I think the power draw gets too high, so one of the devices disconnects. If you are using the RPi's onboard wifi chip instead, you need to account for that as well.

Here's how I made it work: you need one powered USB hub. Some cheap powered USB hubs still have problems; in my case I used a 'ugreen usb 3.0 hub Model name: cr113' with external power. Plug all of your USB devices into that hub, except the T265. Only the T265 is connected to the RPi directly. I guess you could also plug the T265 into the hub, but there is another problem: after boot you have to unplug and replug it before it works. My T265 is connected directly to the RPi and I use 'uhubctl', so I don't need to unplug it for detection. My RPi's OS runs gnome-session rather than a full ubuntu-desktop install. The RealSense SDK version is 2.48, not 2.49.

I tested how long the RPi can run without losing the connection in both ROS and ROS2. In the ROS2 test, the RPi's USB ports held the T265 and the USB hub; the hub held the USB wifi adapter and an RPLidar A1, and drew 5W from external power. In the ROS test I didn't use the lidar (I forgot it) and the hub drew 2.5W. I didn't try streaming cameras in either test. The result: both are fine! In ROS it was still connected after 50 minutes, and in ROS2 I disconnected it myself after 30 minutes. In ROS2 there's a little bit of lag, but I don't think that's related to the power.

One more thing: make sure to keep the T265 cool. When the T265 is hot, it seemed to lose the connection. Hope it helps you guys!

[Photo 20210912_175145: power consumption and connected devices while testing in ROS2]

[Screenshot from 2021-09-10 17-11-45: testing in ROS, showing the elapsed time]

HanDaSeul commented 2 years ago

As for the problem of having to unplug and replug after the first boot: why doesn't the T265 power itself off and on again when the node is launched? For example, my RPLidar A1 keeps rotating, and when I launch its node in ROS it stops rotating for 1~2 seconds and then starts again. That works even with the lidar connected to the USB hub.

silverjoda commented 2 years ago

After plugging in and out about 200 times on the Rpi, I found out that you can use Uhubctl to automatically power cycle all the USB ports after boot, so that the t265 is recognized all the time.