Closed liangfok closed 3 years ago
I couldn't reproduce it yet. I will dig deeper.
I am also unable to reproduce the problem. Using the same machine as in the original post, I was able to load TShapeRoad.xodr:
Since I was unable to reproduce the problem, I will close this ticket for now.
Thanks for the update. Just wondering why the text isn't rendered as expected; the font size doesn't seem right. Probably related to https://github.com/ToyotaResearchInstitute/delphyne-gui/issues/407
Are you referring to the text in the buttons on the right side of the screenshot? If so, I agree that is an issue.
Just encountered this when loading maliput_malidrive/maliput_malidrive/resources/Figure8.xodr.
./install/delphyne_gui/bin/maliput_viewer2.sh --malidrive_backend=malidrive2
After specifying Figure8.xodr, the screen at first froze:
Then, after a while, the map appeared.
However, once I tried to interact by rotating the view, it crashed:
terminate called after throwing an instance of 'Ogre::InternalErrorException'
what(): OGRE EXCEPTION(7:InternalErrorException): Vertex Buffer: Out of memory in GLHardwareVertexBuffer::lock at /build/ogre-1.9-B6QkmW/ogre-1.9-1.9.0+dfsg1/RenderSystems/GL/src/OgreGLHardwareVertexBuffer.cpp (line 124)
./install/delphyne_gui/bin/maliput_viewer2.sh: line 36: 21538 Aborted (core dumped) visualizer --layout=${DELPHYNE_GUI_RESOURCE_ROOT}/layouts/layout2_maliput_viewer.config "$@"
I confirm I can reproduce the bug.
This error suggests that the video card simply runs out of memory (VRAM). In my case, I am not able to reproduce it because I am not using a PC with a dedicated GPU, and it seems that memory is managed differently there.
I can take a look at how we are rendering things in the scene and try to identify if there is a more proper way to do this. (Probably checking how ignition performs this).
Is there a chance that you @scpeters have seen this kind of issue before?
I haven't seen this before
For reference, here are the specs of the video card I am using:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.119.03 Driver Version: 450.119.03 CUDA Version: 11.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce 940MX Off | 00000000:02:00.0 Off | N/A |
| N/A 65C P0 N/A / N/A | 757MiB / 2004MiB | 52% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 2330 G /usr/lib/xorg/Xorg 407MiB |
| 0 N/A N/A 2510 G /usr/bin/gnome-shell 128MiB |
| 0 N/A N/A 15247 G /usr/lib/firefox/firefox 0MiB |
| 0 N/A N/A 19849 G ...AAAAAAAAA= --shared-files 210MiB |
| 0 N/A N/A 20107 G /usr/lib/firefox/firefox 0MiB |
| 0 N/A N/A 20114 G /usr/lib/firefox/firefox 0MiB |
+-----------------------------------------------------------------------------+
Looks like it has 2GB of RAM.
In my case:
$ nvidia-smi
Thu May 27 16:33:37 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.130 Driver Version: 384.130 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 1060 Off | 00000000:01:00.0 Off | N/A |
| N/A 63C P3 19W / N/A | 1550MiB / 6072MiB | 8% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1305 G /usr/lib/xorg/Xorg 836MiB |
| 0 2608 G compiz 169MiB |
| 0 2940 G ...-token=50F682ABB3395D5372029391B31E4C00 101MiB |
| 0 20782 G ...-token=D422EDBD02F49F043FC6084E69295963 80MiB |
| 0 29007 G ...AAgAAAAAAAAACAAAAAAAAAA= --shared-files 359MiB |
+-----------------------------------------------------------------------------+
I ran into this bug trying to analyze straight_forward.xodr, which can be downloaded via the link below.
To reproduce, open the viewer by executing the command below, then loading straight_forward.xodr:
./install/delphyne_gui/bin/maliput_viewer2.sh --malidrive_backend=malidrive2
The following viewer shows up:
Left-click on a lane. The viewer will immediately crash with the following error:
terminate called after throwing an instance of 'Ogre::InternalErrorException'
what(): OGRE EXCEPTION(7:InternalErrorException): Vertex Buffer: Out of memory in GLHardwareVertexBuffer::lock at /build/ogre-1.9-B6QkmW/ogre-1.9-1.9.0+dfsg1/RenderSystems/GL/src/OgreGLHardwareVertexBuffer.cpp (line 124)
/home/liang/dev/maliput_ws/install/delphyne_gui/bin/maliput_viewer2.sh: line 36: 28035 Aborted (core dumped) visualizer --layout=${DELPHYNE_GUI_RESOURCE_ROOT}/layouts/layout2_maliput_viewer.config "$@"
It is 100% reproducible. The machine has relatively beefy NVIDIA GPUs:
$ nvidia-smi
Tue Jun 8 20:06:18 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.50 Driver Version: 430.50 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 TITAN X (Pascal) Off | 00000000:03:00.0 On | N/A |
| 40% 68C P0 68W / 250W | 1697MiB / 12194MiB | 12% Default |
+-------------------------------+----------------------+----------------------+
| 1 TITAN X (Pascal) Off | 00000000:81:00.0 Off | N/A |
| 23% 37C P8 9W / 250W | 12MiB / 12196MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 3715 G /usr/lib/xorg/Xorg 72MiB |
| 0 4207 G /usr/bin/gnome-shell 114MiB |
| 0 16999 G /usr/lib/xorg/Xorg 678MiB |
| 0 17131 G /usr/bin/gnome-shell 489MiB |
| 0 32083 G ...AAgAAAAAAAAACAAAAAAAAAA= --shared-files 315MiB |
+-----------------------------------------------------------------------------+
The above error also happens if I use Malidrive 1.0:
maliput_viewer2.sh --malidrive_backend=opendrive_sdk
I can middle-click and drag to rotate the visualization, and scroll the mouse to zoom in/out. It only crashes when I left-click on a lane.
I confirm it also happens with maliput_multilane.
Checking this issue I noticed what I logged here: https://github.com/ToyotaResearchInstitute/delphyne_gui/issues/407#issuecomment-857640495
I suspect there is a misconfiguration of the NVIDIA drivers. A Google search for Ogre exceptions like this one points in that direction as well.
I tried the export LIBGL_ALWAYS_SOFTWARE=1 technique described here but that did not work. Same error when I click on a lane:
terminate called after throwing an instance of 'Ogre::InternalErrorException'
what(): OGRE EXCEPTION(7:InternalErrorException): Vertex Buffer: Out of memory in GLHardwareVertexBuffer::lock at /build/ogre-1.9-B6QkmW/ogre-1.9-1.9.0+dfsg1/RenderSystems/GL/src/OgreGLHardwareVertexBuffer.cpp (line 124)
/home/liang/dev/maliput_ws/install/delphyne_gui/bin/maliput_viewer2.sh: line 36: 26980 Aborted (core dumped) visualizer --layout=${DELPHYNE_GUI_RESOURCE_ROOT}/layouts/layout2_maliput_viewer.config "$@"
I did try it this morning and got the same result as you.
This post indicates that rviz has a '-l' command line option that results in an Ogre.log file, which may contain useful information when Ogre 3D crashes. I wonder if we could get a similar log from Delphyne GUI's visualizer.
I did check the ogre log (see inside the docker container at /home/username/.ignition/ogre.log) but could not find anything relevant.
I found an error in the NVIDIA dockerfile: https://github.com/ToyotaResearchInstitute/maliput_infrastructure/pull/210/files#diff-2cbc4ed79e6f64d9c41f1a01b9600e73c3fc34f9b735bdd4826703ade62e17acL48-L49 and https://github.com/ToyotaResearchInstitute/maliput_infrastructure/blob/main/docker/Dockerfile.nvidia#L20 . I will send a PR to fix it, but it did not fix the bug here.
Thanks, I also checked my ~/.ignition/rendering/ogre.log and found nothing useful -- it just logged the same exception.
Updated title of ticket to reflect the fact that it happens using multiple backends.
I have built a workspace with ign libraries from sources and I have downgraded ign-gui3 from 3.5.1 to 3.5.0. Same result.
Running in bionic.
BTW, it's important to point out that the error happens when running the RayTracing. I can move the view around, zoom in and out but when I click on the mesh it fails.
I have also checked ign-rendering 3.4.0 without any luck. Note that ign-rendering 3.5.0 was released on Tue May 25.
The non-nvidia docker image does not work out of the box. ign-gui / rendering fails with glx errors.
Next step: try ign samples to verify the same behavior.
I checked it and it is working just fine with default libraries from debian packages. I will start removing stuff from the maliput_viewer to identify what is causing the error, because there is now no reason to think the cause lies elsewhere.
I would try disabling the part of the code where we do ray tracing to get the selected lane, to see if it is related to that.
I ran the application with gdb and found the following:
Thread 1 "visualizer" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) backtrace
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0x00007ffff5c9d921 in __GI_abort () at abort.c:79
#2 0x00007ffff62f2957 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007ffff62f8ae6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007ffff62f8b21 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007ffff62f8d54 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007fffbe51aaca in Ogre::GLHardwareVertexBuffer::lockImpl(unsigned long, unsigned long, Ogre::HardwareBuffer::LockOptions) ()
from /usr/lib/x86_64-linux-gnu/OGRE-1.9.0/RenderSystem_GL.so
#7 0x00007fffbe516236 in ?? () from /usr/lib/x86_64-linux-gnu/OGRE-1.9.0/RenderSystem_GL.so
#8 0x00007fffccfd9cb2 in ignition::rendering::v3::OgreRayQuery::MeshInformation(Ogre::Mesh const*, unsigned long&, Ogre::Vector3*&, unsigned long&, unsigned long*&,
ignition::math::v6::Vector3<double> const&, ignition::math::v6::Quaternion<double> const&, ignition::math::v6::Vector3<double> const&) () from /usr/lib/x86_64-linux-gnu/ign-rendering-3/engine-plugins/libignition-rendering-ogre.so
#9 0x00007fffccfda979 in ignition::rendering::v3::OgreRayQuery::ClosestPoint() ()
from /usr/lib/x86_64-linux-gnu/ign-rendering-3/engine-plugins/libignition-rendering-ogre.so
#10 0x00007fffbc0b9174 in delphyne::gui::MaliputViewerPlugin::ScreenToScene (this=this@entry=0x5555564329d0, _screenX=468,
_screenY=<optimized out>)
at /home/agalbachicar/maliput_ws/src/delphyne_gui/delphyne_gui/visualizer/maliput_viewer_plugin/maliput_viewer_plugin.cc:723
#11 0x00007fffbc0c205a in delphyne::gui::MaliputViewerPlugin::MouseClickHandler (this=0x5555564329d0,
_mouseEvent=_mouseEvent@entry=0x7fffffffc600)
at /home/agalbachicar/maliput_ws/src/delphyne_gui/delphyne_gui/visualizer/maliput_viewer_plugin/maliput_viewer_plugin.cc:591
#12 0x00007fffbc0c3c22 in delphyne::gui::MaliputViewerPlugin::eventFilter (this=0x5555564329d0, _obj=0x555556422d60,
_event=0x7fffffffc600)
at /home/agalbachicar/maliput_ws/src/delphyne_gui/delphyne_gui/visualizer/maliput_viewer_plugin/maliput_viewer_plugin.cc:571
#13 0x00007ffff6ae15bc in QCoreApplicationPrivate::sendThroughObjectEventFilters(QObject*, QEvent*) ()
from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#14 0x00007ffff6ae16af in QCoreApplicationPrivate::notify_helper(QObject*, QEvent*) () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#15 0x00007ffff6ae176a in QCoreApplication::notify(QObject*, QEvent*) () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#16 0x00007ffff6ae18d8 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#17 0x00007ffff47c48e0 in QQuickWindow::sendEvent(QQuickItem*, QEvent*) () from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#18 0x00007ffff47c7f71 in QQuickWindowPrivate::deliverMatchingPointsToItem(QQuickItem*, QQuickPointerEvent*, QSet<QQuickItem*>*) ()
from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#19 0x00007ffff47c897a in QQuickWindowPrivate::deliverPressEvent(QQuickPointerEvent*, QSet<QQuickItem*>*) ()
from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#20 0x00007ffff47c8d39 in QQuickWindowPrivate::deliverMouseEvent(QQuickPointerMouseEvent*) ()
from /usr/lib/x86_64-linux-gnu/libQt5Quick.so.5
#21 0x00007ffff47c95d5 in QQuickWindowPrivate::deliverPointerEvent(QQuickPointerEvent*) ()
When inspecting how things are done between the ignition::gui::Scene3D plugin and our MaliputViewerPlugin to handle the RayQuery query, I identified that in the former it runs in a QThread which is bound to the OpenGL context running behind the scenes. That is not our case. I could also verify it by commenting out the call to MouseClickHandler(mouseEvent).
I'll investigate how easy it is to execute a task in an OpenGL context through ignition without breaking the rendering pipeline that is already available in the ign-gui Scene3D plugin.
@scpeters I think you might be interested in passing this information down to ignition gui and rendering folks.
I think this is related to https://github.com/ToyotaResearchInstitute/delphyne_gui/issues/393; I guess I forgot to come back to this
I think we should cache the mouse event data and wait for the Render event callback to call ScreenToScene
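A minimal sketch of that caching idea, assuming the event filter only stores the click and the render callback picks it up later on the thread that owns the OpenGL context. `PendingClick` and `ClickCache` are hypothetical names made up for illustration; this is not the delphyne_gui or ign-gui implementation.

```cpp
#include <cassert>
#include <mutex>
#include <optional>

// Hypothetical screen-space click; field names are illustrative only.
struct PendingClick {
  int screen_x{0};
  int screen_y{0};
};

// Caches the most recent click and hands it out once. Assumption: Store() is
// called from the GUI event filter and Take() from the render callback.
class ClickCache {
 public:
  // Only records the coordinates; no ray query is issued here.
  void Store(int screen_x, int screen_y) {
    std::lock_guard<std::mutex> lock(mutex_);
    pending_ = PendingClick{screen_x, screen_y};
  }

  // Returns the stored click, if any, and clears it so that each click is
  // processed at most once.
  std::optional<PendingClick> Take() {
    std::lock_guard<std::mutex> lock(mutex_);
    std::optional<PendingClick> click = pending_;
    pending_.reset();
    return click;
  }

 private:
  std::mutex mutex_;
  std::optional<PendingClick> pending_;
};
```

With something like this, the render callback would call `Take()` and run ScreenToScene only when a click is present, so the ray query happens on the rendering side instead of inside the mouse event handler.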
Yes, yesterday we discussed this with @agalbachicar and we believe it is related to it. He will install from source and try these new changes in ign-gui3 that aren't released yet. By the way, a release of ign-gui3 is planned for this week, so we won't need to ask for one specially.
I had previously looked at the code that was merged, and there is one bit of functionality that is missing. The *ClickToScene events provide a Vector3d but don't provide the distance to the object or a bool indicating whether an object was clicked on (see discussion in https://github.com/ignitionrobotics/ign-gui/issues/209#issuecomment-849808255). Unfortunately, these data structures were added in https://github.com/ignitionrobotics/ign-gui/pull/148 and have already been released, so we can't change them. I can look into adding a separate data structure alongside it with the extra info, which is not very clean, but I will check with Louise about what to do.
We currently use the knowledge of whether something was clicked or not, and deselect if nothing was clicked, so I think that extra data field is important.
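To make the missing information concrete, here is a rough sketch of what a separate result type carrying the extra fields could look like. Everything in it (`Vec3`, `ClickResult`, `ShouldDeselect`) is hypothetical and defined only so the example compiles on its own; it is not an ign-gui type or a proposal for the actual API.

```cpp
#include <cassert>
#include <optional>

// Minimal stand-in for a 3D vector so the sketch is self-contained.
struct Vec3 {
  double x{0.0};
  double y{0.0};
  double z{0.0};
};

// Hypothetical result type: the scene point plus the two missing pieces.
struct ClickResult {
  Vec3 point;
  // Distance to the clicked object; empty when the ray hit nothing.
  std::optional<double> distance;

  // True when an object was actually clicked.
  bool Hit() const { return distance.has_value(); }
};

// Illustrative consumer: the current selection is cleared on a miss.
bool ShouldDeselect(const ClickResult& result) { return !result.Hit(); }
```

Folding the hit flag into the optional distance keeps the two fields from disagreeing with each other, which is one possible design; two separate fields would work as well.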
Good point.
This is what is returned when the cast doesn't intersect an object: (From here)
// Set point to be 10m away if no intersection found
return this->dataPtr->rayQuery->Origin() +
this->dataPtr->rayQuery->Direction() * 10;
I wonder if we could use that info to know if a lane was hit or not.
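A hedged sketch of how that check might look, assuming the ray origin and direction are available to the caller. `Point3` and `LooksLikeMiss` are hypothetical names, and the tolerance is an arbitrary choice. Note the caveat: a genuine hit that happens to lie exactly at `origin + direction * 10` would be misclassified as a miss.

```cpp
#include <cassert>
#include <cmath>

// Minimal stand-in for a 3D vector so the sketch is self-contained.
struct Point3 {
  double x{0.0};
  double y{0.0};
  double z{0.0};
};

// Multiplier used by the fallback in the quoted snippet.
constexpr double kFallbackDistance = 10.0;

// Returns true when `point` coincides (within `tolerance`) with
// origin + direction * 10, the value returned when nothing was intersected.
bool LooksLikeMiss(const Point3& origin, const Point3& direction,
                   const Point3& point, double tolerance = 1e-6) {
  const double dx = point.x - (origin.x + direction.x * kFallbackDistance);
  const double dy = point.y - (origin.y + direction.y * kFallbackDistance);
  const double dz = point.z - (origin.z + direction.z * kFallbackDistance);
  return std::sqrt(dx * dx + dy * dy + dz * dz) <= tolerance;
}
```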
We call GetLaneFromWorldPosition using the Vector3d from the event. I imagine we could call Lane::ToLanePosition and see if the lane h coordinate is close to 0? It's kind of a hack but might work.
FYI, RoadGeometry::ToRoadPosition() already has that logic (minimize h) as one of the decision rules.
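The h-coordinate idea can be sketched as follows. `LanePositionSketch` is a hypothetical stand-in for a lane-frame (s, r, h) position, defined here only so the example compiles without maliput, and the 0.05 tolerance is an assumed value that would need tuning against real maps.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a lane-frame position: s along the lane,
// r lateral offset, h height above the road surface.
struct LanePositionSketch {
  double s{0.0};
  double r{0.0};
  double h{0.0};
};

// Treats the click as a hit on the lane when the point lies close enough to
// the road surface, i.e. |h| is within the (assumed) tolerance.
bool IsOnRoadSurface(const LanePositionSketch& position,
                     double h_tolerance = 0.05) {
  return std::abs(position.h) <= h_tolerance;
}
```

A point produced by the "10 m away" fallback would usually end up well above or below the surface, so its h value would fail this test, though that depends on the camera angle and is not guaranteed.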
The ign-gui 3.6.0 release is imminent: https://github.com/ignitionrobotics/ign-gui/pull/233
I've started testing this approach with ign-gui3 3.6.0 debs in https://github.com/ToyotaResearchInstitute/delphyne_gui/commit/eb947ea72c22211087646e12a29eff88d606f758. I'm currently running into a bunch of seg-faults related to the MouseHover action, which I think is a bug in ign-gui3 (https://github.com/ignitionrobotics/ign-gui/issues/209#issuecomment-864377926)
I sent #433, which solves the problem here but exposes the problem in ign-gui that @scpeters commented on in https://github.com/ToyotaResearchInstitute/delphyne_gui/issues/405#issuecomment-864378488
I just got a new laptop with an NVIDIA GPU and, interestingly, I wasn't able to reproduce the error:
franco@a2d19cf4ac08:~/maliput_ws$ nvidia-smi
Mon Jul 5 19:51:07 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.80 Driver Version: 460.80 CUDA Version: N/A |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce GTX 1650 Off | 00000000:01:00.0 On | N/A |
| N/A 48C P5 8W / N/A | 75MiB / 3911MiB | 37% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+
Resuming the work on this issue. The error is still reproducible for me in bionic with an updated graphics card driver:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.91.03 Driver Version: 460.91.03 CUDA Version: N/A |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce GTX 1060 Off | 00000000:01:00.0 Off | N/A |
| N/A 58C P5 10W / N/A | 412MiB / 6078MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+
By trying https://github.com/ToyotaResearchInstitute/delphyne_gui/pull/446 and https://github.com/ToyotaResearchInstitute/delphyne_gui/pull/423 together in foxy, I could not reproduce this bug using the maliput_viewer2. That suggests this is a bionic-specific bug and that we could get rid of it by migrating to focal. Another reason to move faster.
@francocipollone confirmed this error is not reproducible in focal. I'm closing this ticket in favor of using all delphyne repos in focal and deprecating its use in bionic.
Steps to reproduce:
./install/delphyne_gui/bin/maliput_viewer2.sh --malidrive_backend=malidrive2
Application should immediately crash with:
This is on a Lenovo T460p laptop running Ubuntu 18.04 with a GeForce 940MX GPU.