AcademySoftwareFoundation / openvdb

OpenVDB - Sparse volume data structure and tools
http://www.openvdb.org/
Mozilla Public License 2.0

OpenVDB 6.2 crashes on startup on Focal #732

Closed SteveMacenski closed 4 years ago

SteveMacenski commented 4 years ago

Hi,

Per https://github.com/SteveMacenski/spatio_temporal_voxel_layer/issues/167 and https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=951704, there's an issue with loading OpenVDB 6.2 in Ubuntu Focal due to the dependency on jemalloc. It has been reproduced by multiple users and traced back to OpenVDB.

There's a note that jemalloc needs to be compiled with --disable-initial-exec-tls for things to work properly.

danrbailey commented 4 years ago

Hi @SteveMacenski, thanks for letting us know. We mainly use the version of jemalloc that ships with SideFX's Houdini, which is 3.6.0. I think we should update the dependencies table in our documentation to include that recommendation, as jemalloc doesn't appear to be listed there.

We don't compile or ship jemalloc binaries and don't support the Debian builds of OpenVDB, so it sounds like this is an issue for the package maintainers to add this compile flag?

SteveMacenski commented 4 years ago

> don't support the Debian builds of OpenVDB, so it sounds like this is an issue for the package maintainers to add this compile flag?

If the maintainers on this project don't release OpenVDB binaries to apt, who does? It's also not entirely clear to me where the change needs to be made: in the OpenVDB flags for the binaries, the jemalloc flags for the binaries, or both. There's been discussion on the bug report that hasn't gained any traction. Given Focal (20.04) is the new LTS in Ubuntu-land, I figure whoever releases this software would have motivation to make it so it actually works when someone installs it.

Currently, apt installs of OpenVDB are non-functional.

Idclip commented 4 years ago

This may be related: https://jira.aswf.io/browse/OVDB-134

You could try the LD_PRELOAD solution as a temporary workaround. Most likely the easiest way to report a bug with apt is through https://www.debian.org/Bugs/Reporting

The CMake build of VDB links to jemalloc by default (if available), but you can choose tbbmalloc or the system default with -DCONCURRENT_MALLOC=Tbbmalloc or -DCONCURRENT_MALLOC=None
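For anyone scripting their own build, the allocator choice above could be wrapped as follows. This is only a sketch: the helper name cmake_configure_command and the ".." source directory are hypothetical, and only the two CONCURRENT_MALLOC values named in this comment are accepted.

```python
import shlex

# Only the values mentioned in this thread; other accepted values are not assumed here.
KNOWN_ALLOCATORS = ("Tbbmalloc", "None")


def cmake_configure_command(source_dir, allocator="None"):
    """Return the argv list for a CMake configure step with the given allocator."""
    if allocator not in KNOWN_ALLOCATORS:
        raise ValueError(f"unsupported allocator: {allocator!r}")
    return ["cmake", f"-DCONCURRENT_MALLOC={allocator}", source_dir]


# Print the command line that would be run from a build directory (hypothetical layout).
print(shlex.join(cmake_configure_command("..", "Tbbmalloc")))
```

The returned list can be handed to subprocess.run directly if you want to launch the configure step from Python.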

danrbailey commented 4 years ago

The package uploader is listed on the ubuntu openvdb page:

https://launchpad.net/ubuntu/+source/openvdb/6.2.1-8ubuntu1 https://launchpad.net/ubuntu/+source/openvdb/7.0.0-3ubuntu1

I note there's a comment about jemalloc on the latest 7.0 release:

Remove dependency to jemalloc. See #951704 for details

doisyg commented 4 years ago

I am on Ubuntu Focal and have the same issue. So I manually installed libopenvdb-dev_7.0.0 and libopenvdb7.0 from Groovy (https://launchpad.net/ubuntu/+source/openvdb/7.0.0-3ubuntu1/+build/19221900) and removed libopenvdb6.2 and libopenvdb-dev (I know, dirty). Then I recompiled https://github.com/SteveMacenski/spatio_temporal_voxel_layer/tree/melodic-devel and the issue disappeared. So one solution could be to release openvdb7 to focal

doisyg commented 4 years ago

https://answers.launchpad.net/ubuntu/+source/openvdb/+question/691239

danrbailey commented 4 years ago

Good to know. It would definitely be worth upgrading to 7.0 regardless. However, it's worth pointing out that if the solution in 7.0.0 was to eliminate jemalloc as a dependency, that could mean OpenVDB was built without a concurrent allocator, which will significantly reduce performance. Can someone explain what the number in this comment refers to:

Remove dependency to jemalloc. See #951704 for details

Is that a bug or a commit id? Can someone share a link?

As @Idclip mentions in that linked Jira - if the version of Python included in this distribution is >= 3.6, there is another potential solution:

  • Share symbols across extension libraries (only works from python 3.6)

# The dl module was removed in Python 3; os provides the same RTLD_* constants.
import os, sys
sys.setdlopenflags(os.RTLD_NOW | os.RTLD_GLOBAL)
import pyopenvdb
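A slightly more defensive variant of the symbol-sharing idea above restores the interpreter's previous dlopen flags after the import, so later extension modules are not affected. This is a sketch only: the helper name global_symbols is hypothetical, it relies on the standard sys.getdlopenflags / sys.setdlopenflags calls (Unix only), and whether it avoids the crash on a given system is untested here.

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def global_symbols():
    """Temporarily load extension modules with RTLD_NOW | RTLD_GLOBAL."""
    previous = sys.getdlopenflags()
    sys.setdlopenflags(os.RTLD_NOW | os.RTLD_GLOBAL)
    try:
        yield
    finally:
        # Put the original flags back even if the import fails.
        sys.setdlopenflags(previous)


# Usage (requires pyopenvdb to be installed):
# with global_symbols():
#     import pyopenvdb
```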

SteveMacenski commented 4 years ago

I don't think @doisyg set those flags (or didn't say he did)

doisyg commented 4 years ago

> As @Idclip mentions in that linked Jira - if the version of Python included in this distribution is >= 3.6, there is another potential solution:
>
>   • Share symbols across extension libraries (only works from python 3.6)
>
> # The dl module was removed in Python 3; os provides the same RTLD_* constants.
> import os, sys
> sys.setdlopenflags(os.RTLD_NOW | os.RTLD_GLOBAL)
> import pyopenvdb

We are not using Python but C++. The LD_PRELOAD trick works, though.

mhampl commented 4 years ago

> Remove dependency to jemalloc. See #951704 for details
>
> Is that a bug or a commit id? Can someone share a link?

This is a reference to the Debian bug tracker https://bugs.debian.org/951704

SteveMacenski commented 4 years ago

The LD_PRELOAD trick also isn't really a solution, but it's good to know that something hacky can be done in the meantime for development.

danrbailey commented 4 years ago

Thanks for sharing the link. Just to be sure we're all on the same page here - it sounds like the proper solution to this falls to changing the jemalloc package in Debian to include the --disable-initial-exec-tls flag, is that correct?

We should also add a comment highlighting this as a known issue in our build documentation until this issue gets resolved.

SteveMacenski commented 4 years ago

That is correct to my understanding.

SteveMacenski commented 4 years ago

@danrbailey any update on this / documentation around it?

danrbailey commented 4 years ago

Hi Steve,

Yes, we are working towards removing jemalloc as a dependency linked into the OpenVDB core shared library:

https://github.com/AcademySoftwareFoundation/openvdb/pull/749

It will then be up to host applications to choose to include a concurrent allocator. This is the more correct solution and should resolve this issue without the need for the --disable-initial-exec-tls flag. We're aiming for this fix to go into the upcoming 7.1.0 release of OpenVDB. Hopefully the LD_PRELOAD workaround mentioned earlier in this thread is acceptable until then.

Thanks for bringing the issue to our attention, we'll post back here when this PR has been merged.

Idclip commented 4 years ago

Merged - jemalloc/tbbmalloc remain optional configurations for the openvdb binaries but are no longer linked into other build artefacts (i.e. libs)

SteveMacenski commented 4 years ago

@Idclip I don't see any binary updates available

Idclip commented 4 years ago

@SteveMacenski by binaries, I was referring to the configuring/building of the vdb command line executables. We currently don't provide pre-built binaries as part of the VDB release/update cycle.

SteveMacenski commented 4 years ago

Got it. It's still not clear to me what organization releases the binaries and how to get in contact with them to do an update to fix these issues for users.

Idclip commented 4 years ago

Yeah sorry, I'm not too familiar with this process either. You'll have to request an update to the version of VDB available on Focal. It seems that @doisyg has already requested this and it was ruled out? https://answers.launchpad.net/ubuntu/+source/openvdb/+question/691239

As @doisyg also mentioned, version 7.0.0 on groovy has been fixed to solve this issue: https://launchpad.net/ubuntu/+source/openvdb/7.0.0-3ubuntu1

I could be wrong, but it seems like this documentation is what you're after: https://wiki.ubuntu.com/UbuntuDevelopment/NewPackages

frarodfo commented 3 years ago

From a beginner in Ubuntu: I installed pyopenvdb using apt install, and when importing pyopenvdb in Python I ran into the error mentioned above:

ImportError: /lib/x86_64-linux-gnu/libjemalloc.so.2: cannot allocate memory in static TLS block

In this thread I read that a solution will be coming soon, but in the meantime, there is at least one workaround using LD_PRELOAD.

Could we have a brief overview on how to implement this workaround for a beginner? This page is one of the top results when searching online for this error, so writing an easy to follow workaround here might be very helpful to other users. Thank you very much.

Edit: I managed to do it. Here is a step-by-step solution for others who see this. The main steps are explained at https://jira.aswf.io/browse/OVDB-134 and this is how I did it:

In the Linux console, type the command:

export LD_PRELOAD=/path/to/jemalloc.so

Note that "/path/to/jemalloc.so" has to be changed to the actual path where the jemalloc library is located. Fortunately, this path is exactly the one given in the ImportError that you get in Python. In my case, I searched for the file and did not have a jemalloc.so file, but instead had a libjemalloc.so.2 file, as seen in my original error message. I located the path to the file and then typed in the console:

export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2

After doing this, importing pyopenvdb in python3 no longer gives an error. Thank you very much everyone for the workaround!
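The same workaround can be scripted. The sketch below assumes a typical Debian/Ubuntu library layout; the helper names find_jemalloc and with_preload and the list of candidate directories are hypothetical. LD_PRELOAD is read by the dynamic linker when a process starts, so it has to be set for a new process rather than changed inside an interpreter that is already running.

```python
import glob
import os

# Directories to search are an assumption; adjust them for your distribution.
CANDIDATE_DIRS = (
    "/usr/lib/x86_64-linux-gnu",
    "/lib/x86_64-linux-gnu",
    "/usr/lib",
    "/usr/local/lib",
)


def find_jemalloc(dirs=CANDIDATE_DIRS):
    """Return the first libjemalloc shared object found in dirs, or None."""
    for directory in dirs:
        matches = sorted(glob.glob(os.path.join(directory, "libjemalloc.so*")))
        if matches:
            return matches[0]
    return None


def with_preload(lib_path, env=None):
    """Return a copy of env (default: os.environ) with lib_path first in LD_PRELOAD."""
    new_env = dict(os.environ if env is None else env)
    existing = new_env.get("LD_PRELOAD", "")
    new_env["LD_PRELOAD"] = f"{lib_path}:{existing}" if existing else lib_path
    return new_env
```

A child interpreter could then be started with something like subprocess.run([sys.executable, "-c", "import pyopenvdb"], env=with_preload(find_jemalloc())), after checking that find_jemalloc() did not return None.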