koshkabb closed this issue 4 years ago
If it helps, I executed the problematic code with gdb python and got the following:
Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007f15212e6a54 in std::ctype<wchar_t>::do_scan_is (this=0x7ffcea58daf0, __m=16672, __lo=0x20 <error: Cannot access memory at address 0x20>,
__hi=0x46 <error: Cannot access memory at address 0x46>) at ctype_members.cc:188
188 ctype_members.cc: No such file or directory.
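Alongside gdb, Python's standard faulthandler module can dump the Python-level traceback when the interpreter receives SIGSEGV, which shows which Python call triggered the crash. A minimal sketch; the molgrid calls are left commented out because they are what crashes here:

```python
import faulthandler
import sys

# Dump the Python traceback of every thread if the process receives a
# fatal signal such as SIGSEGV. sys.__stderr__ is the interpreter's
# original stderr stream, which keeps a real file descriptor even when
# sys.stderr has been replaced by a wrapper.
faulthandler.enable(file=sys.__stderr__, all_threads=True)

# The calls under investigation would follow, for example:
# import matplotlib
# import molgrid
# e = molgrid.ExampleProvider()
# e.populate('single.types')
# e.next()
```

The same handler can be enabled without editing code by starting the interpreter with python3 -X faulthandler or by setting PYTHONFAULTHANDLER=1.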
I can't reproduce this. What version of python and matplotlib are you using? Make sure you don't have multiple copies of molgrid installed.
In [1]: import matplotlib
In [2]: import molgrid
In [3]: e = molgrid.ExampleProvider()
In [4]: e.populate('single.types')
In [5]: e.next()
Out[5]: <molgrid.molgrid.Example at 0x7fb0e91685b0>
I am using Python 3.7.3 and matplotlib 3.2.1. (I was previously on matplotlib 3.1.0 and updated it in case that was the issue, but it wasn't.) There is only one molgrid installed. :S
Can you provide the full stack trace of the segfault?
Hello!
Yes, here it is. Forgot to mention I am running it in a docker container with Ubuntu 18.04.
#0 0x00007f15212e6a54 in std::ctype<wchar_t>::do_scan_is (this=0x7ffcea58daf0, __m=16672, __lo=0x20 <error: Cannot access memory at address 0x20>,
__hi=0x46 <error: Cannot access memory at address 0x46>) at ctype_members.cc:188
#1 0x00007f1521321a27 in std::__cxx11::numpunct<char>::grouping (this=0x7f1521c24120 <(anonymous namespace)::ctype_w>)
at /tmp/gcc-6.2.0_build/x86_64-redhat-linux-gnu/libstdc++-v3/include/bits/locale_facets.h:1777
#2 std::__facet_shims::__numpunct_fill_cache<char> (f=f@entry=0x7f1521c24120 <(anonymous namespace)::ctype_w>, c=c@entry=0x55ce1e41bec0)
at ../../../../../gcc-6.2.0/libstdc++-v3/src/c++11/cxx11-shim_facets.cc:514
#3 0x00007f15212c3343 in std::__facet_shims::(anonymous namespace)::numpunct_shim<char>::numpunct_shim (c=0x55ce1e41bec0, f=0x7f1521c24120 <(anonymous namespace)::ctype_w>, this=0x55ce1f3718f0)
at ../../../../../gcc-6.2.0/libstdc++-v3/src/c++11/cxx11-shim_facets.cc:238
#4 std::locale::facet::_M_cow_shim (this=this@entry=0x7f1521c24120 <(anonymous namespace)::ctype_w>, which=<optimized out>)
at ../../../../../gcc-6.2.0/libstdc++-v3/src/c++11/cxx11-shim_facets.cc:797
#5 0x00007f15212b52fc in std::locale::_Impl::_M_install_facet (this=this@entry=0x55ce1f39d4d0, __idp=<optimized out>, __fp=0x7f1521c24120 <(anonymous namespace)::ctype_w>)
at ../../../../../gcc-6.2.0/libstdc++-v3/src/c++98/locale.cc:384
#6 0x00007f15212b546e in std::locale::_Impl::_M_replace_facet (this=this@entry=0x55ce1f39d4d0, __imp=__imp@entry=0x7f1521c24da0 <(anonymous namespace)::c_locale_impl>, __idp=<optimized out>)
at ../../../../../gcc-6.2.0/libstdc++-v3/src/c++98/locale.cc:308
#7 0x00007f15212b54a7 in std::locale::_Impl::_M_replace_category (this=0x55ce1f39d4d0, __imp=0x7f1521c24da0 <(anonymous namespace)::c_locale_impl>,
__idpp=0x7f1521b5a730 <std::locale::_Impl::_S_id_numeric+16>) at ../../../../../gcc-6.2.0/libstdc++-v3/src/c++98/locale.cc:297
#8 0x00007f15212b3cec in std::locale::_Impl::_M_replace_categories (this=this@entry=0x55ce1f39d4d0, __imp=0x7f1521c24da0 <(anonymous namespace)::c_locale_impl>, __cat=__cat@entry=2)
at ../../../../../gcc-6.2.0/libstdc++-v3/src/c++98/localename.cc:336
#9 0x00007f15212b3e9b in std::locale::_M_coalesce (this=this@entry=0x7ffcea58dcd0, __base=..., __add=..., __cat=__cat@entry=2) at ../../../../../gcc-6.2.0/libstdc++-v3/src/c++98/localename.cc:166
#10 0x00007f15212b3f59 in std::locale::locale (this=0x7ffcea58dcd0, __base=..., __s=<optimized out>, __cat=2) at ../../../../../gcc-6.2.0/libstdc++-v3/src/c++98/localename.cc:151
What is your locale?
The output of locale is:
LANG=en_US.UTF-8
LANGUAGE=en_US.en
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8
Actually, I don't know if it's related or if it helps, but I am having the same problem as the issue described in https://github.com/gnina/libmolgrid/issues/29 when single.types contains 0 protein.pdb ligand.sdf (with a label):
>>> import matplotlib
>>> import molgrid
>>> e = molgrid.ExampleProvider()
>>> e.populate('single.types')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: Missing molecular data in line: 0 protein.pdb ligand.sdf
If single.types contains just protein.pdb ligand.sdf (no label), then there is no error in e.populate, but I get the segmentation fault in e.next().
Can you provide a container that reproduces this problem?
Hi, yes, I've created a very simple dockerfile (can't upload it here, so I'm pasting the text):
FROM nvidia/cuda:10.0-cudnn7-runtime-ubuntu18.04
RUN apt-get update && apt-get install -y --no-install-recommends \
python3-pip \
wget
RUN pip3 install molgrid
RUN pip3 install matplotlib
RUN pip3 install torch
RUN wget https://files.rcsb.org/download/6HJ2.pdb
RUN wget http://files.rcsb.org/ligands/view/P06_ideal.sdf
RUN echo "6HJ2.pdb P06_ideal.sdf" > single.types
When you access its terminal, just run python3 and:
import matplotlib
import molgrid
e = molgrid.ExampleProvider()
e.populate('single.types')
e.next()
I was just able to narrow it down even more to kiwisolver (matplotlib imports kiwisolver), which doesn't import anything else:
>>> import kiwisolver
>>> import molgrid
>>> e = molgrid.ExampleProvider()
>>> e.populate('single.types')
>>> example = e.next()
Segmentation fault
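This kind of narrowing down can be automated by running each candidate import sequence in a fresh interpreter and checking whether the child process died from SIGSEGV. The sketch below is only an illustration: the helper name segfaults is hypothetical, and it assumes a POSIX system, where a negative return code means the child was killed by that signal.

```python
import signal
import subprocess
import sys


def segfaults(snippet):
    """Run snippet in a fresh interpreter; return True if it died from SIGSEGV."""
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # On POSIX, a return code of -N means the child was killed by signal N.
    return result.returncode == -signal.SIGSEGV


# Statements shared by every candidate, taken from the session above.
BODY = (
    "import molgrid\n"
    "e = molgrid.ExampleProvider()\n"
    "e.populate('single.types')\n"
    "e.next()\n"
)

if __name__ == "__main__":
    # Try each suspect import before molgrid, plus a baseline with none.
    for first_import in ("import kiwisolver\n", "import matplotlib\n", ""):
        print(repr(first_import), segfaults(first_import + BODY))
```

Because each snippet runs in its own process, a crash in one candidate does not take down the session doing the comparison. A False result only means the child did not die from SIGSEGV; it may still have failed for another reason, such as a missing module.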
Hello, I hope you had a nice weekend. Just wanted to let you know that I've found a "silly" way of fixing the issue. If I do from torch import cuda, the segmentation fault doesn't happen:
>>> import kiwisolver
>>> from torch import cuda
>>> import molgrid
>>> fname = 'single.types'
>>> e = molgrid.ExampleProvider()
>>> e.populate(fname)
>>> e.next()
<molgrid.molgrid.Example object at 0x7ff59b1c0cb0>
Maybe this gives you a clue as to why the segmentation fault happened :S
My assumption is that this is due to how the pip package is built: we bundle older libraries (including libstdc++) to maximize portability. There is probably some interaction between the bundled libstdc++ and the system libstdc++ regarding locales, but I haven't been able to figure it out yet.
Do you still have the problem if you build from source?
I've tried building from source but I got stuck: it can't find the openbabel header files.
Okay, I think I may have figured it out. We need to statically link libstdc++ to have a manylinux compliant image since manylinux doesn't let you use a c++14 compatible libstdc++. However, it ends up conflicting with the dynamically loaded libstdc++ that kiwisolver pulls in, so it is necessary to hide the externally exported symbols from the static libstdc++.
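One way to see the dynamically loaded half of this conflict is to list which libstdc++ shared objects are mapped into the running Python process. The sketch below reads /proc/self/maps, so it is Linux-only, and the function names are hypothetical. A statically linked copy inside an extension module would not appear as a separate file, so this only reveals the dynamic libstdc++ that a package such as kiwisolver pulls in.

```python
def libstdcxx_paths(maps_text):
    """Return the distinct libstdc++ files named in /proc/<pid>/maps text."""
    paths = set()
    for line in maps_text.splitlines():
        # Columns: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6 and "libstdc++" in fields[5]:
            paths.add(fields[5].strip())
    return sorted(paths)


def loaded_libstdcxx():
    """List libstdc++ shared objects mapped into the current process."""
    try:
        with open("/proc/self/maps") as maps:
            return libstdcxx_paths(maps.read())
    except OSError:
        return []  # /proc is not available on this platform


if __name__ == "__main__":
    print(loaded_libstdcxx())
```

Calling loaded_libstdcxx() before and after an import shows whether that import brought a dynamic libstdc++ into the process.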
Can you try testing out this version: https://test.pypi.org/project/molgrid/0.1.2/
Thanks a lot for the explanation! I feel like now I understand it a bit more :)
That version fixes it! Amazing! I'll keep an eye out for when it goes live.
I've pushed a release - should be available through regular pip now. This release also adds an iterator interface for ExampleProvider (but be careful - it will iterate forever - next release I'll add a concept of epochs).
Awesome, thank you so much !!
Hello again!
I have a problem in which the next() method of ExampleProvider is causing a Segmentation Fault. After digging a bit into it, I found out that this happens if matplotlib (or something else that matplotlib is importing) is imported before molgrid:
This doesn't happen if the order of the imports changes:
I was just wondering if you have come across this issue or a similar one before. Thank you!