Georgetown-IR-Lab / QuickUMLS

System for Medical Concept Extraction and Linking
MIT License

[BUG] Issue with installing simstring #63

Open kr-hansen opened 3 years ago

kr-hansen commented 3 years ago

Describe the bug Simstring doesn't seem to install correctly. When installing QuickUMLS, the simstring module appears to be empty, and install.py fails with this error:

    Traceback (most recent call last):
      File "install.py", line 199, in <module>
        driver(opts)
      File "install.py", line 171, in driver
        parse_and_encode_ngrams(mrconso_iterator, simstring_dir, cuisty_dir)
      File "install.py", line 106, in parse_and_encode_ngrams
        ss_db = SimstringDBWriter(simstring_dir)
      File "/home/u0562171/gitclones/QuickUMLS/toolbox.py", line 166, in __init__
        self.db = simstring.writer(
    AttributeError: module 'simstring' has no attribute 'writer'
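One way to narrow down an error like this is to check which file Python actually loaded for `simstring` and whether the expected attributes exist. The following is a minimal diagnostic sketch, not part of QuickUMLS: `describe_module` is a hypothetical helper, `writer` is taken from the traceback, and `reader` is assumed from the `simstring::reader` class named in the build warnings. A `file` value of `None` would suggest Python picked up a bare directory named `simstring` instead of the compiled bindings, though that is only one possible cause.

```python
import importlib


def describe_module(name, required_attrs):
    """Report where a module was loaded from and which attributes it lacks."""
    module = importlib.import_module(name)
    return {
        # None usually means a namespace package (a plain directory on sys.path).
        "file": getattr(module, "__file__", None),
        # Directories searched for submodules; empty for a single-file module.
        "search_paths": list(getattr(module, "__path__", [])),
        # Attributes the caller expected but the module does not define.
        "missing": [attr for attr in required_attrs if not hasattr(module, attr)],
    }


if __name__ == "__main__":
    try:
        # "writer" is from the traceback; "reader" is an assumption.
        print(describe_module("simstring", ["writer", "reader"]))
    except ImportError as exc:
        print(f"simstring is not importable: {exc}")
```

If `missing` lists `writer`, the module being imported is not the compiled binding, and the reported `file` or `search_paths` show where it came from.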

To Reproduce Follow all install steps, but install and set up from within a Miniconda environment. Installing Python, the requirements, spaCy, and UMLS all seemed to go well. gcc and g++ were installed with apt-get on Ubuntu. Running the simstring install step for Python 3 produces some warnings (included below) but seems to install correctly.

Environment

Additional context Here are the warnings/output from running simstring install.

    https://github.com/Georgetown-IR-Lab/simstring/archive/1.1.3.tar.gz
    Getting Simstring 1.1.3...
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100   134  100   134    0     0    864      0 --:--:-- --:--:-- --:--:--   864
    100 67129    0 67129    0     0  74753      0 --:--:-- --:--:-- --:--:--  134k
    Unpacking Simstring...
    Making Simstring...
    running build_ext
    building '_simstring' extension
    creating build
    creating build/temp.linux-x86_64-3.8
    gcc -pthread -B /home/u0562171/miniconda3/envs/nlp/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I. -I/home/u0562171/miniconda3/envs/nlp/include/python3.8 -c export.cpp -o build/temp.linux-x86_64-3.8/export.o
    cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
    In file included from export.cpp:8:
    ./simstring/simstring.h: In member function ‘bool simstring::reader::open(const string&)’:
    ./simstring/simstring.h:838:18: warning: variable ‘num_entries’ set but not used [-Wunused-but-set-variable]
      838 |         uint32_t num_entries, max_size;
          |                  ^~~~~~~~~~~
    ./simstring/simstring.h: In member function ‘bool simstring::reader::check(const string_type&, double)’:
    ./simstring/simstring.h:1020:50: warning: typedef ‘char_type’ locally defined but not used [-Wunused-local-typedefs]
     1020 |         typedef typename string_type::value_type char_type;
          |                                                  ^~~~~~~~~
    export.cpp: In destructor ‘virtual writer::~writer()’:
    export.cpp:121:41: warning: throw will always call terminate() [-Wterminate]
      121 |         throw std::runtime_error(message);
          |                                         ^
    export.cpp:121:41: note: in C++11 destructors default to noexcept
    export.cpp:135:41: warning: throw will always call terminate() [-Wterminate]
      135 |         throw std::runtime_error(message);
          |                                         ^
    export.cpp:135:41: note: in C++11 destructors default to noexcept
    gcc -pthread -B /home/u0562171/miniconda3/envs/nlp/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I. -I/home/u0562171/miniconda3/envs/nlp/include/python3.8 -c export_wrap.cpp -o build/temp.linux-x86_64-3.8/export_wrap.o
    cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
    In file included from /usr/include/string.h:495,
                     from /home/u0562171/miniconda3/envs/nlp/include/python3.8/Python.h:30,
                     from export_wrap.cpp:154:
    In function ‘char* strncpy(char*, const char*, size_t)’,
        inlined from ‘void SWIG_Python_FixMethods(PyMethodDef*, swig_const_info*, swig_type_info**, swig_type_info**)’ at export_wrap.cpp:9059:22,
        inlined from ‘PyObject* PyInit__simstring()’ at export_wrap.cpp:9157:25:
    /usr/include/x86_64-linux-gnu/bits/string_fortified.h:106:34: warning: ‘char* __builtin_strncpy(char*, const char*, long unsigned int)’ output truncated before terminating nul copying 10 bytes from a string of the same length [-Wstringop-truncation]
      106 |   return __builtin___strncpy_chk (__dest, __src, __len, __bos (__dest));
          |          ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    g++ -pthread -shared -B /home/u0562171/miniconda3/envs/nlp/compiler_compat -L/home/u0562171/miniconda3/envs/nlp/lib -Wl,-rpath=/home/u0562171/miniconda3/envs/nlp/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.8/export.o build/temp.linux-x86_64-3.8/export_wrap.o -o /home/u0562171/gitclones/QuickUMLS/simstring-1.1.3/_simstring.cpython-38-x86_64-linux-gnu.so
    Installing...
    Done!
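The log ends with the linker writing `_simstring.cpython-38-x86_64-linux-gnu.so` into the `simstring-1.1.3` directory, so a quick sanity check is to confirm that the compiled extension and its Python wrapper are both present there. This is a rough sketch under assumptions: `find_build_outputs` is a hypothetical helper, the wrapper file name `simstring.py` is assumed from the usual SWIG layout (the log only shows `export_wrap.cpp`), and the relative directory name should be adjusted to the actual checkout.

```python
from pathlib import Path


def find_build_outputs(directory):
    """List the compiled extension and the wrapper module found in a directory."""
    directory = Path(directory)
    return {
        # Shared object produced by the link step in the log above.
        "extension": sorted(p.name for p in directory.glob("_simstring*.so")),
        # Pure-Python wrapper module (file name is an assumption).
        "wrapper": sorted(p.name for p in directory.glob("simstring.py")),
    }


if __name__ == "__main__":
    # Directory name taken from the link command; adjust to your checkout.
    print(find_build_outputs("simstring-1.1.3"))
```

An empty `wrapper` list alongside a non-empty `extension` list would mean the extension compiled but the module that exposes `writer` is not where Python looks for it.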

kr-hansen commented 3 years ago

For anyone else coming across this, I was able to solve it by installing simstring as described on the simstring website (http://www.chokkan.org/software/simstring/).

I followed the steps in "setup_simstring.sh" line by line, including downloading the compressed archive hosted on the QuickUMLS webpage. After the "build_ext" step in the script, however, I also ran the "python setup.py install" command outlined on the simstring website. This fully installs simstring instead of relying only on the files that build_ext produces. The fully installed version worked, while the build_ext-only version never worked for me.

I'm unsure why the script isn't working as described, but I finally got it running once I fully installed simstring.
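The workaround above (build, then a full install) can be sketched as a small Python driver. This is an illustration under assumptions, not the contents of setup_simstring.sh: `install_commands` and `run_install` are hypothetical helpers, the exact flags the script passes to `build_ext` are not shown in this thread, and `source_dir` must point at whichever directory of the unpacked archive contains `setup.py`.

```python
import subprocess
import sys


def install_commands(python_exe=sys.executable):
    """Return the two setup.py invocations described in the comment above."""
    return [
        # Compile the extension (the step the script already performs).
        [python_exe, "setup.py", "build_ext"],
        # Install into the active environment (the extra step that fixed it).
        [python_exe, "setup.py", "install"],
    ]


def run_install(source_dir, python_exe=sys.executable):
    """Run both commands inside source_dir, stopping at the first failure."""
    for command in install_commands(python_exe):
        subprocess.run(command, cwd=source_dir, check=True)


# Example (path is hypothetical; run from the activated conda environment):
# run_install("simstring-1.1.3")
```

Using `sys.executable` keeps the install inside the same interpreter that runs the script, which matters when working in a Miniconda environment as in this report.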