Closed tarunallyo closed 2 years ago
@tarunallyo sorry for the inconvenience. This looks very similar to https://github.com/google/sentencepiece/issues/378
Following the install-from-source instructions should work: https://github.com/google/sentencepiece#c-from-source
We'll add details on this to our docs soon.
Feel free to follow up here if you hit further issues.
I was able to build the python wheels for sentencepiece with the following change:
diff --git a/python/make_py_wheel.sh b/python/make_py_wheel.sh
index 7f82947..5a2a4f7 100755
--- a/python/make_py_wheel.sh
+++ b/python/make_py_wheel.sh
@@ -73,7 +73,7 @@ build() {
 if [ "$1" = "native" ]; then
   build $2
 elif [ "$#" -eq 1 ]; then
-  run_docker quay.io/pypa/manylinux1_${1} ${1}
+  run_docker quay.io/pypa/manylinux2014_${1} ${1}
 else
   run_docker quay.io/pypa/manylinux1_i686 i686
   run_docker quay.io/pypa/manylinux1_x86_64 x86_64
Then I ran these commands:
$ cd python
$ ./make_py_wheel.sh aarch64
The wheels are built into the dist directory and can be installed directly with pip3 install path-to-wheel.
I will also look into getting this merged into the repo to make this easier next time, but between now and then, let me know if this workaround gets you up and running again.
@tarunallyo has this issue been resolved?
@AWSNB Still not working.
Specifications:
Python Version 3.7.9
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.1 LTS"
uname -a
Linux 0d25e43f8fa8 5.4.0-1021-aws #21-Ubuntu SMP Fri Jul 24 09:43:03 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux
Logs
Collecting sentencepiece
Using cached sentencepiece-0.1.91.tar.gz (500 kB)
ERROR: Command errored out with exit status 1:
command: /usr/bin/python3 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-kwtwc2v6/sentencepiece/setup.py'"'"'; __file__='"'"'/tmp/pip-install-kwtwc2v6/sentencepiece/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-install-kwtwc2v6/sentencepiece/pip-egg-info
cwd: /tmp/pip-install-kwtwc2v6/sentencepiece/
Complete output (2 lines):
/bin/sh: 1: pkg-config: not found
Failed to find sentencepiece pkgconfig
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
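The log above points at a missing pkg-config binary rather than at sentencepiece itself. A minimal pre-flight check could look like the sketch below. It uses only the Python standard library. The helper name missing_build_tools and the default tool list are illustrative assumptions, not part of sentencepiece's setup.py.

```python
import shutil


def missing_build_tools(tools=("pkg-config", "cmake", "make")):
    """Return the entries of `tools` that cannot be found on PATH."""
    # shutil.which returns None when no executable with that name is on PATH.
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_build_tools()
    if missing:
        print("Missing build tools:", ", ".join(missing))
    else:
        print("All listed build tools were found on PATH")
```

On Ubuntu, the pkg-config binary is provided by the pkg-config package (sudo apt install pkg-config).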
Manually installing via wheel:
pip3 install sentencepiece-0.1.91-cp37-cp37m-manylinux1_x86_64.whl
ERROR: sentencepiece-0.1.91-cp37-cp37m-manylinux1_x86_64.whl is not a supported wheel on this platform.
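The "not a supported wheel" error is expected here: the wheel is tagged for x86_64, while uname reports aarch64. The sketch below checks only the architecture part of a wheel filename. pip additionally checks the Python and ABI tags, which this example ignores. The function names are illustrative.

```python
import platform


def wheel_platform_tags(filename):
    """Return the platform tags encoded in a wheel filename."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    # Wheel names end in ...-<python tag>-<abi tag>-<platform tag>;
    # several platform tags may be joined with dots.
    return stem.split("-")[-1].split(".")


def wheel_matches_machine(filename, machine=None):
    """Roughly check whether a wheel targets the given CPU architecture."""
    machine = machine or platform.machine()
    return any(
        tag == "any" or tag.endswith("_" + machine)
        for tag in wheel_platform_tags(filename)
    )


if __name__ == "__main__":
    name = "sentencepiece-0.1.91-cp37-cp37m-manylinux1_x86_64.whl"
    print(wheel_platform_tags(name), wheel_matches_machine(name))
```

An aarch64 host needs a wheel whose platform tag ends in _aarch64, or a build from source.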
Also, Morfeusz is not working.
pip3 list
Package Version
----------------- ---------------
attrs 20.2.0
cffi 1.14.2
dataclasses 0.6
future 0.18.2
hypothesis 5.35.3
iniconfig 1.0.1
more-itertools 8.5.0
packaging 20.4
pip 20.0.2
pluggy 0.13.1
py 1.9.0
pycparser 2.20
pyparsing 2.4.7
pytest 6.0.2
PyYAML 5.3.1
setuptools 45.2.0
six 1.15.0
sortedcontainers 2.2.2
toml 0.10.1
torch 1.7.0a0+f02753f
typing-extensions 3.7.4.3
wheel 0.34.2
Can we have proper steps for Ubuntu 20.04?
@tarunallyo try these steps, which we are going to publish pending your confirmation.
Make sure PyTorch is installed; if not, follow the PyTorch installation steps.
On Ubuntu:
# get the source and build/install the libraries
git clone https://github.com/google/sentencepiece
cd sentencepiece
mkdir build
cd build
cmake ..
make -j 8
sudo make install
sudo ldconfig -v
# move to python directory to build the wheel
cd ../python
vi make_py_wheel.sh
# change manylinux1_${1} to manylinux2014_${1}
sudo python3 setup.py install
With the above steps, the wheel should be installed.
Important: before running any Python script or starting Python, one of the libraries needs to be preloaded.
export LD_PRELOAD=/lib/aarch64-linux-gnu/libtcmalloc_minimal.so.4:$LD_PRELOAD
python3
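The same preload step can also be prepared from Python when launching a child interpreter. This is a minimal sketch that assumes the library path from the export above. The helper names preload_value and run_with_preload are illustrative. It prepends the library and skips empty entries, so an unset LD_PRELOAD does not leave a stray separator behind.

```python
import os
import subprocess

# Path taken from the export above; adjust it if the library lives elsewhere.
TCMALLOC = "/lib/aarch64-linux-gnu/libtcmalloc_minimal.so.4"


def preload_value(lib_path, current=""):
    """Return an LD_PRELOAD value with lib_path first and existing entries kept."""
    entries = [lib_path]
    for entry in current.split(":"):
        if entry and entry not in entries:
            entries.append(entry)
    return ":".join(entries)


def run_with_preload(args, lib_path=TCMALLOC):
    """Run a command with lib_path added to LD_PRELOAD in the child environment."""
    env = dict(os.environ)
    env["LD_PRELOAD"] = preload_value(lib_path, env.get("LD_PRELOAD", ""))
    return subprocess.run(args, env=env, check=False)


if __name__ == "__main__":
    print(preload_value(TCMALLOC, os.environ.get("LD_PRELOAD", "")))
```

For example, run_with_preload(["python3", "my_script.py"]) would start a script with the library preloaded; my_script.py is a placeholder name.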
Thanks @AWSNB
Additionally, to get Morfeusz to work, you'll have to compile it from source as well. The Python wheel is written to work with an older version of Morfeusz. You may want to modify the Python project to load the latest version of the shared object file, or find an older version that works with the Python wheel.
Here are the steps that I used to build Morfeusz:
wget http://download.sgjp.pl/morfeusz/20200913/morfeusz-src-20200913.tar.gz
tar -xf morfeusz-src-20200913.tar.gz
cd Morfeusz/
sudo apt install cmake zip build-essential autotools-dev python3-stdeb python3-pip python3-all-dev python3-pyparsing devscripts libcppunit-dev acl default-jdk swig python3-all-dev python3-stdeb
cd build/
cmake ..
sudo make install
sudo ldconfig -v
At this point you'll have the .so file in /usr/local/lib, but you will still have to modify the Python wrapper to work with libmorfeusz2.so.0 instead of libmorfeusz.so.0.
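Before patching the wrapper, it can help to confirm which Morfeusz shared library the dynamic linker actually sees after ldconfig. This sketch uses ctypes.util.find_library from the standard library. The function name visible_morfeusz_library is illustrative, and the candidate names are inferred from the file names mentioned above.

```python
import ctypes.util


def visible_morfeusz_library(candidates=("morfeusz2", "morfeusz")):
    """Return the first Morfeusz library name the linker can resolve, or None."""
    for name in candidates:
        # find_library takes the name without the "lib" prefix and version suffix.
        found = ctypes.util.find_library(name)
        if found:
            return found
    return None


if __name__ == "__main__":
    print(visible_morfeusz_library())
```

A result of None suggests that the install step did not place the library in a directory covered by the linker cache.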
@AWSNB Tried Morfeusz, still not working. Here are the logs:
root@fa1822a4accd:/Morfeusz# apt install cmake zip build-essential autotools-dev python3-stdeb python3-pip python3-all-dev python3-pyparsing devscripts libcppunit-dev acl default-jdk swig python3-all-dev python3-stdeb
Reading package lists... Done
Building dependency tree
Reading state information... Done
acl is already the newest version (2.2.53-6).
autotools-dev is already the newest version (20180224.1).
build-essential is already the newest version (12.8ubuntu1).
cmake is already the newest version (3.16.3-1ubuntu1).
default-jdk is already the newest version (2:1.11-72).
devscripts is already the newest version (2.20.2ubuntu2).
python3-all-dev is already the newest version (3.8.2-0ubuntu2).
python3-pyparsing is already the newest version (2.4.6-1).
zip is already the newest version (3.0-11build1).
libcppunit-dev is already the newest version (1.15.1-2build1).
python3-pip is already the newest version (20.0.2-5ubuntu1).
python3-stdeb is already the newest version (0.8.5-3).
swig is already the newest version (4.0.1-5build1).
0 upgraded, 0 newly installed, 0 to remove and 16 not upgraded.
root@fa1822a4accd:/Morfeusz#
root@fa1822a4accd:/Morfeusz#
root@fa1822a4accd:/Morfeusz# cd build/
bash: cd: build/: No such file or directory
root@fa1822a4accd:/Morfeusz# ls -larth
total 264K
drwxrwxr-x 4 1008 1008 4.0K Jun 13 2019 tests
-rwxrwxr-x 1 1008 1008 318 Jun 13 2019 testPythonWrapper.sh
-rwxrwxr-x 1 1008 1008 596 Jun 13 2019 testJavaWrapper.sh
-rw-rw-r-- 1 root root 2.6K Jun 13 2019 test3.py
-rw-rw-r-- 1 root root 2.6K Jun 13 2019 test2.py
-rw-rw-r-- 1 1008 1008 265 Jun 13 2019 test-darwin.sh
-rwxrwxr-x 1 1008 1008 433 Jun 13 2019 initializeGeneratorTest.sh
-rwxrwxr-x 1 1008 1008 431 Jun 13 2019 initializeAnalyzerTest.sh
-rwxrwxr-x 1 1008 1008 323 Jun 13 2019 extractTags.sh
drwxrwxr-x 2 1008 1008 4.0K Jun 13 2019 doc
-rwxrwxr-x 1 1008 1008 865 Jun 13 2019 doTest.sh
-rwxrwxr-x 1 1008 1008 168 Jun 13 2019 deploy.sh
-rwxrwxr-x 1 1008 1008 1.3K Jun 13 2019 createLibraryDeb.sh
-rwxrwxr-x 1 1008 1008 798 Jun 13 2019 createJavaDeb.sh
-rwxrwxr-x 1 1008 1008 1.8K Jun 13 2019 createGuiDeb.sh
-rwxrwxr-x 1 1008 1008 545 Jun 13 2019 createDevDeb.sh
-rwxrwxr-x 1 1008 1008 899 Jun 13 2019 createDeb.sh
-rwxrwxr-x 1 1008 1008 690 Jun 13 2019 createBinDeb.sh
-rw-rw-r-- 1 1008 1008 1.3K Jun 13 2019 cmake-2.8.12.1-patch.diff
-rw-rw-r-- 1 1008 1008 35 Jun 13 2019 License.txt
-rw-rw-r-- 1 1008 1008 359 Jun 13 2019 CPackConfig.txt
-rwxrwxr-x 1 1008 1008 725 Jun 24 2019 profile.sh
-rwxrwxr-x 1 1008 1008 665 Jun 24 2019 createDictionaryDeb.sh
-rwxrwxr-x 1 1008 1008 998 Jul 1 2019 createGUIDeb.sh
drwxrwxr-x 2 1008 1008 4.0K Feb 9 2020 input
drwxrwxr-x 2 1008 1008 4.0K Mar 10 2020 toolchains
-rwxrwxr-x 1 1008 1008 5.8K Jul 3 13:18 buildOtherDict.sh
-rwxrwxr-x 1 1008 1008 6.1K Jul 3 13:18 buildLinux.sh
-rwxrwxr-x 1 1008 1008 2.0K Jul 3 13:18 buildDict.sh
-rwxrwxr-x 1 1008 1008 12K Jul 7 15:18 buildWindows.sh
-rwxrwxr-x 1 1008 1008 12K Jul 7 15:18 buildDarwin.sh
-rw-rw-r-- 1 1008 1008 8.2K Jul 7 15:18 README
-rw-rw-r-- 1 1008 1008 9.1K Jul 7 15:18 CMakeLists.txt
drwxr-xr-x 1 root root 4.0K Sep 17 12:35 ..
-rw-r--r-- 1 root root 4.0K Sep 17 12:39 CPackConfig.cmake
-rw-r--r-- 1 root root 4.4K Sep 17 12:39 CPackSourceConfig.cmake
-rw-r--r-- 1 root root 17K Sep 17 12:39 CMakeCache.txt
-rw-r--r-- 1 root root 1.7K Sep 17 12:39 cmake_install.cmake
-rw-r--r-- 1 root root 6.8K Sep 17 12:39 CTestTestfile.cmake
-rw-r--r-- 1 root root 23K Sep 17 12:43 Makefile
drwxrwxr-x 13 1008 1008 4.0K Sep 17 12:43 morfeusz
drwxrwxr-x 5 1008 1008 4.0K Sep 17 12:43 fsabuilder
drwxrwxr-x 7 1008 1008 4.0K Sep 17 12:43 gui
drwxrwxr-x 10 1008 1008 4.0K Sep 17 12:43 .
drwxr-xr-x 5 root root 4.0K Sep 17 12:44 CMakeFiles
root@fa1822a4accd:/Morfeusz# cmake .
CMake Warning at CMakeLists.txt:30 (message):
Will build WITHOUT DICTIONARY. Set INPUT_DICTIONARIES option to build with
dictionary.
Version=1.9.16
CMake Warning at CMakeLists.txt:47 (message):
Implicitly setting EMBEDDED_DEFAULT_DICT variable to TRUE
Will use /Morfeusz/input/empty.txt as default dictionary input, /Morfeusz/input/morfeusz-sgjp.tagset as tagset and /Morfeusz/input/segmenty.dat as segmentation rules
-- Configuring done
-- Generating done
-- Build files have been written to: /Morfeusz
root@fa1822a4accd:/Morfeusz# make install
[ 61%] Built target libmorfeusz
[ 65%] Built target morfeusz_generator
[ 69%] Built target morfeusz_analyzer_old
[ 73%] Built target morfeusz_analyzer
[ 80%] Built target test_runner
[ 82%] Built target generate-java-wrapper
[ 88%] Built target libjmorfeusz
[ 90%] Building Java objects for jmorfeusz.jar
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/MorphInterpretation.java:15: error: unmappable character (0xC5) for encoding US-ASCII
* | {1,2,"zosta??","zosta??","praet:sg:m1.m2.m3:perf"}
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/MorphInterpretation.java:15: error: unmappable character (0x82) for encoding US-ASCII
* | {1,2,"zosta??","zosta??","praet:sg:m1.m2.m3:perf"}
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/MorphInterpretation.java:15: error: unmappable character (0xC4) for encoding US-ASCII
* | {1,2,"zosta??","zosta??","praet:sg:m1.m2.m3:perf"}
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/MorphInterpretation.java:15: error: unmappable character (0x87) for encoding US-ASCII
* | {1,2,"zosta??","zosta??","praet:sg:m1.m2.m3:perf"}
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/MorphInterpretation.java:17: error: unmappable character (0xC4) for encoding US-ASCII
* __| ____| __{2,3,"em","by??","aglt:sg:pri:imperf:wok"}
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/MorphInterpretation.java:17: error: unmappable character (0x87) for encoding US-ASCII
* __| ____| __{2,3,"em","by??","aglt:sg:pri:imperf:wok"}
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/MorphInterpretation.java:19: error: unmappable character (0xC5) for encoding US-ASCII
* * Ja * zosta??*em *
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/MorphInterpretation.java:19: error: unmappable character (0x82) for encoding US-ASCII
* * Ja * zosta??*em *
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/MorphInterpretation.java:25: error: unmappable character (0xC5) for encoding US-ASCII
* Note that the word 'zosta??em' got broken into 2 separate segments.
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/MorphInterpretation.java:25: error: unmappable character (0x82) for encoding US-ASCII
* Note that the word 'zosta??em' got broken into 2 separate segments.
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/_MorphInterpretation.java:22: error: unmappable character (0xC5) for encoding US-ASCII
* | {1,2,"zosta??","zosta??","praet:sg:m1.m2.m3:perf"}
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/_MorphInterpretation.java:22: error: unmappable character (0x82) for encoding US-ASCII
* | {1,2,"zosta??","zosta??","praet:sg:m1.m2.m3:perf"}
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/_MorphInterpretation.java:22: error: unmappable character (0xC4) for encoding US-ASCII
* | {1,2,"zosta??","zosta??","praet:sg:m1.m2.m3:perf"}
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/_MorphInterpretation.java:22: error: unmappable character (0x87) for encoding US-ASCII
* | {1,2,"zosta??","zosta??","praet:sg:m1.m2.m3:perf"}
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/_MorphInterpretation.java:24: error: unmappable character (0xC4) for encoding US-ASCII
* __| ____| __{2,3,"em","by??","aglt:sg:pri:imperf:wok"}
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/_MorphInterpretation.java:24: error: unmappable character (0x87) for encoding US-ASCII
* __| ____| __{2,3,"em","by??","aglt:sg:pri:imperf:wok"}
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/_MorphInterpretation.java:26: error: unmappable character (0xC5) for encoding US-ASCII
* * Ja * zosta??*em *
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/_MorphInterpretation.java:26: error: unmappable character (0x82) for encoding US-ASCII
* * Ja * zosta??*em *
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/_MorphInterpretation.java:32: error: unmappable character (0xC5) for encoding US-ASCII
* Note that the word 'zosta??em' got broken into 2 separate segments.
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/_MorphInterpretation.java:32: error: unmappable character (0x82) for encoding US-ASCII
* Note that the word 'zosta??em' got broken into 2 separate segments.
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/app/App.java:18: error: unmappable character (0xC5) for encoding US-ASCII
ResultsIterator it = morfeusz.analyseAsIterator("Ala ma kota i ????????.");
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/app/App.java:18: error: unmappable character (0xBC) for encoding US-ASCII
ResultsIterator it = morfeusz.analyseAsIterator("Ala ma kota i ????????.");
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/app/App.java:18: error: unmappable character (0xC3) for encoding US-ASCII
ResultsIterator it = morfeusz.analyseAsIterator("Ala ma kota i ????????.");
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/app/App.java:18: error: unmappable character (0xB3) for encoding US-ASCII
ResultsIterator it = morfeusz.analyseAsIterator("Ala ma kota i ????????.");
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/app/App.java:18: error: unmappable character (0xC5) for encoding US-ASCII
ResultsIterator it = morfeusz.analyseAsIterator("Ala ma kota i ????????.");
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/app/App.java:18: error: unmappable character (0x82) for encoding US-ASCII
ResultsIterator it = morfeusz.analyseAsIterator("Ala ma kota i ????????.");
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/app/App.java:18: error: unmappable character (0xC4) for encoding US-ASCII
ResultsIterator it = morfeusz.analyseAsIterator("Ala ma kota i ????????.");
^
/Morfeusz/morfeusz/wrappers/java/pl/sgjp/morfeusz/app/App.java:18: error: unmappable character (0x87) for encoding US-ASCII
ResultsIterator it = morfeusz.analyseAsIterator("Ala ma kota i ????????.");
^
28 errors
make[2]: *** [morfeusz/wrappers/java/CMakeFiles/jmorfeusz.dir/build.make:88: morfeusz/wrappers/java/CMakeFiles/jmorfeusz.dir/java_compiled_jmorfeusz] Error 1
make[1]: *** [CMakeFiles/Makefile2:618: morfeusz/wrappers/java/CMakeFiles/jmorfeusz.dir/all] Error 2
make: *** [Makefile:163: all] Error 2
root@fa1822a4accd:/Morfeusz# ldconfig -v
/sbin/ldconfig.real: Can't stat /usr/local/lib/aarch64-linux-gnu: No such file or directory
/sbin/ldconfig.real: Path `/usr/lib/aarch64-linux-gnu' given more than once
/sbin/ldconfig.real: Path `/lib/aarch64-linux-gnu' given more than once
/sbin/ldconfig.real: Path `/usr/lib/aarch64-linux-gnu' given more than once
/sbin/ldconfig.real: Path `/usr/lib' given more than once
/lib/aarch64-linux-gnu:
libXdmcp.so.6 -> libXdmcp.so.6.0.0
libyaml-0.so.2 -> libyaml-0.so.2.0.6
libsqlite3.so.0 -> libsqlite3.so.0.8.6
libctf-nobfd.so.0 -> libctf-nobfd.so.0.0.0
libnspr4.so -> libnspr4.so
libpcsclite.so.1 -> libpcsclite.so.1.0.0
libGL.so.1 -> libGL.so.1.7.0
libnghttp2.so.14 -> libnghttp2.so.14.19.0
libSM.so.6 -> libSM.so.6.0.1
librtmp.so.1 -> librtmp.so.1
libfl.so.2 -> libfl.so.2.0.0
librhash.so.0 -> librhash.so.0
libisl.so.22 -> libisl.so.22.0.1
libctf.so.0 -> libctf.so.0.0.0
libxcb.so.1 -> libxcb.so.1.1.0
libopcodes-2.34-system.so -> libopcodes-2.34-system.so
libssl3.so -> libssl3.so
libssh.so.4 -> libssh.so.4.8.4
libmagic.so.1 -> libmagic.so.1.0.0
libxcb-glx.so.0 -> libxcb-glx.so.0.0.0
libfontenc.so.1 -> libfontenc.so.1.0.0
libcups.so.2 -> libcups.so.2
libgettextlib-0.19.8.1.so -> libgettextlib-0.19.8.1.so
libsigsegv.so.2 -> libsigsegv.so.2.0.5
libpython3.8.so.1.0 -> libpython3.8.so.1.0
libhcrypto.so.4 -> libhcrypto.so.4.1.0
libICE.so.6 -> libICE.so.6.3.0
libcppunit-1.15.so.1 -> libcppunit.so
libhistory.so.8 -> libhistory.so.8.0
libcrypto.so.1.1 -> libcrypto.so.1.1
libXdamage.so.1 -> libXdamage.so.1.1.0
libjsoncpp.so.1 -> libjsoncpp.so.1.7.4
libelf.so.1 -> libelf-0.176.so
libpng16.so.16 -> libpng16.so.16.37.0
libnssutil3.so -> libnssutil3.so
libXpm.so.4 -> libXpm.so.4.11.0
libgpgme.so.11 -> libgpgme.so.11.22.1
libxshmfence.so.1 -> libxshmfence.so.1.0.0
libbfd-2.34-system.so -> libbfd-2.34-system.so
libLLVM-10.so.1 -> libLLVM-10.so.1
libdrm_nouveau.so.2 -> libdrm_nouveau.so.2.0.0
libkeyutils.so.1 -> libkeyutils.so.1.8
libgmodule-2.0.so.0 -> libgmodule-2.0.so.0.6400.3
libXt.so.6 -> libXt.so.6.0.0
libXau.so.6 -> libXau.so.6.0.0
libfreetype.so.6 -> libfreetype.so.6.17.1
libXext.so.6 -> libXext.so.6.4.0
libsasl2.so.2 -> libsasl2.so.2.0.25
libfontconfig.so.1 -> libfontconfig.so.1.12.0
libgif.so.7 -> libgif.so.7.1.0
libXv.so.1 -> libXv.so.1.0.0
libpipeline.so.1 -> libpipeline.so.1.5.2
libssl.so.1.1 -> libssl.so.1.1
libcc1.so.0 -> libcc1.so.0.0.0
libicudata.so.66 -> libicudata.so.66.1
libXxf86vm.so.1 -> libXxf86vm.so.1.0.0
libsmime3.so -> libsmime3.so
libxcb-sync.so.1 -> libxcb-sync.so.1.0.0
libasan.so.5 -> libasan.so.5.0.0
libdrm.so.2 -> libdrm.so.2.4.0
libdbus-1.so.3 -> libdbus-1.so.3.19.11
libgio-2.0.so.0 -> libgio-2.0.so.0.6400.3
liblber-2.4.so.2 -> liblber-2.4.so.2.10.12
libarchive.so.13 -> libarchive.so.13.4.0
libexpatw.so.1 -> libexpatw.so.1.6.11
libexpat.so.1 -> libexpat.so.1.6.11
libedit.so.2 -> libedit.so.2.0.63
libubsan.so.1 -> libubsan.so.1.0.0
libgdbm.so.6 -> libgdbm.so.6.0.0
libksba.so.8 -> libksba.so.8.11.6
libbsd.so.0 -> libbsd.so.0.10.0
libuv.so.1 -> libuv.so.1.0.0
libmpdec.so.2 -> libmpdec.so.2.4.2
libgssapi.so.3 -> libgssapi.so.3.0.0
libavahi-client.so.3 -> libavahi-client.so.3.2.9
libcroco-0.6.so.3 -> libcroco-0.6.so.3.0.1
libapparmor.so.1 -> libapparmor.so.1.6.1
libbrotlienc.so.1 -> libbrotlienc.so.1.0.7
libitm.so.1 -> libitm.so.1.0.0
libsensors.so.5 -> libsensors.so.5.0.0
liblsan.so.0 -> liblsan.so.0.0.0
libpsl.so.5 -> libpsl.so.5.3.2
libassuan.so.0 -> libassuan.so.0.8.3
libk5crypto.so.3 -> libk5crypto.so.3.1
libidn.so.11 -> libidn.so.11.6.16
libgssapi_krb5.so.2 -> libgssapi_krb5.so.2.2
liblcms2.so.2 -> liblcms2.so.2.0.8
libxcb-dri3.so.0 -> libxcb-dri3.so.0.0.0
libasound.so.2 -> libasound.so.2.0.0
libxkbfile.so.1 -> libxkbfile.so.1.0.2
libkrb5.so.3 -> libkrb5.so.3.3
libXi.so.6 -> libXi.so.6.1.0
libGLX_mesa.so.0 -> libGLX_mesa.so.0.0.0
libhx509.so.5 -> libhx509.so.5.0.0
libXft.so.2 -> libXft.so.2.3.3
libxcb-dri2.so.0 -> libxcb-dri2.so.0.0.0
libicutest.so.66 -> libicutest.so.66.1
libavahi-common.so.3 -> libavahi-common.so.3.5.3
libkrb5support.so.0 -> libkrb5support.so.0.1
libXinerama.so.1 -> libXinerama.so.1.0.0
libX11-xcb.so.1 -> libX11-xcb.so.1.0.0
libxcb-present.so.0 -> libxcb-present.so.0.0.0
libatomic.so.1 -> libatomic.so.1.2.0
libgthread-2.0.so.0 -> libgthread-2.0.so.0.6400.3
libicutu.so.66 -> libicutu.so.66.1
libicui18n.so.66 -> libicui18n.so.66.1
libmpc.so.3 -> libmpc.so.3.1.0
libatk-1.0.so.0 -> libatk-1.0.so.0.23510.1
libltdl.so.7 -> libltdl.so.7.3.1
libtsan.so.0 -> libtsan.so.0.0.0
libX11.so.6 -> libX11.so.6.3.0
libnpth.so.0 -> libnpth.so.0.1.2
libmpfr.so.6 -> libmpfr.so.6.0.2
libgdbm_compat.so.4 -> libgdbm_compat.so.4.0.0
libheimbase.so.1 -> libheimbase.so.1.0.0
libperl.so.5.30 -> libperl.so.5.30.0
libgomp.so.1 -> libgomp.so.1.0.0
libjpeg.so.8 -> libjpeg.so.8.2.2
libxcb-shape.so.0 -> libxcb-shape.so.0.0.0
libXxf86dga.so.1 -> libXxf86dga.so.1.0.0
libatk-bridge-2.0.so.0 -> libatk-bridge-2.0.so.0.0.0
libXfixes.so.3 -> libXfixes.so.3.1.0
libreadline.so.8 -> libreadline.so.8.0
libatspi.so.0 -> libatspi.so.0.0.1
libXaw.so.7 -> libXaw7.so.7.0.0
libroken.so.18 -> libroken.so.18.1.0
libgettextsrc-0.19.8.1.so -> libgettextsrc-0.19.8.1.so
libicuuc.so.66 -> libicuuc.so.66.1
libxml2.so.2 -> libxml2.so.2.9.10
libnss3.so -> libnss3.so
libcbor.so.0.6 -> libcbor.so.0.6.0
libgobject-2.0.so.0 -> libgobject-2.0.so.0.6400.3
libGLX.so.0 -> libGLX.so.0.0.0
libwind.so.0 -> libwind.so.0.0.0
libplds4.so -> libplds4.so
libplc4.so -> libplc4.so
libXcomposite.so.1 -> libXcomposite.so.1.0.0
libGLdispatch.so.0 -> libGLdispatch.so.0.0.0
libXtst.so.6 -> libXtst.so.6.1.0
libbrotlidec.so.1 -> libbrotlidec.so.1.0.7
libuchardet.so.0 -> libuchardet.so.0.0.6
libcurl-gnutls.so.4 -> libcurl-gnutls.so.4.6.0
libheimntlm.so.0 -> libheimntlm.so.0.1.0
libfido2.so.1 -> libfido2.so.1.3.1
libXrandr.so.2 -> libXrandr.so.2.2.0
libbrotlicommon.so.1 -> libbrotlicommon.so.1.0.7
libre2.so.5 -> libre2.so.5.0.0
libXmu.so.6 -> libXmu.so.6.2.0
libkrb5.so.26 -> libkrb5.so.26.0.0
libdrm_amdgpu.so.1 -> libdrm_amdgpu.so.1.0.0
libXrender.so.1 -> libXrender.so.1.3.0
libldap_r-2.4.so.2 -> libldap_r-2.4.so.2.10.12
libasn1.so.8 -> libasn1.so.8.0.0
libcurl.so.4 -> libcurl.so.4.6.0
libXmuu.so.1 -> libXmuu.so.1.0.0
libicuio.so.66 -> libicuio.so.66.1
libglib-2.0.so.0 -> libglib-2.0.so.0.6400.3
libdrm_radeon.so.1 -> libdrm_radeon.so.1.0.1
libglapi.so.0 -> libglapi.so.0.0.0
libncursesw.so.6 -> libncursesw.so.6.2
libidn2.so.0 -> libidn2.so.0.3.6
libz.so.1 -> libz.so.1.2.11
libutil.so.1 -> libutil-2.31.so
libffi.so.7 -> libffi.so.7.1.0
libzstd.so.1 -> libzstd.so.1.4.4
libpam_misc.so.0 -> libpam_misc.so.0.82.1
libpanelw.so.6 -> libpanelw.so.6.2
/sbin/ldconfig.real: /lib/aarch64-linux-gnu/ld-2.31.so is the dynamic linker, ignoring
ld-linux-aarch64.so.1 -> ld-2.31.so
librt.so.1 -> librt-2.31.so
libapt-private.so.0.0 -> libapt-private.so.0.0.0
libpthread.so.0 -> libpthread-2.31.so
libpcre.so.3 -> libpcre.so.3.13.3
libpamc.so.0 -> libpamc.so.0.82.1
libsystemd.so.0 -> libsystemd.so.0.28.0
libdb-5.3.so -> libdb-5.3.so
libaudit.so.1 -> libaudit.so.1.0.0
libmemusage.so -> libmemusage.so
libSegFault.so -> libSegFault.so
libgcrypt.so.20 -> libgcrypt.so.20.2.5
libprocps.so.8 -> libprocps.so.8.0.2
libresolv.so.2 -> libresolv-2.31.so
libhogweed.so.5 -> libhogweed.so.5.0
libmount.so.1 -> libmount.so.1.1.0
libnss_dns.so.2 -> libnss_dns-2.31.so
libe2p.so.2 -> libe2p.so.2.3
libfdisk.so.1 -> libfdisk.so.1.1.0
libunistring.so.2 -> libunistring.so.2.1.0
libnettle.so.7 -> libnettle.so.7.0
libselinux.so.1 -> libselinux.so.1
libtinfo.so.6 -> libtinfo.so.6.2
libstdc++.so.6 -> libstdc++.so.6.0.28
libnss_files.so.2 -> libnss_files-2.31.so
libnss_compat.so.2 -> libnss_compat-2.31.so
libmenuw.so.6 -> libmenuw.so.6.2
libgpg-error.so.0 -> libgpg-error.so.0.28.0
libbz2.so.1.0 -> libbz2.so.1.0.4
libcrypt.so.1 -> libcrypt.so.1.1.0
libuuid.so.1 -> libuuid.so.1.3.0
libss.so.2 -> libss.so.2.0
libanl.so.1 -> libanl-2.31.so
libapt-pkg.so.6.0 -> libapt-pkg.so.6.0.0
liblzma.so.5 -> liblzma.so.5.2.4
libgcc_s.so.1 -> libgcc_s.so.1
libsemanage.so.1 -> libsemanage.so.1
libpcreposix.so.3 -> libpcreposix.so.3.13.3
libmenu.so.6 -> libmenu.so.6.2
libnss_nisplus.so.2 -> libnss_nisplus-2.31.so
libcom_err.so.2 -> libcom_err.so.2.1
libpcre2-8.so.0 -> libpcre2-8.so.0.9.0
libdl.so.2 -> libdl-2.31.so
libudev.so.1 -> libudev.so.1.6.17
libncurses.so.6 -> libncurses.so.6.2
libnsl.so.1 -> libnsl-2.31.so
libformw.so.6 -> libformw.so.6.2
libpanel.so.6 -> libpanel.so.6.2
libnss_hesiod.so.2 -> libnss_hesiod-2.31.so
libcap-ng.so.0 -> libcap-ng.so.0.0.0
liblz4.so.1 -> liblz4.so.1.9.2
libBrokenLocale.so.1 -> libBrokenLocale-2.31.so
libsepol.so.1 -> libsepol.so.1
libpam.so.0 -> libpam.so.0.84.2
libattr.so.1 -> libattr.so.1.1.2448
libdebconfclient.so.0 -> libdebconfclient.so.0.0.0
libseccomp.so.2 -> libseccomp.so.2.4.3
libpcprofile.so -> libpcprofile.so
libc.so.6 -> libc-2.31.so
libtic.so.6 -> libtic.so.6.2
libgnutls.so.30 -> libgnutls.so.30.27.0
libthread_db.so.1 -> libthread_db-1.0.so
libext2fs.so.2 -> libext2fs.so.2.4
libform.so.6 -> libform.so.6.2
libacl.so.1 -> libacl.so.1.1.2253
libsmartcols.so.1 -> libsmartcols.so.1.1.0
libm.so.6 -> libm-2.31.so
libp11-kit.so.0 -> libp11-kit.so.0.3.0
libtasn1.so.6 -> libtasn1.so.6.6.0
libnss_nis.so.2 -> libnss_nis-2.31.so
libgmp.so.10 -> libgmp.so.10.4.0
libblkid.so.1 -> libblkid.so.1.1.0
/usr/lib/aarch64-linux-gnu/libfakeroot:
libfakeroot-0.so -> libfakeroot-tcp.so
/usr/local/lib:
/lib:
root@fa1822a4accd:/Morfeusz# ls /usr/local/lib/python3.8/dist-packages/
root@fa1822a4accd:/Morfeusz#
@AWSNB Thanks for the sentencepiece steps; it's resolved. Now only Morfeusz is left.
@tarunallyo
The following installs it; I didn't run it to see how it works.
wget http://download.sgjp.pl/morfeusz/20200913/morfeusz-src-20200913.tar.gz
tar -xf morfeusz-src-20200913.tar.gz
cd Morfeusz/
sudo apt install cmake zip build-essential autotools-dev python3-stdeb python3-pip python3-all-dev python3-pyparsing devscripts libcppunit-dev acl default-jdk swig python3-all-dev python3-stdeb
mkdir build
cd build
cmake ..
sudo make install
sudo ldconfig -v
sudo PYTHONPATH=/usr/local/lib/python make install-builder
We tried this on Ubuntu 20 and Ubuntu 18.
@AWSNB Thanks.
But the above installation steps are not working: I got an error during make install.
LOGS
[ 65%] Built target morfeusz_generator
[ 69%] Built target morfeusz_analyzer_old
[ 73%] Built target morfeusz_analyzer
[ 80%] Built target test_runner
[ 82%] Built target generate-java-wrapper
[ 88%] Built target libjmorfeusz
[ 90%] Building Java objects for jmorfeusz.jar
/root/Morfeusz/build/morfeusz/wrappers/java/pl/sgjp/morfeusz/MorphInterpretation.java:15: error: unmappable character (0xC5) for encoding US-ASCII
* | {1,2,"zosta??","zosta??","praet:sg:m1.m2.m3:perf"}
^
/root/Morfeusz/build/morfeusz/wrappers/java/pl/sgjp/morfeusz/MorphInterpretation.java:15: error: unmappable character (0x82) for encoding US-ASCII
* | {1,2,"zosta??","zosta??","praet:sg:m1.m2.m3:perf"}
^
/root/Morfeusz/build/morfeusz/wrappers/java/pl/sgjp/morfeusz/MorphInterpretation.java:15: error: unmappable character (0xC4) for encoding US-ASCII
* | {1,2,"zosta??","zosta??","praet:sg:m1.m2.m3:perf"}
^
/root/Morfeusz/build/morfeusz/wrappers/java/pl/sgjp/morfeusz/MorphInterpretation.java:15: error: unmappable character (0x87) for encoding US-ASCII
* | {1,2,"zosta??","zosta??","praet:sg:m1.m2.m3:perf"}
^
/root/Morfeusz/build/morfeusz/wrappers/java/pl/sgjp/morfeusz/MorphInterpretation.java:17: error: unmappable character (0xC4) for encoding US-ASCII
* __| ____| __{2,3,"em","by??","aglt:sg:pri:imperf:wok"}
^
/root/Morfeusz/build/morfeusz/wrappers/java/pl/sgjp/morfeusz/MorphInterpretation.java:17: error: unmappable character (0x87) for encoding US-ASCII
* __| ____| __{2,3,"em","by??","aglt:sg:pri:imperf:wok"}
^
/root/Morfeusz/build/morfeusz/wrappers/java/pl/sgjp/morfeusz/MorphInterpretation.java:19: error: unmappable character (0xC5) for encoding US-ASCII
* * Ja * zosta??*em *
^
/root/Morfeusz/build/morfeusz/wrappers/java/pl/sgjp/morfeusz/MorphInterpretation.java:19: error: unmappable character (0x82) for encoding US-ASCII
* * Ja * zosta??*em *
^
/root/Morfeusz/build/morfeusz/wrappers/java/pl/sgjp/morfeusz/MorphInterpretation.java:25: error: unmappable character (0xC5) for encoding US-ASCII
* Note that the word 'zosta??em' got broken into 2 separate segments.
^
/root/Morfeusz/build/morfeusz/wrappers/java/pl/sgjp/morfeusz/MorphInterpretation.java:25: error: unmappable character (0x82) for encoding US-ASCII
* Note that the word 'zosta??em' got broken into 2 separate segments.
^
/root/Morfeusz/build/morfeusz/wrappers/java/pl/sgjp/morfeusz/app/App.java:18: error: unmappable character (0xC5) for encoding US-ASCII
ResultsIterator it = morfeusz.analyseAsIterator("Ala ma kota i ????????.");
^
/root/Morfeusz/build/morfeusz/wrappers/java/pl/sgjp/morfeusz/app/App.java:18: error: unmappable character (0xBC) for encoding US-ASCII
ResultsIterator it = morfeusz.analyseAsIterator("Ala ma kota i ????????.");
^
/root/Morfeusz/build/morfeusz/wrappers/java/pl/sgjp/morfeusz/app/App.java:18: error: unmappable character (0xC3) for encoding US-ASCII
ResultsIterator it = morfeusz.analyseAsIterator("Ala ma kota i ????????.");
^
/root/Morfeusz/build/morfeusz/wrappers/java/pl/sgjp/morfeusz/app/App.java:18: error: unmappable character (0xB3) for encoding US-ASCII
ResultsIterator it = morfeusz.analyseAsIterator("Ala ma kota i ????????.");
^
/root/Morfeusz/build/morfeusz/wrappers/java/pl/sgjp/morfeusz/app/App.java:18: error: unmappable character (0xC5) for encoding US-ASCII
ResultsIterator it = morfeusz.analyseAsIterator("Ala ma kota i ????????.");
^
/root/Morfeusz/build/morfeusz/wrappers/java/pl/sgjp/morfeusz/app/App.java:18: error: unmappable character (0x82) for encoding US-ASCII
ResultsIterator it = morfeusz.analyseAsIterator("Ala ma kota i ????????.");
^
/root/Morfeusz/build/morfeusz/wrappers/java/pl/sgjp/morfeusz/app/App.java:18: error: unmappable character (0xC4) for encoding US-ASCII
ResultsIterator it = morfeusz.analyseAsIterator("Ala ma kota i ????????.");
^
/root/Morfeusz/build/morfeusz/wrappers/java/pl/sgjp/morfeusz/app/App.java:18: error: unmappable character (0x87) for encoding US-ASCII
ResultsIterator it = morfeusz.analyseAsIterator("Ala ma kota i ????????.");
^
18 errors
make[2]: *** [morfeusz/wrappers/java/CMakeFiles/jmorfeusz.dir/build.make:73: morfeusz/wrappers/java/CMakeFiles/jmorfeusz.dir/java_compiled_jmorfeusz] Error 1
make[1]: *** [CMakeFiles/Makefile2:618: morfeusz/wrappers/java/CMakeFiles/jmorfeusz.dir/all] Error 2
make: *** [Makefile:163: all] Error 2
@tarunallyo I assume you are running on Ubuntu 18 or 20, right?
The prep instructions with sudo apt install may have been cut off by the way the code is formatted (there is no line wrap), so I want to check that you captured all of these prerequisites, because it does seem that something in Java is missing:
sudo apt install cmake zip build-essential autotools-dev python3-stdeb python3-pip python3-all-dev python3-pyparsing devscripts libcppunit-dev acl default-jdk swig python3-all-dev python3-stdeb
It looks like a locale issue.
@tarunallyo Can you please run the following:
wget http://download.sgjp.pl/morfeusz/20200913/morfeusz-src-20200913.tar.gz
tar -xf morfeusz-src-20200913.tar.gz
cd Morfeusz/
sudo apt install cmake zip build-essential autotools-dev python3-stdeb python3-pip python3-all-dev python3-pyparsing devscripts libcppunit-dev acl default-jdk swig python3-all-dev python3-stdeb
export JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF8
mkdir build
cd build
cmake ..
sudo make install
sudo ldconfig -v
sudo PYTHONPATH=/usr/local/lib/python make install-builder
The export JAVA_TOOL_OPTIONS fixes the locale issue in Java. Also, if you are having issues with the last command i.e. make install-builder, please try:
sudo PYTHONPATH=`which python3` make install-builder
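The byte values in the javac errors are consistent with UTF-8 encoded Polish letters being read as US-ASCII. The sketch below reproduces them, assuming the garbled word in App.java line 18 is "żółć" (inferred from the bytes in the log, not confirmed against the source file). The function name hex_bytes is illustrative.

```python
def hex_bytes(text):
    """Return the UTF-8 bytes of `text` formatted like the javac error messages."""
    return ["0x{:02X}".format(b) for b in text.encode("utf-8")]


# "żółć" written with escapes so this file itself stays pure ASCII.
SAMPLE = "\u017c\u00f3\u0142\u0107"

if __name__ == "__main__":
    # Prints: 0xC5 0xBC 0xC3 0xB3 0xC5 0x82 0xC4 0x87
    print(" ".join(hex_bytes(SAMPLE)))
    try:
        SAMPLE.encode("utf-8").decode("ascii")
    except UnicodeDecodeError as exc:
        # Every one of these bytes is above 0x7F, so ASCII cannot represent it.
        print("ASCII decoding fails:", exc.reason)
```

The printed sequence matches the eight "unmappable character" bytes reported for App.java, which is why telling the JVM to use UTF-8 removes the errors.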
The above steps worked! Many thanks!
Is there a way I can install morfeusz2 as well? I can't find its tar.gz on http://download.sgjp.pl/morfeusz/20200913.
The morfeusz-src tarball is there, but nothing like morfeusz2-src.
As for the morfeusz2 library, the instructions provided here actually install morfeusz-builder, which is a different thing.
What we want is to install the morfeusz2 library together with its Python bindings.
For Ubuntu you can use simple apt-get commands:
sudo apt install morfeusz2
sudo apt install python3-morfeusz2
The above commands create the following files for Ubuntu 18.04 in the python's dist-packages folder:
morfeusz2.py
_morfeusz2.cpython-37m-x86_64-linux-gnu.so
After that we can use morfeusz2 from Python as follows:
>>> import morfeusz2
>>> morf = morfeusz2.Morfeusz()
>>> morf.generate("developer")
I can see separate files for the Python bindings here: http://morfeusz.sgjp.pl/download/en, but they are built for amd64 systems. Do you have any thoughts on how to resolve this?
@tarunallyo try these steps, which we are going to publish pending your confirmation.
4.3 Sentencepiece
Make sure PyTorch is installed; if not, follow the PyTorch installation steps.
On Ubuntu:
# get the source and build/install the libraries
git clone https://github.com/google/sentencepiece
cd sentencepiece
mkdir build
cd build
cmake ..
make -j 8
sudo make install
sudo ldconfig -v
# move to python directory to build the wheel
cd ../python
vi make_py_wheel.sh
# change manylinux1_${1} to manylinux2014_${1}
sudo python3 setup.py install
With the above steps, the wheel should be installed.
Important: before running any Python script or starting Python, one of the libraries needs to be preloaded.
export LD_PRELOAD=/lib/aarch64-linux-gnu/libtcmalloc_minimal.so.4:$LD_PRELOAD
python3
I am still facing the issue even after following the provided steps. It says 'pkg-config not found'. Here are a few log details:
Install the project...
-- Install configuration: ""
-- Up-to-date: /home/user/sentencepiece/python/bundled/lib/pkgconfig/sentencepiece.pc
-- Up-to-date: /home/user/sentencepiece/python/bundled/lib/libsentencepiece.a
-- Up-to-date: /home/user/sentencepiece/python/bundled/lib/libsentencepiece_train.a
-- Up-to-date: /home/user/sentencepiece/python/bundled/bin/spm_encode
-- Up-to-date: /home/user/sentencepiece/python/bundled/bin/spm_decode
-- Up-to-date: /home/user/sentencepiece/python/bundled/bin/spm_normalize
-- Up-to-date: /home/user/sentencepiece/python/bundled/bin/spm_train
-- Up-to-date: /home/user/sentencepiece/python/bundled/bin/spm_export_vocab
-- Up-to-date: /home/user/sentencepiece/python/bundled/include/sentencepiece_trainer.h
-- Up-to-date: /home/user/sentencepiece/python/bundled/include/sentencepiece_processor.h
env: ‘pkg-config’: No such file or directory
Failed to find sentencepiece pkg-config
Any other workaround?
There is a 'sentencepiece' wheel for aarch64: https://pypi.org/project/sentencepiece/#files
$ pip install sentencepiece
Collecting sentencepiece
Using cached sentencepiece-0.1.96-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.2 MB)
Installing collected packages: sentencepiece
Successfully installed sentencepiece-0.1.96
Reopen the issue if you are still having problems.
sentencepiece is throwing the below error:
The Morfeusz package is also not getting installed. Is there any documentation available for this?