google / sentencepiece

Unsupervised text tokenizer for Neural Network-based text generation.
Apache License 2.0
10.26k stars 1.17k forks

Tips for Termux installation #950

Open Manamama opened 10 months ago

Manamama commented 10 months ago

This had to be added, when compiling it in Termux:

.../sentencepiece/CMakeLists.txt

set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -llog"): This line appends -llog to the CMake variable CMAKE_SHARED_LINKER_FLAGS, which holds the flags passed to the linker when creating shared libraries. The -llog flag tells the linker to link against liblog.so, which provides the __android_log_write function that caused the undefined reference error in the initial build attempt.
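
One way to apply that edit from the shell, as a minimal sketch. The helper name add_llog_flag is made up for illustration, and placing the line at the very top of the file follows a later comment in this thread rather than anything in the sentencepiece documentation.

```shell
#!/bin/sh
# Hypothetical helper (not part of sentencepiece): prepend the -llog linker
# flag line to a CMakeLists.txt, unless the file already contains it.
add_llog_flag() {
    file="$1"
    line='set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -llog")'
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    # -F fixed string, -x whole line, -q quiet: do nothing if already patched.
    if grep -Fxq -- "$line" "$file"; then
        return 0
    fi
    # Write the new first line plus the old content to a temp file, then swap.
    tmp="$file.tmp.$$"
    { printf '%s\n' "$line"; cat "$file"; } > "$tmp" && mv "$tmp" "$file"
}
```

Usage, assuming the repository was cloned into ./sentencepiece: add_llog_flag sentencepiece/CMakeLists.txt. Running it twice leaves a single copy of the line.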

Compile: cmake .. -DCMAKE_INSTALL_PREFIX=$PREFIX: This invokes CMake with a command-line argument that sets the CMAKE_INSTALL_PREFIX variable, which determines the directory the project will be installed into. Setting it to $PREFIX tells CMake to install into the directory named by the PREFIX environment variable, which in Termux is typically /data/data/com.termux/files/usr.
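
The configure, build and install steps could be wrapped roughly as follows. This is a sketch under assumptions: build_sentencepiece, run and DRY_RUN are illustrative names, and the -S/-B form (CMake 3.13 or newer, with --install needing 3.15 or newer) is used in place of the cd build; cmake .. form quoted above.

```shell
#!/bin/sh
# Sketch of an out-of-source build that installs into the Termux prefix.
# build_sentencepiece, run and DRY_RUN are illustrative, not project tooling.
run() {
    # Print the command; execute it only when DRY_RUN is not set to 1.
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

build_sentencepiece() {
    src="$1"    # path to the cloned repository
    # Fall back to the usual Termux prefix if PREFIX is not set.
    prefix="${PREFIX:-/data/data/com.termux/files/usr}"
    run mkdir -p "$src/build" || return 1
    run cmake -S "$src" -B "$src/build" "-DCMAKE_INSTALL_PREFIX=$prefix" || return 1
    run cmake --build "$src/build" || return 1
    run cmake --install "$src/build"
}
```

Calling DRY_RUN=1 build_sentencepiece ./sentencepiece only prints the commands, which is a cheap way to check the prefix before committing to a long build on a phone.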

After applying these fixes and successfully building the project:

cd ..
cd python/
python setup.py bdist_wheel
pip install dist/sentencepiece*.whl
pip show sentencepiece
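
Beyond pip show, a quick import check confirms that the wheel actually loads. A minimal sketch, with check_sentencepiece being a made-up helper name:

```shell
#!/bin/sh
# Illustrative check: confirm the freshly installed wheel can be imported
# by the current python and report its version.
check_sentencepiece() {
    if python -c 'import sentencepiece as spm; print("sentencepiece", spm.__version__)' 2>/dev/null; then
        return 0
    fi
    echo "sentencepiece is not importable from this python" >&2
    return 1
}
```

A non-zero return here usually means the wheel went into a different Python environment than the one being run, or that the native library failed to load.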

ghost commented 10 months ago

😍😭❤️ You're a genius, I was about to give up. Please, please, please can you explain this in more detail? I am installing this from the tar file and I cannot find the python/ directory 😢. I am also a little confused about where I have to make that edit: does it need to go at a specific line number in that file?

Manamama commented 10 months ago

@FindingMeaning : I am not a genius, by far. I run most such snags by Ms Bing (mostly in its Precise mode) or Claude AI, with my neofetch results and just ask it politely, yet insistently, and interactively.

Re your question - it requires

git clone https://github.com/google/sentencepiece
cd sentencepiece

first, obviously, then changing these tidbits ( in .../sentencepiece/CMakeLists.txt, as I wrote) that you will see, and maybe paying attention to the missing libraries that python setup ... or pip install ... logs may throw at you, and installing these missing bits (modules etc.), if any.

In short: just paste these errors into a live session with any LLM/AI and explain politely what you are after. Assume it is also a blind man feeling (your) elephant, the elephant being your system errors.

Manamama commented 10 months ago

Another tip for Termux lovers (like me)

~ $ pkg list-all | grep gperf

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

gperf/stable,now 3.1-7 aarch64 [installed,automatic]
~ $ 

or

cmake-curses-gui/stable 3.28.1 aarch64
cmake/stable,now 3.28.1 aarch64 [installed]
extra-cmake-modules/x11 5.112.0 aarch64
...
libgoogleperf-tools-dev
~ $ 

(e.g. for the MALLOC error) is what is needed to compile: the Termux equivalents thereof.

Anyhow, use that syntax to find the Termux precompiled equivalents of most anything: numpy (needed for numba), scikit, torch, etc.
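
To cut such listings down to bare package names, a small filter can help. This is a sketch only: pkg_names is a hypothetical helper, and it assumes the name/repo line layout seen in the output above.

```shell
#!/bin/sh
# Illustrative helper: reduce `pkg list-all` style lines to the package names
# matching a pattern. Reads the listing on stdin, so it also works on saved output.
pkg_names() {
    pattern="$1"
    # Lines look like "name/repo[,now] version arch [status]"; keep the part
    # before the first "/" and drop lines without one (warnings, blanks).
    awk -F/ -v pat="$pattern" 'NF > 1 && $1 ~ pat { print $1 }'
}
```

For example, pkg list-all 2>/dev/null | pkg_names cmake prints one matching package name per line, ready to pass to pkg install.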

Another trick that may (a big may) work out of the box:

~ $ time  pip install --no-dependencies   --no-binary SentencePiece SentencePiece
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: SentencePiece in /data/data/com.termux/files/usr/lib/python3.11/site-packages (0.2.0)

real    0m2.616s
user    0m1.569s
sys     0m0.577s
~ $ 

Manamama commented 10 months ago

FYI, I know it should not be here, as it is not a bug, but I wrote a quick script that runs it in Termux etc, with cute statistics.

Ms Bing explained it to me via PlantUML, as below. [Image: SentencePiece]

ghost commented 10 months ago

first, obviously, then changing these tidbits ( in .../sentencepiece/CMakeLists.txt, as I wrote) that you will see, and maybe paying attention to the missing libraries that python setup ... or pip install ... logs may throw at you, and installing these missing bits (modules etc.), if any.

Thank you very, very much 😌 I've successfully installed sentence-transformers because of you. For me you'll always be a genius and a person who helped me when I was losing hope 😭

Thank you very very very much once again ☺️❤️

xhy2008 commented 9 months ago

Which version of sentencepiece are you using? I looked through CMakeLists.txt, but I didn't find the line you mentioned. I am using the latest version, and it still produces errors.

ghost commented 9 months ago

Just add this line at the top of CMakeLists.txt: set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -llog"). If you still have problems, see this Reddit post.