su77ungr / CASALIOY

♾️ toolkit for air-gapped LLMs on consumer-grade hardware
Apache License 2.0
230 stars, 31 forks

Illegal Instruction when running python casalioy/startLLM.py on Mac m1 in docker container (with or without --platform linux/amd64 run param) #84

Closed rus-mihai closed 1 year ago

rus-mihai commented 1 year ago

.env

Generic

TEXT_EMBEDDINGS_MODEL=sentence-transformers/all-MiniLM-L6-v2
TEXT_EMBEDDINGS_MODEL_TYPE=HF # LlamaCpp or HF
USE_MLOCK=true

Ingestion

PERSIST_DIRECTORY=db
DOCUMENTS_DIRECTORY=source_documents
INGEST_CHUNK_SIZE=500
INGEST_CHUNK_OVERLAP=50

Generation

MODEL_TYPE=LlamaCpp # GPT4All or LlamaCpp
MODEL_PATH=eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin
MODEL_TEMP=0.8
MODEL_N_CTX=1024 # Max total size of prompt+answer
MODEL_MAX_TOKENS=256 # Max size of answer
MODEL_STOP=[STOP]
CHAIN_TYPE=betterstuff
N_RETRIEVE_DOCUMENTS=100 # How many documents to retrieve from the db
N_FORWARD_DOCUMENTS=100 # How many documents to forward to the LLM, chosen among those retrieved
N_GPU_LAYERS=4
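The comments in the Generation block imply two constraints: the answer budget has to fit inside the context window, and the number of forwarded documents cannot exceed the number retrieved. A minimal Python sketch of checking them is below. `parse_env` and `check_limits` are hypothetical helpers written for illustration and are not part of CASALIOY, and the parser is deliberately simplified (it treats everything after `#` as a comment).

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, dropping '#' comments and lines without '='."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # simplification: '#' always starts a comment
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip()
    return values


def check_limits(env: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means the limits are consistent."""
    problems: list[str] = []
    # MODEL_N_CTX covers prompt + answer, so the answer budget must be strictly smaller.
    if int(env["MODEL_MAX_TOKENS"]) >= int(env["MODEL_N_CTX"]):
        problems.append("MODEL_MAX_TOKENS leaves no room for the prompt in MODEL_N_CTX")
    # Forwarded documents are chosen among the retrieved ones.
    if int(env["N_FORWARD_DOCUMENTS"]) > int(env["N_RETRIEVE_DOCUMENTS"]):
        problems.append("N_FORWARD_DOCUMENTS exceeds N_RETRIEVE_DOCUMENTS")
    return problems
```

With the values reported here (1024/256 and 100/100) both checks pass, so the configuration itself does not look like the cause of the crash.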

Python version

Python 3.11.3

System

Debian GNU/Linux 11 (bullseye) (DOCKER container)

CASALIOY version

su77ungr/casalioy:stable

Information

Related Components

Reproduction

Steps to reproduce (on Mac m1):

docker pull su77ungr/casalioy:stable
docker run -it su77ungr/casalioy:stable /bin/bash

python casalioy/ingest.py

Downloading model sentence-transformers/all-MiniLM-L6-v2 from HF
Downloading (…)_Pooling/config.json: 100%|██████████| 190/190 [00:00<00:00, 684kB/s]
Downloading (…)55de9125/config.json: 100%|██████████| 612/612 [00:00<00:00, 3.70MB/s]
Downloading (…)125/data_config.json: 100%|██████████| 39.3k/39.3k [00:00<00:00, 4.90MB/s]
Downloading (…)ce_transformers.json: 100%|██████████| 116/116 [00:00<00:00, 626kB/s]
Downloading (…)cial_tokens_map.json: 100%|██████████| 112/112 [00:00<00:00, 641kB/s]
Downloading (…)nce_bert_config.json: 100%|██████████| 53.0/53.0 [00:00<00:00, 309kB/s]
Downloading (…)5de9125/modules.json: 100%|██████████| 349/349 [00:00<00:00, 2.15MB/s]
Downloading (…)okenizer_config.json: 100%|██████████| 350/350 [00:00<00:00, 1.64MB/s]
Downloading (…)e9125/tokenizer.json: 100%|██████████| 466k/466k [00:00<00:00, 1.31MB/s]
Downloading pytorch_model.bin: 100%|██████████| 90.9M/90.9M [00:18<00:00, 4.96MB/s]
Fetching 10 files: 100%|██████████| 10/10 [00:24<00:00, 2.41s/it]
Downloading model eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin from HF
Downloading ggml-vic7b-q5_1.bin: 100%|██████████| 5.06G/5.06G [11:31<00:00, 7.31MB/s]
Fetching 1 files: 100%|██████████| 1/1 [11:37<00:00, 697.51s/it]
The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling transformers.utils.move_cache().
0it [00:00, ?it/s]
Scanning files
Processing state_of_the_union.txt
Processing 90 chunks
Creating a new collection, size=384
Saving 90 chunks
Saved, the collection now holds 90 documents.
Processed state_of_the_union.txt
Processing sample.csv
Processing 9 chunks
Saving 9 chunks
Saved, the collection now holds 99 documents.
Processed sample.csv
Processing shor.pdf
Processing 22 chunks
Saving 22 chunks
Saved, the collection now holds 121 documents.
Processed shor.pdf
Processing Muscle Spasms Charley Horse MedlinePlus.html
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data] Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data] /root/nltk_data...
[nltk_data] Unzipping taggers/averaged_perceptron_tagger.zip.
Processing 15 chunks
Saving 15 chunks
Saved, the collection now holds 136 documents.
Processed Muscle Spasms Charley Horse MedlinePlus.html
Processing Easy_recipes.epub
Processing 31 chunks
Saving 31 chunks
Saved, the collection now holds 167 documents.
Processed Easy_recipes.epub
Processing Constantinople.docx
Processing 13 chunks
Saving 13 chunks
Saved, the collection now holds 179 documents.
Processed Constantinople.docx
Processing LLAMA Leveraging Object-Oriented Programming for Designing a Logging Framework-compressed.pdf
Processing 14 chunks
Saving 14 chunks
Saved, the collection now holds 193 documents.
Processed LLAMA Leveraging Object-Oriented Programming for Designing a Logging Framework-compressed.pdf
100.0% [==========>] 7/ 7 eta [00:00]
Done

root@6e62f96184c4:/srv/CASALIOY# python casalioy/startLLM.py
found local model dir at models/sentence-transformers/all-MiniLM-L6-v2
found local model file at models/eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin

Illegal instruction

Expected behavior

I would expect the chat to start.

su77ungr commented 1 year ago

Just merged #80. Hope this resolves it.

rus-mihai commented 1 year ago

Unfortunately it doesn't solve it. I just pulled the latest and retried, but now I have more details:

Fatal Python error: Illegal instruction

Current thread 0x00007fffff75d4c0 (most recent call first):
File "/srv/CASALIOY/.venv/lib/python3.11/site-packages/llama_cpp/llama_cpp.py", line 154 in llama_context_default_params
File "/srv/CASALIOY/.venv/lib/python3.11/site-packages/llama_cpp/llama.py", line 130 in __init__
File "/srv/CASALIOY/.venv/lib/python3.11/site-packages/langchain/llms/llamacpp.py", line 133 in validate_environment
File "/srv/CASALIOY/casalioy/startLLM.py", line 57 in __init__
File "/srv/CASALIOY/casalioy/startLLM.py", line 123 in main
File "/srv/CASALIOY/casalioy/startLLM.py", line 135 in <module>

Extension modules: grpc._cython.cygrpc, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, pydantic.typing, pydantic.errors, pydantic.version, pydantic.utils, pydantic.class_validators, pydantic.config, pydantic.color, pydantic.datetime_parse, pydantic.validators, pydantic.networks, pydantic.types, pydantic.json, pydantic.error_wrappers, pydantic.fields, pydantic.parse, pydantic.schema, pydantic.main, pydantic.dataclasses, pydantic.annotated_types, pydantic.decorator, pydantic.env_settings, pydantic.tools, pydantic, yaml._yaml, charset_normalizer.md, multidict._multidict, yarl._quoting_c, aiohttp._helpers, aiohttp._http_writer, aiohttp._http_parser, aiohttp._websocket, frozenlist._frozenlist, tornado.speedups, sqlalchemy.cyextension.collections, sqlalchemy.cyextension.immutabledict, sqlalchemy.cyextension.processors, sqlalchemy.cyextension.resultproxy, sqlalchemy.cyextension.util, greenlet._greenlet, numexpr.interpreter, torch._C, torch._C._fft, torch._C._linalg, torch._C._nested, torch._C._nn, torch._C._sparse, torch._C._special, scipy._lib._ccallback_c, numpy.linalg.lapack_lite, scipy.sparse._sparsetools, _csparsetools,
scipy.sparse._csparsetools, scipy.sparse.linalg._isolve._iterative, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg._cythonized_array_utils, scipy.linalg._flinalg, scipy.linalg._solve_toeplitz, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_lapack, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, scipy.spatial._ckdtree, scipy._lib.messagestream, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.special._ufuncs_cxx, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, scipy.spatial.transform._rotation, scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, scipy.optimize._minpack2, scipy.optimize._group_columns, scipy.optimize._trlib._trlib, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, scipy.optimize.__nnls, scipy.optimize._highs.cython.src._highs_wrapper, scipy.optimize._highs._highs_wrapper, scipy.optimize._highs.cython.src._highs_constants, scipy.optimize._highs._highs_constants, scipy.linalg._interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap, scipy.optimize._direct, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.special.cython_special, scipy.stats._stats, scipy.stats.beta_ufunc, scipy.stats._boost.beta_ufunc, scipy.stats.binom_ufunc, scipy.stats._boost.binom_ufunc, scipy.stats.nbinom_ufunc, scipy.stats._boost.nbinom_ufunc, scipy.stats.hypergeom_ufunc, 
scipy.stats._boost.hypergeom_ufunc, scipy.stats.ncf_ufunc, scipy.stats._boost.ncf_ufunc, scipy.stats.ncx2_ufunc, scipy.stats._boost.ncx2_ufunc, scipy.stats.nct_ufunc, scipy.stats._boost.nct_ufunc, scipy.stats.skewnorm_ufunc, scipy.stats._boost.skewnorm_ufunc, scipy.stats.invgauss_ufunc, scipy.stats._boost.invgauss_ufunc, scipy.interpolate._fitpack, scipy.interpolate.dfitpack, scipy.interpolate._bspl, scipy.interpolate._ppoly, scipy.interpolate.interpnd, scipy.interpolate._rbfinterp_pythran, scipy.interpolate._rgi_cython, scipy.stats._biasedurn, scipy.stats._levy_stable.levyst, scipy.stats._stats_pythran, scipy._lib._uarray._uarray, scipy.stats._statlib, scipy.stats._mvn, scipy.stats._sobol, scipy.stats._qmc_cy, scipy.stats._rcont.rcont, regex._regex, sklearn.__check_build._check_build, sklearn.utils.murmurhash, sklearn.utils._isfinite, sklearn.utils._openmp_helpers, sklearn.utils._vector_sentinel, sklearn.feature_extraction._hashing_fast, sklearn.utils._logistic_sigmoid, sklearn.utils.sparsefuncs_fast, sklearn.preprocessing._csr_polynomial_expansion, sklearn.utils._cython_blas, sklearn.svm._libsvm, sklearn.svm._liblinear, sklearn.svm._libsvm_sparse, sklearn.utils._random, sklearn.utils._seq_dataset, sklearn.utils.arrayfuncs, sklearn.utils._typedefs, sklearn.utils._readonly_array_wrapper, sklearn.metrics._dist_metrics, sklearn.metrics.cluster._expected_mutual_info_fast, sklearn.metrics._pairwise_distances_reduction._datasets_pair, sklearn.metrics._pairwise_distances_reduction._base, sklearn.metrics._pairwise_distances_reduction._middle_term_computer, sklearn.utils._heap, sklearn.utils._sorting, sklearn.metrics._pairwise_distances_reduction._argkmin, sklearn.metrics._pairwise_distances_reduction._radius_neighbors, sklearn.metrics._pairwise_fast, sklearn.linear_model._cd_fast, sklearn._loss._loss, sklearn.utils._weight_vector, sklearn.linear_model._sgd_fast, sklearn.linear_model._sag_fast, sklearn.datasets._svmlight_format_fast, scipy.io.matlab._mio_utils, 
scipy.io.matlab._streams, scipy.io.matlab._mio5_utils, sentencepiece._sentencepiece, PIL._imaging (total: 202)
Illegal instruction
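The "Fatal Python error" dump above has the shape produced by Python's standard `faulthandler` module, which prints the Python stack when the process receives a fatal signal such as SIGILL. For anyone who only sees the bare "Illegal instruction" line, a minimal sketch of turning it on is below. Whether CASALIOY enables it this way is not stated in this thread, and `enable_crash_traceback` is a hypothetical helper name.

```python
import faulthandler


def enable_crash_traceback() -> bool:
    """Ask Python to dump all thread stacks on SIGSEGV, SIGFPE, SIGABRT, SIGBUS or SIGILL.

    Returns True if the fault handler is active afterwards.
    """
    try:
        faulthandler.enable(all_threads=True)  # writes to sys.stderr by default
    except (AttributeError, RuntimeError, ValueError, OSError):
        pass  # stderr has no usable file descriptor (for example, it was replaced)
    return faulthandler.is_enabled()


if __name__ == "__main__":
    print("faulthandler active:", enable_crash_traceback())
```

The same effect is available without code changes by running `python -X faulthandler casalioy/startLLM.py` or by setting the `PYTHONFAULTHANDLER=1` environment variable.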

su77ungr commented 1 year ago

Yep, OK. I think this is caused by the slim base image.

su77ungr commented 1 year ago

Might give this a try:

https://github.com/su77ungr/CASALIOY/tree/docker-fix

hippalectryon-0 commented 1 year ago

Most likely not caused by slim, since it works fine on my end :P

rus-mihai commented 1 year ago

Most likely not caused by slim, since it works fine on my end :P

Are you also running it on an M1 (ARM) Mac?

PS: I'm building the image now. Will let you know.

rus-mihai commented 1 year ago

I actually have it working now. Not sure whether that's because of the Python base image or because I built the image locally. Thank you.

hippalectryon-0 commented 1 year ago

Update: If I pull ~su77ungr/casalioy:latest~ su77ungr/casalioy:stable, I get the same error as you do, but if I build it on my own computer it works fine. Looks like you see the same behavior. Maybe the problem is the CPU architecture detected during compilation?

I think we should reopen this
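One way to test the "architecture detected during compilation" guess is to compare what the container's CPU advertises against what a prebuilt binary might assume. The sketch below reads `/proc/cpuinfo`-style text and reports missing features. The feature list is an assumption (common x86 SIMD extensions a llama.cpp build may be compiled with), not something confirmed for the published image, and the function names are hypothetical.

```python
import platform
from pathlib import Path

# Assumption: x86 SIMD extensions a prebuilt binary may have been compiled to use.
ASSUMED_FEATURES = ("avx", "avx2", "fma", "f16c")


def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Collect CPU feature names from /proc/cpuinfo-style text.

    x86 kernels list them under "flags", ARM kernels under "Features".
    """
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in ("flags", "features"):
            flags.update(value.split())
    return flags


def missing_features(cpuinfo_text: str, required=ASSUMED_FEATURES) -> list[str]:
    """Return the required features that the CPU does not advertise."""
    available = parse_cpu_flags(cpuinfo_text)
    return [name for name in required if name not in available]


if __name__ == "__main__":
    # Architecture as seen inside the container (emulated or native).
    print("machine:", platform.machine())
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print("missing:", missing_features(cpuinfo.read_text()))
```

If the pulled image was compiled on a host with these extensions and the M1 container (native ARM or emulated amd64) does not expose them, the first call into the compiled library could fail with an illegal instruction, which would also fit the observation that a local build works.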

su77ungr commented 1 year ago

Oh, we should purge :latest; it's four days old now.

hippalectryon-0 commented 1 year ago

Sorry, I meant stable.

su77ungr commented 1 year ago

Nah, that's my fault. We should bring this up to latest. Does stable currently work apart from #83? Then we should keep this image and bring :latest up to speed.

su77ungr commented 1 year ago

Update: If I pull ~su77ungr/casalioy:latest~ su77ungr/casalioy:stable, I get the same error as you do, but if I build it on my own computer it works fine. Looks like you see the same behavior. Maybe the problem is the CPU architecture detected during compilation?

I think we should reopen this

That's so weird. I tested on several VMs and had no problem. Also, I have no idea about macOS; I've never used it in my life.

su77ungr commented 1 year ago

@hippalectryon-0 why is the docker image suddenly 750MB again?

hippalectryon-0 commented 1 year ago

You tell me :P Did you change something?

su77ungr commented 1 year ago

Let's switch to #87.