F42J opened this issue 1 month ago
CPU images don't have Python installed. It's by design
Thanks for the information. If the CPU images don't have Python by design, what is the recommended way to run models like the ClinicalBERT project on a server equipped only with a CPU? Is one of the other backends also capable of using these models without requiring Python?
You need to add Python, plus the installation of your chosen backend (in your case sentencetransformers: https://github.com/mudler/LocalAI/tree/master/backend/python/sentencetransformers), to your Dockerfile. There is an install.sh that sets up a venv and installs the requirements.
Unfortunately I still cannot make it work. Once Python, uv, and the sentencetransformers backend are installed (using the backend install script), it fails because of the missing module backend_pb2. It looks like this is a dependency on another backend, but I couldn't figure out which backends I have to build additionally.
Try this:

```Dockerfile
FROM localai/localai:latest-cpu

ENV CONDA_DIR /opt/conda
RUN curl https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -o ~/miniconda.sh && \
    /bin/bash ~/miniconda.sh -b -p /opt/conda
ENV PATH $CONDA_DIR/bin:$PATH

RUN pip install grpcio-tools==1.66.0 \
    uv

RUN make -C backend/python/sentencetransformers protogen \
    && make -C backend/python/sentencetransformers
```
Also, specifying a local file doesn't seem to work; the SentenceTransformer backend receives a model_name like sentence-transformers/pytorch_model.bin and throws an error:

```
ERR Server error error="could not load model (no success): Unexpected err=OSError(\"sentence-transformers/pytorch_model.bin is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'\nIf this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=`\"), type(err)=<class 'OSError'>" ip=172.17.0.1 latency=6.441716341s method=POST status=500 url=/embeddings
```
So for now the yaml file should look like this:

```yaml
name: clinicalbert
backend: sentencetransformers
embeddings: true
parameters:
  model: medicalai/ClinicalBERT
```
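With that config in place, the model can be exercised through LocalAI's OpenAI-compatible embeddings endpoint. A minimal sketch of the request body (the localhost:8080 address is an assumption based on LocalAI's default port):

```python
import json

# Request body for the /embeddings endpoint; "model" must match the
# `name:` field of the yaml config above.
payload = {"model": "clinicalbert", "input": "person with fever"}
body = json.dumps(payload)
print(body)

# Send it with e.g.:
#   curl http://localhost:8080/embeddings \
#     -H "Content-Type: application/json" \
#     -d "$BODY"
```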
Another way would be to use the GPU images, which come with Python and the backends already installed. See also: https://localai.io/basics/container/#standard-container-images
Thanks for the help, the GPU images worked. I had assumed they would only work on systems with GPUs.
The issue pointed out by @Nyralei does occur with the local model; however, the modified yaml also doesn't seem to work for me. Is there any incompatibility with the sentencetransformers backend that causes the following message? DBG GRPC(medicalai/ClinicalBERT-127.0.0.1:36351): stderr No sentence-transformers model found with name medicalai/ClinicalBERT. Creating a new one with mean pooling.
(I'm generally not that experienced with this, so thanks a lot in advance for any help.)
DBG GRPC(medicalai/ClinicalBERT-127.0.0.1:36351): stderr No sentence-transformers model found with name medicalai/ClinicalBERT. Creating a new one with mean pooling.
This message appears when sentence-transformers starts to download the model. Is there network activity? Did it create the model directory in the models path?
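For what it's worth, the "Creating a new one with mean pooling" message means sentence-transformers is wrapping a plain transformer (such as ClinicalBERT) with a mean-pooling layer to turn per-token embeddings into a single sentence embedding, so it is expected for models that aren't packaged as sentence-transformers models. A sketch of what mean pooling computes (using numpy for illustration, not the library's actual module):

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings, ignoring padding positions.

    token_embeddings: (seq_len, dim) array of per-token vectors.
    attention_mask:   (seq_len,) array with 1 for real tokens, 0 for padding.
    """
    mask = attention_mask[:, None].astype(float)
    summed = (token_embeddings * mask).sum(axis=0)
    counts = np.clip(mask.sum(axis=0), 1e-9, None)  # guard against all-padding input
    return summed / counts

# Two real tokens plus one padding position: the pad row is ignored.
tokens = np.array([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]])
mask = np.array([1, 1, 0])
print(mean_pool(tokens, mask))  # [2. 3.]
```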
LocalAI version: latest-aio-cpu / latest-cpu (tested with both)
Environment, CPU architecture, OS, and Version: Linux Desktop-j42f 6.8.0-44-generic #44-Ubuntu SMP PREEMPT_DYNAMIC Tue Aug 13 13:35:26 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Describe the bug Requests to the embeddings endpoint cause an HTTP 500 error with the following message when used with a custom model: {"error":{"code":500,"message":"grpc service not ready","type":""}}
The error seems to be caused by a missing Python install, as indicated by the following debug output: 8:12PM DBG GRPC(pytorch_model.bin-127.0.0.1:38153): stderr /build/backend/python/sentencetransformers/../common/libbackend.sh: line 180: exec: python: not found
To Reproduce
Dockerfile:

```Dockerfile
FROM localai/localai:latest-cpu
RUN apt-get update && apt-get install -y wget && \
    wget -O /build/models/pytorch_model.bin https://huggingface.co/medicalai/ClinicalBERT/resolve/main/pytorch_model.bin
ENV DEBUG=true
COPY models/* /build/models/
```
Model configuration:

```yaml
name: clinicalbert
backend: sentencetransformers
embeddings: true
parameters:
  model: pytorch_model.bin
```
Build and launch the Docker container without any further options.
Request an embedding from the embeddings endpoint using model=clinicalbert.
Expected behavior A correct embedding should be returned.
Logs 10:16PM DBG Request received: {"model":"clinicalbert","language":"","translate":false,"n":0,"top_p":null,"top_k":null,"temperature":null,"max_tokens":null,"echo":false,"batch":0,"ignore_eos":false,"repeat_penalty":0,"repeat_last_n":0,"n_keep":0,"frequency_penalty":0,"presence_penalty":0,"tfz":null,"typical_p":null,"seed":null,"negative_prompt":"","rope_freq_base":0,"rope_freq_scale":0,"negative_prompt_scale":0,"use_fast_tokenizer":false,"clip_skip":0,"tokenizer":"","file":"","size":"","prompt":null,"instruction":"","input":"person with fever","stop":null,"messages":null,"functions":null,"function_call":null,"stream":false,"mode":0,"step":0,"grammar":"","grammar_json_functions":null,"backend":"","model_base_name":""} 10:16PM DBG guessDefaultsFromFile: not a GGUF file 10:16PM DBG Parameter Config: &{PredictionOptions:{Model:pytorch_model.bin Language: Translate:false N:0 TopP:0xc0005fe678 TopK:0xc0005fe680 Temperature:0xc0005fe688 Maxtokens:0xc0005fe6c8 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0005fe6c0 TypicalP:0xc0005fe6b8 Seed:0xc0005fe6f0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:clinicalbert F16:0xc0005fe670 Threads:0xc0005fe658 Debug:0xc0005fe8d0 Roles:map[] Embeddings:0xc0005fe4e0 Backend:sentencetransformers TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:false JoinChatMessagesByCharacter:} PromptStrings:[] InputStrings:[person with fever] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:false GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:false NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false PropOrder: SchemaType:} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[] ReplaceFunctionResults:[] 
ReplaceLLMResult:[] CaptureLLMResult:[] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0005fe6b0 MirostatTAU:0xc0005fe698 Mirostat:0xc0005fe690 NGPULayers:0xc0005fe6d0 MMap:0xc0005fe6d8 MMlock:0xc0005fe6d9 LowVRAM:0xc0005fe6d9 Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0005fe650 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: VallE:{AudioPath:}} CUDA:false DownloadFiles:[] Description: Usage:}
10:16PM INF Loading model 'pytorch_model.bin' with backend sentencetransformers
10:16PM DBG Loading model in memory from file: /build/models/pytorch_model.bin
10:16PM DBG Loading Model pytorch_model.bin with gRPC (file: /build/models/pytorch_model.bin) (backend: sentencetransformers): {backendString:sentencetransformers model:pytorch_model.bin threads:12 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc0002d7208 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh mamba:/build/backend/python/mamba/run.sh openvoice:/build/backend/python/openvoice/run.sh parler-tts:/build/backend/python/parler-tts/run.sh rerankers:/build/backend/python/rerankers/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
10:16PM DBG Loading external backend: /build/backend/python/sentencetransformers/run.sh
10:16PM DBG Loading GRPC Process: /build/backend/python/sentencetransformers/run.sh
10:16PM DBG GRPC Service for pytorch_model.bin will be running at: '127.0.0.1:40125'
10:16PM DBG GRPC Service state dir: /tmp/go-processmanager2441360389
10:16PM DBG GRPC Service Started
10:16PM DBG GRPC(pytorch_model.bin-127.0.0.1:40125): stdout Initializing libbackend for build
10:16PM DBG GRPC(pytorch_model.bin-127.0.0.1:40125): stderr /build/backend/python/sentencetransformers/../common/libbackend.sh: line 91: uv: command not found
10:16PM DBG GRPC(pytorch_model.bin-127.0.0.1:40125): stdout virtualenv created
10:16PM DBG GRPC(pytorch_model.bin-127.0.0.1:40125): stdout virtualenv activated
10:16PM DBG GRPC(pytorch_model.bin-127.0.0.1:40125): stdout activated virtualenv has been ensured
10:16PM DBG GRPC(pytorch_model.bin-127.0.0.1:40125): stderr /build/backend/python/sentencetransformers/../common/libbackend.sh: line 97: /build/backend/python/sentencetransformers/venv/bin/activate: No such file or directory
10:16PM DBG GRPC(pytorch_model.bin-127.0.0.1:40125): stderr /build/backend/python/sentencetransformers/../common/libbackend.sh: line 180: exec: python: not found
10:17PM ERR failed starting/connecting to the gRPC service error="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 127.0.0.1:40125: connect: connection refused\""
10:17PM DBG GRPC Service NOT ready
10:17PM ERR Server error error="grpc service not ready" ip=172.17.0.1 latency=40.428444674s method=POST status=500 url=/v1/embeddings
Additional context