mudler / LocalAI


When using the smart agent mode from the telegram-bot example, it throws an error #2639

Open · greygoo opened this issue 5 months ago

greygoo commented 5 months ago

LocalAI version:

quay.io/go-skynet/local-ai:v1.18.0-ffmpeg and localai/localai:v2.17.1-ffmpeg (both tested)

Environment, CPU architecture, OS, and Version:

NVIDIA RTX 4060 / AMD Ryzen 5700 / 32 GB RAM

Describe the bug

When using the smart agent mode from the telegram-bot example, it throws an error with both tested versions:

api-1                 | 2:47PM DBG Request received: {"model":"gpt-4","language":"","n":0,"top_p":null,"top_k":null,"temperature":0.1,"max_tokens":null,"echo":false,"batch":0,"ignore_eos":false,"repeat_penalty":0,"n_keep":0,"frequency_penalty":0,"presence_penalty":0,"tfz":null,"typical_p":null,"seed":null,"negative_prompt":"","rope_freq_base":0,"rope_freq_scale":0,"negative_prompt_scale":0,"use_fast_tokenizer":false,"clip_skip":0,"tokenizer":"","file":"","size":"","prompt":null,"instruction":"","input":null,"stop":null,"messages":[{"role":"user","content":"Transcript of AI assistant responding to user requests. Replies with the action to perform and the reasoning.\n                    The assistant replies with the action \"ingest\" when there is an url to a sitemap to ingest memories from.\nFor creating a picture, the assistant replies with \"generate_picture\" and a detailed caption, enhancing it with as much detail as possible.\nFor searching the internet with a query, the assistant replies with the action \"search_internet\" and the query to search.\nThe assistant replies with the action \"save_file\", the filename and content to save for writing a file to disk permanently. This can be used to store the result of complex actions locally.\nThe assistant replies with the action \"save_memory\" and the string to remember or store an information that thinks it is relevant permanently.\nThe assistant replies with the action \"search_memory\" for searching between its memories with a query term.\nThe assistant for solving complex tasks that involves calling more functions in sequence, replies with the action \"plan\".\nFor replying to the user, the assistant replies with the action \"reply\" and the reply to the user directly when there is nothing to do.\n\n\n    This is the user input: @Loriaai_bot plan a trip from berlin to new york\n    Decide now the function to call and give a detailed explaination\n"}],"functions":null,"function_call":null,"stream":false,"mode":0,"step":0,"grammar":"","grammar_json_functions":null,"grammar_json_name":null,"backend":"","model_base_name":""}
api-1                 | 2:47PM DBG guessDefaultsFromFile: not a GGUF file
api-1                 | 2:47PM DBG Configuration read: &{PredictionOptions:{Model:gpt-4 Language: N:0 TopP:0xc000391ff8 TopK:0xc000412010 Temperature:0xc000391f00 Maxtokens:0xc000412068 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc000412060 TypicalP:0xc000412058 Seed:0xc0004120a0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name: F16:0xc000391ff0 Threads:0xc000391fc8 Debug:0xc000412098 Roles:map[] Embeddings:false Backend: TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:false JoinChatMessagesByCharacter:<nil>} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:false GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:false NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[] ReplaceFunctionResults:[] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionName:false} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc000412050 MirostatTAU:0xc000412038 Mirostat:0xc000412030 NGPULayers:0xc000412090 MMap:0xc000412098 MMlock:0xc000412099 LowVRAM:0xc000412099 Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc000391fc0 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: VallE:{AudioPath:}} CUDA:false DownloadFiles:[] Description: Usage:}
api-1                 | 2:47PM DBG Parameters: &{PredictionOptions:{Model:gpt-4 Language: N:0 TopP:0xc000391ff8 TopK:0xc000412010 Temperature:0xc000391f00 Maxtokens:0xc000412068 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc000412060 TypicalP:0xc000412058 Seed:0xc0004120a0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name: F16:0xc000391ff0 Threads:0xc000391fc8 Debug:0xc000412098 Roles:map[] Embeddings:false Backend: TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:false JoinChatMessagesByCharacter:<nil>} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:false GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:false NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[] ReplaceFunctionResults:[] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionName:false} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc000412050 MirostatTAU:0xc000412038 Mirostat:0xc000412030 NGPULayers:0xc000412090 MMap:0xc000412098 MMlock:0xc000412099 LowVRAM:0xc000412099 Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc000391fc0 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: VallE:{AudioPath:}} CUDA:false DownloadFiles:[] Description: Usage:}
api-1                 | 2:47PM DBG Prompt (before templating): Transcript of AI assistant responding to user requests. Replies with the action to perform and the reasoning.
api-1                 |                     The assistant replies with the action "ingest" when there is an url to a sitemap to ingest memories from.
api-1                 | For creating a picture, the assistant replies with "generate_picture" and a detailed caption, enhancing it with as much detail as possible.
api-1                 | For searching the internet with a query, the assistant replies with the action "search_internet" and the query to search.
api-1                 | The assistant replies with the action "save_file", the filename and content to save for writing a file to disk permanently. This can be used to store the result of complex actions locally.
api-1                 | The assistant replies with the action "save_memory" and the string to remember or store an information that thinks it is relevant permanently.
api-1                 | The assistant replies with the action "search_memory" for searching between its memories with a query term.
api-1                 | The assistant for solving complex tasks that involves calling more functions in sequence, replies with the action "plan".
api-1                 | For replying to the user, the assistant replies with the action "reply" and the reply to the user directly when there is nothing to do.
api-1                 | 
api-1                 | 
api-1                 |     This is the user input: @Loriaai_bot plan a trip from berlin to new york
api-1                 |     Decide now the function to call and give a detailed explaination
api-1                 | 
api-1                 | 2:47PM DBG Prompt (after templating): Transcript of AI assistant responding to user requests. Replies with the action to perform and the reasoning.
api-1                 |                     The assistant replies with the action "ingest" when there is an url to a sitemap to ingest memories from.
api-1                 | For creating a picture, the assistant replies with "generate_picture" and a detailed caption, enhancing it with as much detail as possible.
api-1                 | For searching the internet with a query, the assistant replies with the action "search_internet" and the query to search.
api-1                 | The assistant replies with the action "save_file", the filename and content to save for writing a file to disk permanently. This can be used to store the result of complex actions locally.
api-1                 | The assistant replies with the action "save_memory" and the string to remember or store an information that thinks it is relevant permanently.
api-1                 | The assistant replies with the action "search_memory" for searching between its memories with a query term.
api-1                 | The assistant for solving complex tasks that involves calling more functions in sequence, replies with the action "plan".
api-1                 | For replying to the user, the assistant replies with the action "reply" and the reply to the user directly when there is nothing to do.
api-1                 | 
api-1                 | 
api-1                 |     This is the user input: @Loriaai_bot plan a trip from berlin to new york
api-1                 |     Decide now the function to call and give a detailed explaination
api-1                 | 
api-1                 | 2:47PM DBG Loading from the following backends (in order): [llama-cpp llama-ggml gpt4all llama-cpp-fallback piper rwkv whisper stablediffusion huggingface bert-embeddings /build/backend/python/sentencetransformers/run.sh /build/backend/python/transformers-musicgen/run.sh /build/backend/python/coqui/run.sh /build/backend/python/openvoice/run.sh /build/backend/python/parler-tts/run.sh /build/backend/python/transformers/run.sh /build/backend/python/vall-e-x/run.sh /build/backend/python/sentencetransformers/run.sh /build/backend/python/exllama2/run.sh /build/backend/python/autogptq/run.sh /build/backend/python/petals/run.sh /build/backend/python/exllama/run.sh /build/backend/python/vllm/run.sh /build/backend/python/rerankers/run.sh /build/backend/python/bark/run.sh /build/backend/python/diffusers/run.sh /build/backend/python/mamba/run.sh]
api-1                 | 2:47PM INF Trying to load the model 'gpt-4' with the backend '[llama-cpp llama-ggml gpt4all llama-cpp-fallback piper rwkv whisper stablediffusion huggingface bert-embeddings /build/backend/python/sentencetransformers/run.sh /build/backend/python/transformers-musicgen/run.sh /build/backend/python/coqui/run.sh /build/backend/python/openvoice/run.sh /build/backend/python/parler-tts/run.sh /build/backend/python/transformers/run.sh /build/backend/python/vall-e-x/run.sh /build/backend/python/sentencetransformers/run.sh /build/backend/python/exllama2/run.sh /build/backend/python/autogptq/run.sh /build/backend/python/petals/run.sh /build/backend/python/exllama/run.sh /build/backend/python/vllm/run.sh /build/backend/python/rerankers/run.sh /build/backend/python/bark/run.sh /build/backend/python/diffusers/run.sh /build/backend/python/mamba/run.sh]'
api-1                 | 2:47PM INF [llama-cpp] Attempting to load
api-1                 | 2:47PM INF Loading model 'gpt-4' with backend llama-cpp
api-1                 | 2:47PM DBG Loading model in memory from file: /models/gpt-4
api-1                 | 2:47PM DBG Loading Model gpt-4 with gRPC (file: /models/gpt-4) (backend: llama-cpp): {backendString:llama-cpp model:gpt-4 threads:8 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc00036b448 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh mamba:/build/backend/python/mamba/run.sh openvoice:/build/backend/python/openvoice/run.sh parler-tts:/build/backend/python/parler-tts/run.sh petals:/build/backend/python/petals/run.sh rerankers:/build/backend/python/rerankers/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
api-1                 | 2:47PM INF [llama-cpp] attempting to load with AVX2 variant
api-1                 | 2:47PM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/llama-cpp-avx2
api-1                 | 2:47PM DBG GRPC Service for gpt-4 will be running at: '127.0.0.1:41049'
api-1                 | 2:47PM DBG GRPC Service state dir: /tmp/go-processmanager141132343
api-1                 | 2:47PM DBG GRPC Service Started
api-1                 | 2:47PM DBG GRPC(gpt-4-127.0.0.1:41049): stdout Server listening on 127.0.0.1:41049
api-1                 | 2:47PM DBG GRPC Service Ready
api-1                 | 2:47PM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:<nil>} sizeCache:0 unknownFields:[] Model:gpt-4 ContextSize:512 Seed:1771207454 NBatch:512 F16Memory:false MLock:false MMap:true VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:99999999 MainGPU: TensorSplit: Threads:8 LibrarySearchPath: RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/models/gpt-4 Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 ControlNet: Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 Type: FlashAttention:false NoKVOffload:false}
api-1                 | 2:47PM DBG GRPC(gpt-4-127.0.0.1:41049): stdout {"timestamp":1719154031,"level":"ERROR","function":"load_model","line":464,"message":"unable to load model","model":"/models/gpt-4"}
api-1                 | 2:47PM DBG GRPC(gpt-4-127.0.0.1:41049): stderr llama_model_load: error loading model: llama_model_loader: failed to load model from /models/gpt-4
api-1                 | 2:47PM DBG GRPC(gpt-4-127.0.0.1:41049): stderr 
api-1                 | 2:47PM DBG GRPC(gpt-4-127.0.0.1:41049): stderr llama_load_model_from_file: failed to load model
api-1                 | 2:47PM DBG GRPC(gpt-4-127.0.0.1:41049): stderr llama_init_from_gpt_params: error: failed to load model '/models/gpt-4'
api-1                 | 2:47PM INF [llama-cpp] Fails: could not load model: rpc error: code = Canceled desc = 
api-1                 | 2:47PM INF [llama-ggml] Attempting to load
api-1                 | 2:47PM INF Loading model 'gpt-4' with backend llama-ggml
api-1                 | 2:47PM DBG Loading model in memory from file: /models/gpt-4
api-1                 | 2:47PM DBG Loading Model gpt-4 with gRPC (file: /models/gpt-4) (backend: llama-ggml): {backendString:llama-ggml model:gpt-4 threads:8 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc00036b448 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh mamba:/build/backend/python/mamba/run.sh openvoice:/build/backend/python/openvoice/run.sh parler-tts:/build/backend/python/parler-tts/run.sh petals:/build/backend/python/petals/run.sh rerankers:/build/backend/python/rerankers/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
api-1                 | 2:47PM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/llama-ggml
api-1                 | 2:47PM DBG GRPC Service for gpt-4 will be running at: '127.0.0.1:41557'
api-1                 | 2:47PM DBG GRPC Service state dir: /tmp/go-processmanager2522970581
api-1                 | 2:47PM DBG GRPC Service Started
api-1                 | 2:47PM DBG GRPC(gpt-4-127.0.0.1:41557): stderr 2024/06/23 14:47:11 gRPC Server listening at 127.0.0.1:41557
api-1                 | 2:47PM DBG GRPC Service Ready
api-1                 | 2:47PM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:<nil>} sizeCache:0 unknownFields:[] Model:gpt-4 ContextSize:512 Seed:1771207454 NBatch:512 F16Memory:false MLock:false MMap:true VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:99999999 MainGPU: TensorSplit: Threads:8 LibrarySearchPath: RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/models/gpt-4 Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 ControlNet: Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 Type: FlashAttention:false NoKVOffload:false}
api-1                 | 2:47PM DBG GRPC(gpt-4-127.0.0.1:41557): stderr create_gpt_params: loading model /models/gpt-4
api-1                 | 2:47PM DBG GRPC(gpt-4-127.0.0.1:41557): stderr error loading model: failed to open /models/gpt-4: No such file or directory
api-1                 | 2:47PM DBG GRPC(gpt-4-127.0.0.1:41557): stderr llama_load_model_from_file: failed to load model
api-1                 | 2:47PM DBG GRPC(gpt-4-127.0.0.1:41557): stderr llama_init_from_gpt_params: error: failed to load model '/models/gpt-4'
api-1                 | 2:47PM DBG GRPC(gpt-4-127.0.0.1:41557): stderr load_binding_model: error: unable to load model
api-1                 | 2:47PM INF [llama-ggml] Fails: could not load model: rpc error: code = Unknown desc = failed loading model
api-1                 | 2:47PM INF [gpt4all] Attempting to load
api-1                 | 2:47PM INF Loading model 'gpt-4' with backend gpt4all
api-1                 | 2:47PM DBG Loading model in memory from file: /models/gpt-4
api-1                 | 2:47PM DBG Loading Model gpt-4 with gRPC (file: /models/gpt-4) (backend: gpt4all): {backendString:gpt4all model:gpt-4 threads:8 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc00036b448 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh mamba:/build/backend/python/mamba/run.sh openvoice:/build/backend/python/openvoice/run.sh parler-tts:/build/backend/python/parler-tts/run.sh petals:/build/backend/python/petals/run.sh rerankers:/build/backend/python/rerankers/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
api-1                 | 2:47PM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/gpt4all
api-1                 | 2:47PM DBG GRPC Service for gpt-4 will be running at: '127.0.0.1:46055'
api-1                 | 2:47PM DBG GRPC Service state dir: /tmp/go-processmanager1176638387
api-1                 | 2:47PM DBG GRPC Service Started
api-1                 | 2:47PM DBG GRPC(gpt-4-127.0.0.1:46055): stderr 2024/06/23 14:47:14 gRPC Server listening at 127.0.0.1:46055
api-1                 | 2:47PM DBG GRPC Service Ready
api-1                 | 2:47PM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:<nil>} sizeCache:0 unknownFields:[] Model:gpt-4 ContextSize:512 Seed:1771207454 NBatch:512 F16Memory:false MLock:false MMap:true VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:99999999 MainGPU: TensorSplit: Threads:8 LibrarySearchPath:/tmp/localai/backend_data/backend-assets/gpt4all RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/models/gpt-4 Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 ControlNet: Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 Type: FlashAttention:false NoKVOffload:false}
... lines removed here, as the full log makes the comment too long ...
chatgpt_telegram_bot  | [llama-ggml]: could not load model: rpc error: code = Unknown desc = failed loading model
chatgpt_telegram_bot  | [gpt4all]: could not load model: rpc error: code = Unknown desc = failed loading model
chatgpt_telegram_bot  | [llama-cpp-fallback]: could not load model: rpc error: code = Canceled desc = 
chatgpt_telegram_bot  | [piper]: could not load model: rpc error: code = Unknown desc = unsupported model type /models/gpt-4 (should end with .onnx)
chatgpt_telegram_bot  | [rwkv]: could not load model: rpc error: code = Unavailable desc = error reading from server: EOF
chatgpt_telegram_bot  | [whisper]: could not load model: rpc error: code = Unknown desc = stat /models/gpt-4: no such file or directory
chatgpt_telegram_bot  | [stablediffusion]: could not load model: rpc error: code = Unknown desc = stat /models/gpt-4: no such file or directory
chatgpt_telegram_bot  | [huggingface]: could not load model: rpc error: code = Unknown desc = no huggingface token provided
chatgpt_telegram_bot  | [bert-embeddings]: could not load model: rpc error: code = Unknown desc = failed loading model
chatgpt_telegram_bot  | [/build/backend/python/sentencetransformers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/sentencetransformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/transformers-musicgen/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/transformers-musicgen/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/coqui/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/coqui/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/openvoice/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/openvoice/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/parler-tts/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/parler-tts/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/transformers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/transformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/vall-e-x/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/vall-e-x/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/sentencetransformers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/sentencetransformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/exllama2/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/exllama2/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/autogptq/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/autogptq/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/petals/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/petals/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/exllama/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/exllama/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/vllm/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/vllm/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/rerankers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/rerankers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/bark/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/bark/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/diffusers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/diffusers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/mamba/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/mamba/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS {"error":{"code":500,"message":"could not load model - all backends returned error: [llama-cpp]: could not load model: rpc error: code = Canceled desc = \n[llama-ggml]: could not load model: rpc error: code = Unknown desc = failed loading model\n[gpt4all]: could not load model: rpc error: code = Unknown desc = failed loading model\n[llama-cpp-fallback]: could not load model: rpc error: code = Canceled desc = \n[piper]: could not load model: rpc error: code = Unknown desc = unsupported model type /models/gpt-4 (should end with .onnx)\n[rwkv]: could not load model: rpc error: code = Unavailable desc = error reading from server: EOF\n[whisper]: could not load model: rpc error: code = Unknown desc = stat /models/gpt-4: no such file or directory\n[stablediffusion]: could not load model: rpc error: code = Unknown desc = stat /models/gpt-4: no such file or directory\n[huggingface]: could not load model: rpc error: code = Unknown desc = no huggingface token provided\n[bert-embeddings]: could not load model: rpc error: code = Unknown desc = failed loading model\n[/build/backend/python/sentencetransformers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/sentencetransformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/transformers-musicgen/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/transformers-musicgen/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/coqui/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/coqui/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/openvoice/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/openvoice/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/parler-tts/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/parler-tts/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/transformers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/transformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/vall-e-x/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/vall-e-x/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/sentencetransformers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/sentencetransformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/exllama2/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/exllama2/run.sh. 
some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/autogptq/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/autogptq/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/petals/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/petals/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/exllama/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/exllama/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/vllm/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/vllm/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/rerankers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/rerankers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/bark/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/bark/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/diffusers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/diffusers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/mamba/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/mamba/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS","type":""}} 500 {'error': {'code': 500, 'message': 'could not load model - all backends returned error: [llama-cpp]: could not load model: rpc error: code = Canceled desc = \n[llama-ggml]: could not load model: rpc error: code = Unknown desc = failed loading model\n[gpt4all]: could not load model: rpc error: code = Unknown desc = failed loading model\n[llama-cpp-fallback]: could not load model: rpc error: code = Canceled desc = \n[piper]: could not load model: rpc error: code = Unknown desc = unsupported model type /models/gpt-4 (should end with .onnx)\n[rwkv]: could not load model: rpc error: code = Unavailable desc = error reading from server: EOF\n[whisper]: could not load model: rpc error: code = Unknown desc = stat /models/gpt-4: no such file or directory\n[stablediffusion]: could not load model: rpc error: code = Unknown desc = stat /models/gpt-4: no such file or directory\n[huggingface]: could not load model: rpc error: code = Unknown desc = no huggingface token provided\n[bert-embeddings]: could not load model: rpc error: code = Unknown desc = failed loading model\n[/build/backend/python/sentencetransformers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/sentencetransformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/transformers-musicgen/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/transformers-musicgen/run.sh. 
some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/coqui/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/coqui/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/openvoice/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/openvoice/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/parler-tts/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/parler-tts/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/transformers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/transformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/vall-e-x/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/vall-e-x/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/sentencetransformers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/sentencetransformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/exllama2/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/exllama2/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/autogptq/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/autogptq/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/petals/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/petals/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/exllama/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/exllama/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/vllm/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/vllm/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/rerankers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/rerankers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/bark/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/bark/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/diffusers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/diffusers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/mamba/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/mamba/run.sh. 
some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS', 'type': ''}} {'Date': 'Sun, 23 Jun 2024 14:47:29 GMT', 'Content-Type': 'application/json', 'Content-Length': '4947'}
chatgpt_telegram_bot  | Exception while handling an update:
chatgpt_telegram_bot  | Traceback (most recent call last):
chatgpt_telegram_bot  |   File "/usr/local/lib/python3.11/site-packages/telegram/ext/_application.py", line 1104, in process_update
chatgpt_telegram_bot  |     await coroutine
chatgpt_telegram_bot  |   File "/usr/local/lib/python3.11/site-packages/telegram/ext/_handler.py", line 141, in handle_update
chatgpt_telegram_bot  |     return await self.callback(update, context)
chatgpt_telegram_bot  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
chatgpt_telegram_bot  |   File "/code/bot/bot.py", line 315, in message_handle
chatgpt_telegram_bot  |     await smart_agent_handle(update, context, message=message)
chatgpt_telegram_bot  |   File "/code/bot/bot.py", line 268, in smart_agent_handle
chatgpt_telegram_bot  |     conversation_history = localagi.evaluate(
chatgpt_telegram_bot  |                            ^^^^^^^^^^^^^^^^^^
chatgpt_telegram_bot  |   File "/usr/local/lib/python3.11/site-packages/localagi/localagi.py", line 504, in evaluate
chatgpt_telegram_bot  |     self.reasoning_callback(action["action"], action["detailed_reasoning"])
chatgpt_telegram_bot  |                                               ~~~~~~^^^^^^^^^^^^^^^^^^^^^^
chatgpt_telegram_bot  | KeyError: 'detailed_reasoning'
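
The final `KeyError` is a secondary failure: `localagi` indexes `action["detailed_reasoning"]` unconditionally, so when the completion request fails (here with the 500 above, since no backend could load the model) and the parsed action dict lacks that key, the bot crashes instead of surfacing the underlying error. A minimal sketch of a defensive guard around that call (the helper name and fallback wording are hypothetical, not from localagi):

```python
def safe_reasoning_callback(callback, action: dict) -> None:
    """Hypothetical guard around the call that raises in
    localagi/localagi.py:504 (see traceback above).

    'detailed_reasoning' can be missing when the underlying
    /chat/completions request failed, so fall back to a
    placeholder instead of raising KeyError."""
    name = action.get("action", "reply")
    reasoning = action.get("detailed_reasoning", "<no reasoning returned>")
    callback(name, reasoning)

# Example: the kind of dict a failed completion can leave behind
safe_reasoning_callback(print, {"action": "plan"})  # prints: plan <no reasoning returned>
```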

To Reproduce

Run the telegram-bot example with smart agent mode enabled and send the bot a message, e.g. @Loriaai_bot plan a trip from berlin to new york.

Expected behavior

The bot responds with an answer instead of raising an error.

Logs

Included in the description above.

Additional context
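
The first log points at the actual root cause of the 500: every backend tries to open /models/gpt-4 as a literal file (`failed to open /models/gpt-4: No such file or directory`), i.e. there is no model definition named `gpt-4` in the models directory. A minimal sketch of such a definition, assuming a local GGUF file (the GGUF file name is illustrative; the chat template matches the one visible in the working log in the next comment):

```yaml
# models/gpt-4.yaml -- maps the model name "gpt-4" requested by the
# bot to a real local GGUF file.
name: gpt-4
backend: llama-cpp
context_size: 4096
parameters:
  model: some-model.Q4_K_M.gguf   # illustrative; must exist in the models directory
template:
  chat: |
    {{.Input -}}
    <|im_start|>assistant
```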

greygoo commented 5 months ago

I got the gpt-4 model working; it now starts making a plan, but then tries to load a bunch of additional models that are still missing. Log from when the gpt-4 model itself is working:

api-1                 | 3:53PM DBG Request received: {"model":"gpt-4","language":"","n":0,"top_p":null,"top_k":null,"temperature":0.1,"max_tokens":null,"echo":false,"batch":0,"ignore_eos":false,"repeat_penalty":0,"n_keep":0,"frequency_penalty":0,"presence_penalty":0,"tfz":null,"typical_p":null,"seed":null,"negative_prompt":"","rope_freq_base":0,"rope_freq_scale":0,"negative_prompt_scale":0,"use_fast_tokenizer":false,"clip_skip":0,"tokenizer":"","file":"","size":"","prompt":null,"instruction":"","input":null,"stop":null,"messages":[{"role":"user","content":"Transcript of AI assistant responding to user requests. Replies with the action to perform and the reasoning.\n                    The assistant replies with the action \"ingest\" when there is an url to a sitemap to ingest memories from.\nFor creating a picture, the assistant replies with \"generate_picture\" and a detailed caption, enhancing it with as much detail as possible.\nFor searching the internet with a query, the assistant replies with the action \"search_internet\" and the query to search.\nThe assistant replies with the action \"save_file\", the filename and content to save for writing a file to disk permanently. This can be used to store the result of complex actions locally.\nThe assistant replies with the action \"save_memory\" and the string to remember or store an information that thinks it is relevant permanently.\nThe assistant replies with the action \"search_memory\" for searching between its memories with a query term.\nThe assistant for solving complex tasks that involves calling more functions in sequence, replies with the action \"plan\".\nFor replying to the user, the assistant replies with the action \"reply\" and the reply to the user directly when there is nothing to do.\n\n\n    This is the user input: @Loriaai_bot plan a trip from berlin to leipzig\n    Decide now the function to call and give a detailed explaination\n"}],"functions":null,"function_call":null,"stream":false,"mode":0,"step":0,"grammar":"","grammar_json_functions":null,"grammar_json_name":null,"backend":"","model_base_name":""}
api-1                 | 3:53PM DBG guessDefaultsFromFile: template already set name=gpt-4
api-1                 | 3:53PM DBG Configuration read: &{PredictionOptions:{Model:b5869d55688a529c3738cb044e92c331 Language: N:0 TopP:0xc0005017a8 TopK:0xc0005017b0 Temperature:0xc000894230 Maxtokens:0xc0005017e8 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0005017e0 TypicalP:0xc0005017d8 Seed:0xc000501800 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:gpt-4 F16:0xc0005017a0 Threads:0xc000501798 Debug:0xc000894370 Roles:map[] Embeddings:false Backend: TemplateConfig:{Chat:{{.Input -}}
api-1                 | <|im_start|>assistant
api-1                 |  ChatMessage:<|im_start|>{{if eq .RoleName "assistant"}}assistant{{else if eq .RoleName "system"}}system{{else if eq .RoleName "tool"}}tool{{else if eq .RoleName "user"}}user{{end}}
api-1                 | {{- if .FunctionCall }}
api-1                 | <tool_call>
api-1                 | {{- else if eq .RoleName "tool" }}
api-1                 | <tool_response>
api-1                 | {{- end }}
api-1                 | {{- if .Content}}
api-1                 | {{.Content }}
api-1                 | {{- end }}
api-1                 | {{- if .FunctionCall}}
api-1                 | {{toJson .FunctionCall}}
api-1                 | {{- end }}
api-1                 | {{- if .FunctionCall }}
api-1                 | </tool_call>
api-1                 | {{- else if eq .RoleName "tool" }}
api-1                 | </tool_response>
api-1                 | {{- end }}<|im_end|>
api-1                 |  Completion:{{.Input}}
api-1                 |  Edit: Functions:<|im_start|>system
api-1                 | You are a function calling AI model.
api-1                 | Here are the available tools:
api-1                 | <tools>
api-1                 | {{range .Functions}}
api-1                 | {'type': 'function', 'function': {'name': '{{.Name}}', 'description': '{{.Description}}', 'parameters': {{toJson .Parameters}} }}
api-1                 | {{end}}
api-1                 | </tools>
api-1                 | You should call the tools provided to you sequentially
api-1                 | Please use <scratchpad> XML tags to record your reasoning and planning before you call the functions as follows:
api-1                 | <scratchpad>
api-1                 | {step-by-step reasoning and plan in bullet points}
api-1                 | </scratchpad>
api-1                 | For each function call return a json object with function name and arguments within <tool_call> XML tags as follows:
api-1                 | <tool_call>
api-1                 | {"arguments": <args-dict>, "name": <function-name>}
api-1                 | </tool_call><|im_end|>
api-1                 | {{.Input -}}
api-1                 | <|im_start|>assistant UseTokenizerTemplate:false JoinChatMessagesByCharacter:<nil>} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:true GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:true NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[(?s)<tool_call>(.*?)</tool_call> (?s)<tool_call>(.*?)] ReplaceFunctionResults:[{Key:(?s)^[^{\[]* Value:} {Key:(?s)[^}\]]*$ Value:} {Key:'([^']*?)' Value:_DQUOTE_${1}_DQUOTE_} {Key:\\" Value:__TEMP_QUOTE__} {Key:' Value:'} {Key:_DQUOTE_ Value:"} {Key:__TEMP_QUOTE__ Value:"} {Key:(?s)<scratchpad>.*</scratchpad> Value:}] ReplaceLLMResult:[{Key:(?s)<scratchpad>.*</scratchpad> Value:}] CaptureLLMResult:[] FunctionName:true} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0005017d0 MirostatTAU:0xc0005017c8 Mirostat:0xc0005017c0 NGPULayers:0xc0005017f0 MMap:0xc0005016a8 MMlock:0xc0005017f9 LowVRAM:0xc0005017f9 Grammar: StopWords:[<|im_end|> <dummy32000> </tool_call> <|eot_id|> <|end_of_text|>] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0005016b0 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: VallE:{AudioPath:}} CUDA:false DownloadFiles:[] Description: Usage:}
api-1                 | 3:53PM DBG Parameters: &{PredictionOptions:{Model:b5869d55688a529c3738cb044e92c331 Language: N:0 TopP:0xc0005017a8 TopK:0xc0005017b0 Temperature:0xc000894230 Maxtokens:0xc0005017e8 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0005017e0 TypicalP:0xc0005017d8 Seed:0xc000501800 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:gpt-4 F16:0xc0005017a0 Threads:0xc000501798 Debug:0xc000894370 Roles:map[] Embeddings:false Backend: TemplateConfig:{Chat:{{.Input -}}
api-1                 | <|im_start|>assistant
api-1                 |  ChatMessage:<|im_start|>{{if eq .RoleName "assistant"}}assistant{{else if eq .RoleName "system"}}system{{else if eq .RoleName "tool"}}tool{{else if eq .RoleName "user"}}user{{end}}
api-1                 | {{- if .FunctionCall }}
api-1                 | <tool_call>
api-1                 | {{- else if eq .RoleName "tool" }}
api-1                 | <tool_response>
api-1                 | {{- end }}
api-1                 | {{- if .Content}}
api-1                 | {{.Content }}
api-1                 | {{- end }}
api-1                 | {{- if .FunctionCall}}
api-1                 | {{toJson .FunctionCall}}
api-1                 | {{- end }}
api-1                 | {{- if .FunctionCall }}
api-1                 | </tool_call>
api-1                 | {{- else if eq .RoleName "tool" }}
api-1                 | </tool_response>
api-1                 | {{- end }}<|im_end|>
api-1                 |  Completion:{{.Input}}
api-1                 |  Edit: Functions:<|im_start|>system
api-1                 | You are a function calling AI model.
api-1                 | Here are the available tools:
api-1                 | <tools>
api-1                 | {{range .Functions}}
api-1                 | {'type': 'function', 'function': {'name': '{{.Name}}', 'description': '{{.Description}}', 'parameters': {{toJson .Parameters}} }}
api-1                 | {{end}}
api-1                 | </tools>
api-1                 | You should call the tools provided to you sequentially
api-1                 | Please use <scratchpad> XML tags to record your reasoning and planning before you call the functions as follows:
api-1                 | <scratchpad>
api-1                 | {step-by-step reasoning and plan in bullet points}
api-1                 | </scratchpad>
api-1                 | For each function call return a json object with function name and arguments within <tool_call> XML tags as follows:
api-1                 | <tool_call>
api-1                 | {"arguments": <args-dict>, "name": <function-name>}
api-1                 | </tool_call><|im_end|>
api-1                 | {{.Input -}}
api-1                 | <|im_start|>assistant UseTokenizerTemplate:false JoinChatMessagesByCharacter:<nil>} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:true GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:true NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[(?s)<tool_call>(.*?)</tool_call> (?s)<tool_call>(.*?)] ReplaceFunctionResults:[{Key:(?s)^[^{\[]* Value:} {Key:(?s)[^}\]]*$ Value:} {Key:'([^']*?)' Value:_DQUOTE_${1}_DQUOTE_} {Key:\\" Value:__TEMP_QUOTE__} {Key:' Value:'} {Key:_DQUOTE_ Value:"} {Key:__TEMP_QUOTE__ Value:"} {Key:(?s)<scratchpad>.*</scratchpad> Value:}] ReplaceLLMResult:[{Key:(?s)<scratchpad>.*</scratchpad> Value:}] CaptureLLMResult:[] FunctionName:true} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0005017d0 MirostatTAU:0xc0005017c8 Mirostat:0xc0005017c0 NGPULayers:0xc0005017f0 MMap:0xc0005016a8 MMlock:0xc0005017f9 LowVRAM:0xc0005017f9 Grammar: StopWords:[<|im_end|> <dummy32000> </tool_call> <|eot_id|> <|end_of_text|>] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0005016b0 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: VallE:{AudioPath:}} CUDA:false DownloadFiles:[] Description: Usage:}
api-1                 | 3:53PM DBG templated message for chat: <|im_start|>user
api-1                 | Transcript of AI assistant responding to user requests. Replies with the action to perform and the reasoning.
api-1                 |                     The assistant replies with the action "ingest" when there is an url to a sitemap to ingest memories from.
api-1                 | For creating a picture, the assistant replies with "generate_picture" and a detailed caption, enhancing it with as much detail as possible.
api-1                 | For searching the internet with a query, the assistant replies with the action "search_internet" and the query to search.
api-1                 | The assistant replies with the action "save_file", the filename and content to save for writing a file to disk permanently. This can be used to store the result of complex actions locally.
api-1                 | The assistant replies with the action "save_memory" and the string to remember or store an information that thinks it is relevant permanently.
api-1                 | The assistant replies with the action "search_memory" for searching between its memories with a query term.
api-1                 | The assistant for solving complex tasks that involves calling more functions in sequence, replies with the action "plan".
api-1                 | For replying to the user, the assistant replies with the action "reply" and the reply to the user directly when there is nothing to do.
api-1                 | 
api-1                 | 
api-1                 |     This is the user input: @Loriaai_bot plan a trip from berlin to leipzig
api-1                 |     Decide now the function to call and give a detailed explaination
api-1                 | <|im_end|>
api-1                 | 
api-1                 | 3:53PM DBG Prompt (before templating): <|im_start|>user
api-1                 | Transcript of AI assistant responding to user requests. Replies with the action to perform and the reasoning.
api-1                 |                     The assistant replies with the action "ingest" when there is an url to a sitemap to ingest memories from.
api-1                 | For creating a picture, the assistant replies with "generate_picture" and a detailed caption, enhancing it with as much detail as possible.
api-1                 | For searching the internet with a query, the assistant replies with the action "search_internet" and the query to search.
api-1                 | The assistant replies with the action "save_file", the filename and content to save for writing a file to disk permanently. This can be used to store the result of complex actions locally.
api-1                 | The assistant replies with the action "save_memory" and the string to remember or store an information that thinks it is relevant permanently.
api-1                 | The assistant replies with the action "search_memory" for searching between its memories with a query term.
api-1                 | The assistant for solving complex tasks that involves calling more functions in sequence, replies with the action "plan".
api-1                 | For replying to the user, the assistant replies with the action "reply" and the reply to the user directly when there is nothing to do.
api-1                 | 
api-1                 | 
api-1                 |     This is the user input: @Loriaai_bot plan a trip from berlin to leipzig
api-1                 |     Decide now the function to call and give a detailed explaination
api-1                 | <|im_end|>
api-1                 | 
api-1                 | 3:53PM DBG Template found, input modified to: <|im_start|>user
api-1                 | Transcript of AI assistant responding to user requests. Replies with the action to perform and the reasoning.
api-1                 |                     The assistant replies with the action "ingest" when there is an url to a sitemap to ingest memories from.
api-1                 | For creating a picture, the assistant replies with "generate_picture" and a detailed caption, enhancing it with as much detail as possible.
api-1                 | For searching the internet with a query, the assistant replies with the action "search_internet" and the query to search.
api-1                 | The assistant replies with the action "save_file", the filename and content to save for writing a file to disk permanently. This can be used to store the result of complex actions locally.
api-1                 | The assistant replies with the action "save_memory" and the string to remember or store an information that thinks it is relevant permanently.
api-1                 | The assistant replies with the action "search_memory" for searching between its memories with a query term.
api-1                 | The assistant for solving complex tasks that involves calling more functions in sequence, replies with the action "plan".
api-1                 | For replying to the user, the assistant replies with the action "reply" and the reply to the user directly when there is nothing to do.
api-1                 | 
api-1                 | 
api-1                 |     This is the user input: @Loriaai_bot plan a trip from berlin to leipzig
api-1                 |     Decide now the function to call and give a detailed explaination
api-1                 | <|im_end|>
api-1                 | <|im_start|>assistant
api-1                 | 
api-1                 | 3:53PM DBG Prompt (after templating): <|im_start|>user
api-1                 | Transcript of AI assistant responding to user requests. Replies with the action to perform and the reasoning.
api-1                 |                     The assistant replies with the action "ingest" when there is an url to a sitemap to ingest memories from.
api-1                 | For creating a picture, the assistant replies with "generate_picture" and a detailed caption, enhancing it with as much detail as possible.
api-1                 | For searching the internet with a query, the assistant replies with the action "search_internet" and the query to search.
api-1                 | The assistant replies with the action "save_file", the filename and content to save for writing a file to disk permanently. This can be used to store the result of complex actions locally.
api-1                 | The assistant replies with the action "save_memory" and the string to remember or store an information that thinks it is relevant permanently.
api-1                 | The assistant replies with the action "search_memory" for searching between its memories with a query term.
api-1                 | The assistant for solving complex tasks that involves calling more functions in sequence, replies with the action "plan".
api-1                 | For replying to the user, the assistant replies with the action "reply" and the reply to the user directly when there is nothing to do.
api-1                 | 
api-1                 | 
api-1                 |     This is the user input: @Loriaai_bot plan a trip from berlin to leipzig
api-1                 |     Decide now the function to call and give a detailed explaination
api-1                 | <|im_end|>
api-1                 | <|im_start|>assistant
api-1                 | 
api-1                 | 3:53PM DBG Model already loaded in memory: b5869d55688a529c3738cb044e92c331
api-1                 | 3:53PM DBG Model 'b5869d55688a529c3738cb044e92c331' already loaded
api-1                 | 3:53PM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:35991): stdout {"timestamp":1719157998,"level":"INFO","function":"launch_slot_with_data","line":884,"message":"slot is processing task","slot_id":0,"task_id":360}
api-1                 | 3:53PM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:35991): stdout {"timestamp":1719157998,"level":"INFO","function":"update_slots","line":1783,"message":"kv cache rm [p0, end)","slot_id":0,"task_id":360,"p0":0}
api-1                 | 3:53PM INF Success ip=127.0.0.1 latency="30.451µs" method=GET status=200 url=/readyz
api-1                 | 3:53PM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:35991): stdout {"timestamp":1719158036,"level":"INFO","function":"print_timings","line":327,"message":"prompt eval time     =    8914.40 ms /   275 tokens (   32.42 ms per token,    30.85 tokens per second)","slot_id":0,"task_id":360,"t_prompt_processing":8914.397,"num_prompt_tokens_processed":275,"t_token":32.41598909090909,"n_tokens_second":30.848973856560345}
api-1                 | 3:53PM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:35991): stdout {"timestamp":1719158036,"level":"INFO","function":"print_timings","line":341,"message":"generation eval time =   28349.02 ms /   200 runs   (  141.75 ms per token,     7.05 tokens per second)","slot_id":0,"task_id":360,"t_token_generation":28349.018,"n_decoded":200,"t_token":141.74509,"n_tokens_second":7.054918092753688}
api-1                 | 3:53PM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:35991): stdout {"timestamp":1719158036,"level":"INFO","function":"print_timings","line":351,"message":"          total time =   37263.42 ms","slot_id":0,"task_id":360,"t_prompt_processing":8914.397,"t_token_generation":28349.018,"t_total":37263.415}
api-1                 | 3:53PM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:35991): stdout {"timestamp":1719158036,"level":"INFO","function":"update_slots","line":1594,"message":"slot released","slot_id":0,"task_id":360,"n_ctx":8192,"n_past":474,"n_system_tokens":0,"n_cache_tokens":475,"truncated":false}
api-1                 | 3:53PM DBG Response: {"created":1719157755,"object":"chat.completion","id":"02605872-266f-47ee-88bd-6489f8f7ca64","model":"gpt-4","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"To plan a trip from Berlin to Leipzig, I will follow these steps:\n\n1. Search for flights: I will use the action \"search_internet\" and the query \"flights from Berlin to Leipzig\" to find the available flight options and their prices.\n\n2. Find transportation options: I will use the action \"search_internet\" and the query \"transportation from Berlin to Leipzig\" to gather information on train, bus, and driving options.\n\n3. Research accommodations: I will use the action \"search_internet\" and the query \"hotels in Leipzig\" to find suitable accommodation options based on the traveler's preferences and budget.\n\n4. Plan an itinerary: I will use the action \"plan\" to create a detailed itinerary for the trip, including suggested activities and attractions to visit in Leipzig.\n\n5. Provide recommendations: I will reply to the user with the gathered information, including flight options, transportation details, hotel suggestions, and an itinerary for their trip from Berlin to Leipzig."}}],"usage":{"prompt_tokens":275,"completion_tokens":200,"total_tokens":475}}
api-1                 | 3:53PM INF Success ip=172.25.0.3 latency=37.269674995s method=POST status=200 url=/v1/chat/completions
chatgpt_telegram_bot  | 2024-06-23 15:53:56.144 | INFO     | localagi.localagi:evaluate:496 - ==> Critic: To plan a trip from Berlin to Leipzig, I will follow these steps:
chatgpt_telegram_bot  | 
chatgpt_telegram_bot  | 1. Search for flights: I will use the action "search_internet" and the query "flights from Berlin to Leipzig" to find the available flight options and their prices.
chatgpt_telegram_bot  | 
chatgpt_telegram_bot  | 2. Find transportation options: I will use the action "search_internet" and the query "transportation from Berlin to Leipzig" to gather information on train, bus, and driving options.
chatgpt_telegram_bot  | 
chatgpt_telegram_bot  | 3. Research accommodations: I will use the action "search_internet" and the query "hotels in Leipzig" to find suitable accommodation options based on the traveler's preferences and budget.
chatgpt_telegram_bot  | 
chatgpt_telegram_bot  | 4. Plan an itinerary: I will use the action "plan" to create a detailed itinerary for the trip, including suggested activities and attractions to visit in Leipzig.
chatgpt_telegram_bot  | 
chatgpt_telegram_bot  | 5. Provide recommendations: I will reply to the user with the gathered information, including flight options, transportation details, hotel suggestions, and an itinerary for their trip from Berlin to Leipzig.
api-1                 | 3:53PM DBG Request received: {"model":"functions","language":"","n":0,"top_p":null,"top_k":null,"temperature":0.1,"max_tokens":null,"echo":false,"batch":0,"ignore_eos":false,"repeat_penalty":0,"n_keep":0,"frequency_penalty":0,"presence_penalty":0,"tfz":null,"typical_p":null,"seed":null,"negative_prompt":"","rope_freq_base":0,"rope_freq_scale":0,"negative_prompt_scale":0,"use_fast_tokenizer":false,"clip_skip":0,"tokenizer":"","file":"","size":"","prompt":null,"instruction":"","input":null,"stop":null,"messages":[{"role":"user","content":"Transcript of AI assistant responding to user requests. Replies with the action to perform and the reasoning.\n    The assistant replies with the action \"ingest\" when there is an url to a sitemap to ingest memories from.\nFor creating a picture, the assistant replies with \"generate_picture\" and a detailed caption, enhancing it with as much detail as possible.\nFor searching the internet with a query, the assistant replies with the action \"search_internet\" and the query to search.\nThe assistant replies with the action \"save_file\", the filename and content to save for writing a file to disk permanently. This can be used to store the result of complex actions locally.\nThe assistant replies with the action \"save_memory\" and the string to remember or store an information that thinks it is relevant permanently.\nThe assistant replies with the action \"search_memory\" for searching between its memories with a query term.\nThe assistant for solving complex tasks that involves calling more functions in sequence, replies with the action \"plan\".\nFor replying to the user, the assistant replies with the action \"reply\" and the reply to the user directly when there is nothing to do.\n"},{"role":"user","content":"Request: @Loriaai_bot plan a trip from berlin to leipzig\nTo plan a trip from Berlin to Leipzig, I will follow these steps:\n\n1. Search for flights: I will use the action \"search_internet\" and the query \"flights from Berlin to Leipzig\" to find the available flight options and their prices.\n\n2. Find transportation options: I will use the action \"search_internet\" and the query \"transportation from Berlin to Leipzig\" to gather information on train, bus, and driving options.\n\n3. Research accommodations: I will use the action \"search_internet\" and the query \"hotels in Leipzig\" to find suitable accommodation options based on the traveler's preferences and budget.\n\n4. Plan an itinerary: I will use the action \"plan\" to create a detailed itinerary for the trip, including suggested activities and attractions to visit in Leipzig.\n\n5. Provide recommendations: I will reply to the user with the gathered information, including flight options, transportation details, hotel suggestions, and an itinerary for their trip from Berlin to Leipzig.\nFunction call: "}],"functions":[{"name":"intent","description":"Decide to do an action.","parameters":{"properties":{"action":{"description":"user intent","enum":["ingest","generate_picture","search_internet","save_file","save_memory","search_memory","plan","reply"],"type":"string"},"confidence":{"description":"confidence of the action","type":"number"},"detailed_reasoning":{"description":"reasoning behind the intent","type":"string"}},"required":["action"],"type":"object"}}],"function_call":{"name":"intent"},"stream":false,"mode":0,"step":0,"grammar":"","grammar_json_functions":null,"grammar_json_name":null,"backend":"","model_base_name":""}
api-1                 | 3:53PM DBG guessDefaultsFromFile: not a GGUF file
api-1                 | 3:53PM DBG Configuration read: &{PredictionOptions:{Model:functions Language: N:0 TopP:0xc000014820 TopK:0xc000014828 Temperature:0xc000014648 Maxtokens:0xc000014860 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc000014858 TypicalP:0xc000014850 Seed:0xc000014878 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name: F16:0xc0000147d8 Threads:0xc0000147d0 Debug:0xc000014870 Roles:map[] Embeddings:false Backend: TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:false JoinChatMessagesByCharacter:<nil>} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString:intent ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:false GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:false NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[] ReplaceFunctionResults:[] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionName:false} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc000014848 MirostatTAU:0xc000014840 Mirostat:0xc000014838 NGPULayers:0xc000014868 MMap:0xc000014870 MMlock:0xc000014871 LowVRAM:0xc000014871 Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0000147a8 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: VallE:{AudioPath:}} CUDA:false DownloadFiles:[] Description: Usage:}
api-1                 | 3:53PM DBG Response needs to process functions
api-1                 | 3:53PM DBG Parameters: &{PredictionOptions:{Model:functions Language: N:0 TopP:0xc000014820 TopK:0xc000014828 Temperature:0xc000014648 Maxtokens:0xc000014860 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc000014858 TypicalP:0xc000014850 Seed:0xc000014878 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name: F16:0xc0000147d8 Threads:0xc0000147d0 Debug:0xc000014870 Roles:map[] Embeddings:false Backend: TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:false JoinChatMessagesByCharacter:<nil>} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString:intent ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:false GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:false NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[] ReplaceFunctionResults:[] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionName:false} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc000014848 MirostatTAU:0xc000014840 Mirostat:0xc000014838 NGPULayers:0xc000014868 MMap:0xc000014870 MMlock:0xc000014871 LowVRAM:0xc000014871 Grammar:space ::= " "?
api-1                 | freestring ::= (
api-1                 |                         [^\x00] |
api-1                 |                         "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
api-1                 |                   )* space
api-1                 | root-0-arguments-action ::= "\"ingest\"" | "\"generate_picture\"" | "\"search_internet\"" | "\"save_file\"" | "\"save_memory\"" | "\"search_memory\"" | "\"plan\"" | "\"reply\""
api-1                 | root-0-arguments ::= "{" space "\"action\"" space ":" space root-0-arguments-action "," space "\"confidence\"" space ":" space number "," space "\"detailed_reasoning\"" space ":" space string "}" space
api-1                 | root-0-function ::= "\"intent\""
api-1                 | root ::= root-0
api-1                 | number ::= ("-"? ([0-9] | [1-9] [0-9]*)) ("." [0-9]+)? ([eE] [-+]? [0-9]+)? space
api-1                 | string ::= "\"" (
api-1                 |                         [^"\\] |
api-1                 |                         "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
api-1                 |                   )* "\"" space
api-1                 | root-0 ::= "{" space "\"arguments\"" space ":" space root-0-arguments "," space "\"function\"" space ":" space root-0-function "}" space StopWords:[] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0000147a8 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: VallE:{AudioPath:}} CUDA:false DownloadFiles:[] Description: Usage:}
api-1                 | 3:53PM DBG Prompt (before templating): Transcript of AI assistant responding to user requests. Replies with the action to perform and the reasoning.
api-1                 |     The assistant replies with the action "ingest" when there is an url to a sitemap to ingest memories from.
api-1                 | For creating a picture, the assistant replies with "generate_picture" and a detailed caption, enhancing it with as much detail as possible.
api-1                 | For searching the internet with a query, the assistant replies with the action "search_internet" and the query to search.
api-1                 | The assistant replies with the action "save_file", the filename and content to save for writing a file to disk permanently. This can be used to store the result of complex actions locally.
api-1                 | The assistant replies with the action "save_memory" and the string to remember or store an information that thinks it is relevant permanently.
api-1                 | The assistant replies with the action "search_memory" for searching between its memories with a query term.
api-1                 | The assistant for solving complex tasks that involves calling more functions in sequence, replies with the action "plan".
api-1                 | For replying to the user, the assistant replies with the action "reply" and the reply to the user directly when there is nothing to do.
api-1                 | 
api-1                 | Request: @Loriaai_bot plan a trip from berlin to leipzig
api-1                 | To plan a trip from Berlin to Leipzig, I will follow these steps:
api-1                 | 
api-1                 | 1. Search for flights: I will use the action "search_internet" and the query "flights from Berlin to Leipzig" to find the available flight options and their prices.
api-1                 | 
api-1                 | 2. Find transportation options: I will use the action "search_internet" and the query "transportation from Berlin to Leipzig" to gather information on train, bus, and driving options.
api-1                 | 
api-1                 | 3. Research accommodations: I will use the action "search_internet" and the query "hotels in Leipzig" to find suitable accommodation options based on the traveler's preferences and budget.
api-1                 | 
api-1                 | 4. Plan an itinerary: I will use the action "plan" to create a detailed itinerary for the trip, including suggested activities and attractions to visit in Leipzig.
api-1                 | 
api-1                 | 5. Provide recommendations: I will reply to the user with the gathered information, including flight options, transportation details, hotel suggestions, and an itinerary for their trip from Berlin to Leipzig.
api-1                 | Function call: 
api-1                 | 3:53PM DBG Prompt (after templating): Transcript of AI assistant responding to user requests. Replies with the action to perform and the reasoning.
api-1                 |     The assistant replies with the action "ingest" when there is an url to a sitemap to ingest memories from.
api-1                 | For creating a picture, the assistant replies with "generate_picture" and a detailed caption, enhancing it with as much detail as possible.
api-1                 | For searching the internet with a query, the assistant replies with the action "search_internet" and the query to search.
api-1                 | The assistant replies with the action "save_file", the filename and content to save for writing a file to disk permanently. This can be used to store the result of complex actions locally.
api-1                 | The assistant replies with the action "save_memory" and the string to remember or store an information that thinks it is relevant permanently.
api-1                 | The assistant replies with the action "search_memory" for searching between its memories with a query term.
api-1                 | The assistant for solving complex tasks that involves calling more functions in sequence, replies with the action "plan".
api-1                 | For replying to the user, the assistant replies with the action "reply" and the reply to the user directly when there is nothing to do.
api-1                 | 
api-1                 | Request: @Loriaai_bot plan a trip from berlin to leipzig
api-1                 | To plan a trip from Berlin to Leipzig, I will follow these steps:
api-1                 | 
api-1                 | 1. Search for flights: I will use the action "search_internet" and the query "flights from Berlin to Leipzig" to find the available flight options and their prices.
api-1                 | 
api-1                 | 2. Find transportation options: I will use the action "search_internet" and the query "transportation from Berlin to Leipzig" to gather information on train, bus, and driving options.
api-1                 | 
api-1                 | 3. Research accommodations: I will use the action "search_internet" and the query "hotels in Leipzig" to find suitable accommodation options based on the traveler's preferences and budget.
api-1                 | 
api-1                 | 4. Plan an itinerary: I will use the action "plan" to create a detailed itinerary for the trip, including suggested activities and attractions to visit in Leipzig.
api-1                 | 
api-1                 | 5. Provide recommendations: I will reply to the user with the gathered information, including flight options, transportation details, hotel suggestions, and an itinerary for their trip from Berlin to Leipzig.
api-1                 | Function call: 
api-1                 | 3:53PM DBG Grammar: space ::= " "?
api-1                 | freestring ::= (
api-1                 |                         [^\x00] |
api-1                 |                         "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
api-1                 |                   )* space
api-1                 | root-0-arguments-action ::= "\"ingest\"" | "\"generate_picture\"" | "\"search_internet\"" | "\"save_file\"" | "\"save_memory\"" | "\"search_memory\"" | "\"plan\"" | "\"reply\""
api-1                 | root-0-arguments ::= "{" space "\"action\"" space ":" space root-0-arguments-action "," space "\"confidence\"" space ":" space number "," space "\"detailed_reasoning\"" space ":" space string "}" space
api-1                 | root-0-function ::= "\"intent\""
api-1                 | root ::= root-0
api-1                 | number ::= ("-"? ([0-9] | [1-9] [0-9]*)) ("." [0-9]+)? ([eE] [-+]? [0-9]+)? space
api-1                 | string ::= "\"" (
api-1                 |                         [^"\\] |
api-1                 |                         "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
api-1                 |                   )* "\"" space
api-1                 | root-0 ::= "{" space "\"arguments\"" space ":" space root-0-arguments "," space "\"function\"" space ":" space root-0-function "}" space
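
For reference, the grammar above constrains decoding to a single JSON object of this shape (the values here are illustrative, not taken from the logs):

    {"arguments": {"action": "plan", "confidence": 0.9, "detailed_reasoning": "..."}, "function": "intent"}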
api-1                 | 3:53PM DBG Loading from the following backends (in order): [llama-cpp llama-ggml gpt4all llama-cpp-fallback stablediffusion whisper piper rwkv huggingface bert-embeddings /build/backend/python/exllama2/run.sh /build/backend/python/coqui/run.sh /build/backend/python/bark/run.sh /build/backend/python/diffusers/run.sh /build/backend/python/exllama/run.sh /build/backend/python/openvoice/run.sh /build/backend/python/transformers/run.sh /build/backend/python/sentencetransformers/run.sh /build/backend/python/autogptq/run.sh /build/backend/python/vllm/run.sh /build/backend/python/sentencetransformers/run.sh /build/backend/python/petals/run.sh /build/backend/python/mamba/run.sh /build/backend/python/rerankers/run.sh /build/backend/python/parler-tts/run.sh /build/backend/python/vall-e-x/run.sh /build/backend/python/transformers-musicgen/run.sh]
api-1                 | 3:53PM INF Trying to load the model 'functions' with the backend '[llama-cpp llama-ggml gpt4all llama-cpp-fallback stablediffusion whisper piper rwkv huggingface bert-embeddings /build/backend/python/exllama2/run.sh /build/backend/python/coqui/run.sh /build/backend/python/bark/run.sh /build/backend/python/diffusers/run.sh /build/backend/python/exllama/run.sh /build/backend/python/openvoice/run.sh /build/backend/python/transformers/run.sh /build/backend/python/sentencetransformers/run.sh /build/backend/python/autogptq/run.sh /build/backend/python/vllm/run.sh /build/backend/python/sentencetransformers/run.sh /build/backend/python/petals/run.sh /build/backend/python/mamba/run.sh /build/backend/python/rerankers/run.sh /build/backend/python/parler-tts/run.sh /build/backend/python/vall-e-x/run.sh /build/backend/python/transformers-musicgen/run.sh]'
api-1                 | 3:53PM INF [llama-cpp] Attempting to load
api-1                 | 3:53PM INF Loading model 'functions' with backend llama-cpp
api-1                 | 3:53PM DBG Loading model in memory from file: /models/functions
api-1                 | 3:53PM DBG Loading Model functions with gRPC (file: /models/functions) (backend: llama-cpp): {backendString:llama-cpp model:functions threads:8 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc0002a1d48 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh mamba:/build/backend/python/mamba/run.sh openvoice:/build/backend/python/openvoice/run.sh parler-tts:/build/backend/python/parler-tts/run.sh petals:/build/backend/python/petals/run.sh rerankers:/build/backend/python/rerankers/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
api-1                 | 3:53PM INF [llama-cpp] attempting to load with AVX2 variant
api-1                 | 3:53PM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/llama-cpp-avx2
api-1                 | 3:53PM DBG GRPC Service for functions will be running at: '127.0.0.1:41133'
api-1                 | 3:53PM DBG GRPC Service state dir: /tmp/go-processmanager2846060746
api-1                 | 3:53PM DBG GRPC Service Started
api-1                 | 3:53PM DBG GRPC(functions-127.0.0.1:41133): stdout Server listening on 127.0.0.1:41133
api-1                 | 3:53PM DBG GRPC Service Ready
api-1                 | 3:53PM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:<nil>} sizeCache:0 unknownFields:[] Model:functions ContextSize:512 Seed:1342134127 NBatch:512 F16Memory:false MLock:false MMap:true VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:99999999 MainGPU: TensorSplit: Threads:8 LibrarySearchPath: RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/models/functions Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 ControlNet: Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 Type: FlashAttention:false NoKVOffload:false}
api-1                 | 3:53PM DBG GRPC(functions-127.0.0.1:41133): stdout {"timestamp":1719158038,"level":"ERROR","function":"load_model","line":464,"message":"unable to load model","model":"/models/functions"}
api-1                 | 3:53PM DBG GRPC(functions-127.0.0.1:41133): stderr llama_model_load: error loading model: llama_model_loader: failed to load model from /models/functions
api-1                 | 3:53PM DBG GRPC(functions-127.0.0.1:41133): stderr 
api-1                 | 3:53PM DBG GRPC(functions-127.0.0.1:41133): stderr llama_load_model_from_file: failed to load model
api-1                 | 3:53PM DBG GRPC(functions-127.0.0.1:41133): stderr llama_init_from_gpt_params: error: failed to load model '/models/functions'
api-1                 | 3:53PM INF [llama-cpp] Fails: could not load model: rpc error: code = Canceled desc = 
api-1                 | 3:53PM INF [llama-ggml] Attempting to load
api-1                 | 3:53PM INF Loading model 'functions' with backend llama-ggml
api-1                 | 3:53PM DBG Loading model in memory from file: /models/functions
api-1                 | 3:53PM DBG Loading Model functions with gRPC (file: /models/functions) (backend: llama-ggml): {backendString:llama-ggml model:functions threads:8 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc0002a1d48 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh mamba:/build/backend/python/mamba/run.sh openvoice:/build/backend/python/openvoice/run.sh parler-tts:/build/backend/python/parler-tts/run.sh petals:/build/backend/python/petals/run.sh rerankers:/build/backend/python/rerankers/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
api-1                 | 3:53PM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/llama-ggml
api-1                 | 3:53PM DBG GRPC Service for functions will be running at: '127.0.0.1:45783'
api-1                 | 3:53PM DBG GRPC Service state dir: /tmp/go-processmanager3524024141
api-1                 | 3:53PM DBG GRPC Service Started
api-1                 | 3:53PM DBG GRPC(functions-127.0.0.1:45783): stderr 2024/06/23 15:53:58 gRPC Server listening at 127.0.0.1:45783
api-1                 | 3:54PM DBG GRPC Service Ready
api-1                 | 3:54PM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:<nil>} sizeCache:0 unknownFields:[] Model:functions ContextSize:512 Seed:1342134127 NBatch:512 F16Memory:false MLock:false MMap:true VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:99999999 MainGPU: TensorSplit: Threads:8 LibrarySearchPath: RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/models/functions Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 ControlNet: Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 Type: FlashAttention:false NoKVOffload:false}
api-1                 | 3:54PM DBG GRPC(functions-127.0.0.1:45783): stderr create_gpt_params: loading model /models/functions
api-1                 | 3:54PM DBG GRPC(functions-127.0.0.1:45783): stderr error loading model: failed to open /models/functions: No such file or directory
api-1                 | 3:54PM DBG GRPC(functions-127.0.0.1:45783): stderr llama_load_model_from_file: failed to load model
api-1                 | 3:54PM DBG GRPC(functions-127.0.0.1:45783): stderr llama_init_from_gpt_params: error: failed to load model '/models/functions'
api-1                 | 3:54PM DBG GRPC(functions-127.0.0.1:45783): stderr load_binding_model: error: unable to load model
api-1                 | 3:54PM INF [llama-ggml] Fails: could not load model: rpc error: code = Unknown desc = failed loading model
api-1                 | 3:54PM INF [gpt4all] Attempting to load
api-1                 | 3:54PM INF Loading model 'functions' with backend gpt4all
api-1                 | 3:54PM DBG Loading model in memory from file: /models/functions
api-1                 | 3:54PM DBG Loading Model functions with gRPC (file: /models/functions) (backend: gpt4all): {backendString:gpt4all model:functions threads:8 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc0002a1d48 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh mamba:/build/backend/python/mamba/run.sh openvoice:/build/backend/python/openvoice/run.sh parler-tts:/build/backend/python/parler-tts/run.sh petals:/build/backend/python/petals/run.sh rerankers:/build/backend/python/rerankers/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
api-1                 | 3:54PM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/gpt4all
api-1                 | 3:54PM DBG GRPC Service for functions will be running at: '127.0.0.1:44461'
api-1                 | 3:54PM DBG GRPC Service state dir: /tmp/go-processmanager2854971995
api-1                 | 3:54PM DBG GRPC Service Started
api-1                 | 3:54PM DBG GRPC(functions-127.0.0.1:44461): stderr 2024/06/23 15:54:00 gRPC Server listening at 127.0.0.1:44461
api-1                 | 3:54PM DBG GRPC Service Ready
api-1                 | 3:54PM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:<nil>} sizeCache:0 unknownFields:[] Model:functions ContextSize:512 Seed:1342134127 NBatch:512 F16Memory:false MLock:false MMap:true VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:99999999 MainGPU: TensorSplit: Threads:8 LibrarySearchPath:/tmp/localai/backend_data/backend-assets/gpt4all RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/models/functions Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 ControlNet: Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 Type: FlashAttention:false NoKVOffload:false}
api-1                 | 3:54PM DBG GRPC(functions-127.0.0.1:44461): stderr load_model: error 'No such file or directory'
...
api-1                 | 3:54PM ERR Server error error="could not load model - all backends returned error: [llama-cpp]: could not load model: rpc error: code = Canceled desc = \n[llama-ggml]: could not load model: rpc error: code = Unknown desc = failed loading model\n[gpt4all]: could not load model: rpc error: code = Unknown desc = failed loading model\n[llama-cpp-fallback]: could not load model: rpc error: code = Canceled desc = \n[stablediffusion]: could not load model: rpc error: code = Unknown desc = stat /models/functions: no such file or directory\n[whisper]: could not load model: rpc error: code = Unknown desc = stat /models/functions: no such file or directory\n[piper]: could not load model: rpc error: code = Unknown desc = unsupported model type /models/functions (should end with .onnx)\n[rwkv]: could not load model: rpc error: code = Unavailable desc = error reading from server: EOF\n[huggingface]: could not load model: rpc error: code = Unknown desc = no huggingface token provided\n[bert-embeddings]: could not load model: rpc error: code = Unknown desc = failed loading model\n[/build/backend/python/exllama2/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/exllama2/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/coqui/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/coqui/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/bark/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/bark/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/diffusers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/diffusers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/exllama/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/exllama/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/openvoice/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/openvoice/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/transformers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/transformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/sentencetransformers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/sentencetransformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/autogptq/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/autogptq/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/vllm/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/vllm/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/sentencetransformers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/sentencetransformers/run.sh. 
some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/petals/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/petals/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/mamba/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/mamba/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/rerankers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/rerankers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/parler-tts/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/parler-tts/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/vall-e-x/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/vall-e-x/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/transformers-musicgen/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/transformers-musicgen/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS" ip=172.25.0.3 latency=20.104160287s method=POST status=500 url=/v1/chat/completions
chatgpt_telegram_bot  | 2024-06-23 15:54:16.251 | ERROR    | localagi.localagi:evaluate:499 - ==> error: 
chatgpt_telegram_bot  | 2024-06-23 15:54:16.251 | ERROR    | localagi.localagi:evaluate:500 - could not load model - all backends returned error: [llama-cpp]: could not load model: rpc error: code = Canceled desc = 
chatgpt_telegram_bot  | [llama-ggml]: could not load model: rpc error: code = Unknown desc = failed loading model
chatgpt_telegram_bot  | [gpt4all]: could not load model: rpc error: code = Unknown desc = failed loading model
chatgpt_telegram_bot  | [llama-cpp-fallback]: could not load model: rpc error: code = Canceled desc = 
chatgpt_telegram_bot  | [stablediffusion]: could not load model: rpc error: code = Unknown desc = stat /models/functions: no such file or directory
chatgpt_telegram_bot  | [whisper]: could not load model: rpc error: code = Unknown desc = stat /models/functions: no such file or directory
chatgpt_telegram_bot  | [piper]: could not load model: rpc error: code = Unknown desc = unsupported model type /models/functions (should end with .onnx)
chatgpt_telegram_bot  | [rwkv]: could not load model: rpc error: code = Unavailable desc = error reading from server: EOF
chatgpt_telegram_bot  | [huggingface]: could not load model: rpc error: code = Unknown desc = no huggingface token provided
chatgpt_telegram_bot  | [bert-embeddings]: could not load model: rpc error: code = Unknown desc = failed loading model
chatgpt_telegram_bot  | [/build/backend/python/exllama2/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/exllama2/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/coqui/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/coqui/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/bark/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/bark/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/diffusers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/diffusers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/exllama/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/exllama/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/openvoice/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/openvoice/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/transformers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/transformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/sentencetransformers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/sentencetransformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/autogptq/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/autogptq/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/vllm/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/vllm/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/sentencetransformers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/sentencetransformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/petals/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/petals/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/mamba/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/mamba/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/rerankers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/rerankers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/parler-tts/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/parler-tts/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/vall-e-x/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/vall-e-x/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS
chatgpt_telegram_bot  | [/build/backend/python/transformers-musicgen/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/transformers-musicgen/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS {"error":{"code":500,"message":"could not load model - all backends returned error: [llama-cpp]: could not load model: rpc error: code = Canceled desc = \n[llama-ggml]: could not load model: rpc error: code = Unknown desc = failed loading model\n[gpt4all]: could not load model: rpc error: code = Unknown desc = failed loading model\n[llama-cpp-fallback]: could not load model: rpc error: code = Canceled desc = \n[stablediffusion]: could not load model: rpc error: code = Unknown desc = stat /models/functions: no such file or directory\n[whisper]: could not load model: rpc error: code = Unknown desc = stat /models/functions: no such file or directory\n[piper]: could not load model: rpc error: code = Unknown desc = unsupported model type /models/functions (should end with .onnx)\n[rwkv]: could not load model: rpc error: code = Unavailable desc = error reading from server: EOF\n[huggingface]: could not load model: rpc error: code = Unknown desc = no huggingface token provided\n[bert-embeddings]: could not load model: rpc error: code = Unknown desc = failed loading model\n[/build/backend/python/exllama2/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/exllama2/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/coqui/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/coqui/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/bark/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/bark/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/diffusers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/diffusers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/exllama/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/exllama/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/openvoice/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/openvoice/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/transformers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/transformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/sentencetransformers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/sentencetransformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/autogptq/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/autogptq/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/vllm/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/vllm/run.sh. 
some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/sentencetransformers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/sentencetransformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/petals/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/petals/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/mamba/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/mamba/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/rerankers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/rerankers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/parler-tts/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/parler-tts/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/vall-e-x/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/vall-e-x/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/transformers-musicgen/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/transformers-musicgen/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS","type":""}} 500 {'error': {'code': 500, 'message': 'could not load model - all backends returned error: [llama-cpp]: could not load model: rpc error: code = Canceled desc = \n[llama-ggml]: could not load model: rpc error: code = Unknown desc = failed loading model\n[gpt4all]: could not load model: rpc error: code = Unknown desc = failed loading model\n[llama-cpp-fallback]: could not load model: rpc error: code = Canceled desc = \n[stablediffusion]: could not load model: rpc error: code = Unknown desc = stat /models/functions: no such file or directory\n[whisper]: could not load model: rpc error: code = Unknown desc = stat /models/functions: no such file or directory\n[piper]: could not load model: rpc error: code = Unknown desc = unsupported model type /models/functions (should end with .onnx)\n[rwkv]: could not load model: rpc error: code = Unavailable desc = error reading from server: EOF\n[huggingface]: could not load model: rpc error: code = Unknown desc = no huggingface token provided\n[bert-embeddings]: could not load model: rpc error: code = Unknown desc = failed loading model\n[/build/backend/python/exllama2/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/exllama2/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/coqui/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/coqui/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/bark/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/bark/run.sh. 
some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/diffusers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/diffusers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/exllama/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/exllama/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/openvoice/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/openvoice/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/transformers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/transformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/sentencetransformers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/sentencetransformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/autogptq/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/autogptq/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/vllm/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/vllm/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/sentencetransformers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/sentencetransformers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/petals/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/petals/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/mamba/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/mamba/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/rerankers/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/rerankers/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/parler-tts/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/parler-tts/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/vall-e-x/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/vall-e-x/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS\n[/build/backend/python/transformers-musicgen/run.sh]: grpc process not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/transformers-musicgen/run.sh. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS', 'type': ''}} {'Date': 'Sun, 23 Jun 2024 15:54:15 GMT', 'Content-Type': 'application/json', 'Content-Length': '4959'}
chatgpt_telegram_bot  | Exception while handling an update:
chatgpt_telegram_bot  | Traceback (most recent call last):
chatgpt_telegram_bot  |   File "/usr/local/lib/python3.11/site-packages/telegram/ext/_application.py", line 1104, in process_update
chatgpt_telegram_bot  |     await coroutine
chatgpt_telegram_bot  |   File "/usr/local/lib/python3.11/site-packages/telegram/ext/_handler.py", line 141, in handle_update
chatgpt_telegram_bot  |     return await self.callback(update, context)
chatgpt_telegram_bot  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
chatgpt_telegram_bot  |   File "/code/bot/bot.py", line 315, in message_handle
chatgpt_telegram_bot  |     await smart_agent_handle(update, context, message=message)
chatgpt_telegram_bot  |   File "/code/bot/bot.py", line 268, in smart_agent_handle
chatgpt_telegram_bot  |     conversation_history = localagi.evaluate(
chatgpt_telegram_bot  |                            ^^^^^^^^^^^^^^^^^^
chatgpt_telegram_bot  |   File "/usr/local/lib/python3.11/site-packages/localagi/localagi.py", line 504, in evaluate
chatgpt_telegram_bot  |     self.reasoning_callback(action["action"], action["detailed_reasoning"])
chatgpt_telegram_bot  |                                               ~~~~~~^^^^^^^^^^^^^^^^^^^^^^
chatgpt_telegram_bot  | KeyError: 'detailed_reasoning'
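
The KeyError looks like a secondary symptom: the /v1/chat/completions request for the 'functions' model returns a 500, so localagi never receives the grammar-constrained JSON and the parsed action dict lacks the 'detailed_reasoning' key. A defensive sketch of the failing call site (line number from the traceback above; this is a hypothetical guard, not the upstream fix):

    # localagi/localagi.py, around line 504: use dict.get so a failed or
    # partial function call degrades gracefully instead of raising KeyError
    self.reasoning_callback(
        action.get("action", "reply"),
        action.get("detailed_reasoning", ""),
    )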
greygoo commented 5 months ago

Looked at the logs in more detail; it seems the request fails while trying to load a model called 'functions':

api-1                 | 5:34PM INF Trying to load the model 'functions' with the backend '[llama-cpp llama-ggml gpt4all llama-cpp-fallback rwkv stablediffusion piper whisper huggingface bert-embeddings /build/backend/python/transformers-musicgen/run.sh /build/backend/python/exllama/run.sh /build/backend/python/rerankers/run.sh /build/backend/python/mamba/run.sh /build/backend/python/sentencetransformers/run.sh /build/backend/python/petals/run.sh /build/backend/python/parler-tts/run.sh /build/backend/python/bark/run.sh /build/backend/python/autogptq/run.sh /build/backend/python/openvoice/run.sh /build/backend/python/vllm/run.sh /build/backend/python/coqui/run.sh /build/backend/python/exllama2/run.sh /build/backend/python/diffusers/run.sh /build/backend/python/sentencetransformers/run.sh /build/backend/python/transformers/run.sh /build/backend/python/vall-e-x/run.sh]'

Not sure if that is a request for the wrong model name, or if the model file is simply missing; I'll see if I can figure it out.
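
If it is just missing, one possible workaround would be a model definition that maps the 'functions' name the bot requests onto a model file that actually exists. A minimal sketch of such a config (the GGUF file name is a placeholder; point it at whatever model is already in /models):

    # hypothetical /models/functions.yaml
    name: functions
    parameters:
      model: your-model.Q4_K_M.gguf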

JackBekket commented 4 months ago

quay.io/go-skynet/local-ai:v1.18.0-ffmpeg localai/localai:v2.17.1-ffmpeg

Try the localai-aio-nvidia-cuda-12 Docker image.
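
For example, something like the following (the exact tag is an assumption based on the LocalAI AIO images; adjust to your setup):

    docker run -ti -p 8080:8080 --gpus all localai/localai:latest-aio-gpu-nvidia-cuda-12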