mudler / LocalAI

:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed inference
https://localai.io
MIT License

LOCALAI_LOG_LEVEL seemingly does nothing #2079

Closed: stewe93 closed this issue 6 months ago

stewe93 commented 6 months ago

LocalAI version:

localai/localai:v2.12.4-aio-gpu-intel-f16

Environment, CPU architecture, OS, and Version:

Unraid server with Docker

Describe the bug

As I understand it, the LOCALAI_LOG_LEVEL=debug environment variable should set the logging level to debug, but in my case it does nothing. On the other hand, DEBUG=true does enable debug logging.

To Reproduce

On Unraid, start a new container with the environment variable set. The generated command:

docker run \
  -d \
  --name='localai' \
  --net='bridge' \
  -e TZ="Europe/Budapest" \
  -e HOST_OS="Unraid" \
  -e HOST_HOSTNAME="MilleniumFalcon" \
  -e HOST_CONTAINERNAME="localai" \
  -e 'models-path'='/build/models' \
  -e 'LOCALAI_LOG_LEVEL'='debug' \
  -l net.unraid.docker.managed=dockerman \
  -p '8125:8080/tcp' \
  -v '/mnt/user/appdata/localai/models':'/build/models':'rw' \
  --device='/dev/dri' \
  --restart unless-stopped 'localai/localai:v2.12.4-aio-gpu-intel-f16'
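
As a sanity check (this verification command is illustrative and not part of the original report), you can confirm the variable actually reaches the running container before suspecting the application itself:

  docker exec localai printenv LOCALAI_LOG_LEVEL

If this prints debug, the variable is set correctly and the problem lies in how LocalAI reads it.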

Expected behavior

I would like to see more verbose logs than when the variable is not set.

Logs

Additional context

localai-bot commented 6 months ago

Hello stewe93, thank you for reporting the issue with LocalAI. I'm a bot auditing the tickets opened in this GitHub project, an experiment of my owner (@mudler). From your description, it seems that LOCALAI_LOG_LEVEL=debug doesn't set the logging level to debug as expected. You correctly set the environment variable while running the Docker container and expected more logs to be generated.

To help us find a solution for this issue, please provide the logs while running LocalAI with DEBUG=true so we can compare and understand the problem better. Also, please make sure to remove the initial "Thanks for helping us" comment from your next reply, as it will be hidden when you post.

Looking forward to hearing back from you.

Regards, LocalAI Bot (an experiment of @mudler)
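
For reference (an illustrative command, assuming the container is named localai as in the commands above), the requested logs can be captured to a file for posting with:

  docker logs localai > localai.log 2>&1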

stewe93 commented 6 months ago

Since my report, I have changed the image to:

localai/localai:v2.12.4-ffmpeg-core

but the problem still exists.

A run with DEBUG=false but with LOCALAI_LOG_LEVEL=debug:

docker run \
  -d \
  --name='localai' \
  --net='bridge' \
  -e TZ="Europe/Budapest" \
  -e HOST_OS="Unraid" \
  -e HOST_HOSTNAME="MilleniumFalcon" \
  -e HOST_CONTAINERNAME="localai" \
  -e 'models-path'='/build/models' \
  -e 'LOCALAI_LOG_LEVEL'='debug' \
  -e 'DEBUG'='false' \
  -l net.unraid.docker.managed=dockerman \
  -p '8125:8080/tcp' \
  -v '/mnt/user/appdata/localai/models':'/build/models':'rw' \
  --device='/dev/dri' \
  --restart unless-stopped 'localai/localai:v2.12.4-ffmpeg-core' hermes-2-pro-mistral

Logs:

5:00PM INF Starting LocalAI using 4 threads, with models path: /build/models
5:00PM INF LocalAI version: v2.12.4 (0004ec8be3ca150ce6d8b79f2991bfe3a9dc65ad)
5:00PM INF Preloading models from /build/models
5:00PM INF core/startup process completed!
@@@@@
Skipping rebuild
@@@@@
If you are experiencing issues with the pre-compiled builds, try setting REBUILD=true
If you are still experiencing issues with the build, try setting CMAKE_ARGS and disable the instructions set as needed:
CMAKE_ARGS="-DLLAMA_F16C=OFF -DLLAMA_AVX512=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF"
see the documentation at: https://localai.io/basics/build/index.html
Note: See also https://github.com/go-skynet/LocalAI/issues/288
@@@@@
CPU info:
model name      : Intel(R) Core(TM) i3-10100 CPU @ 3.60GHz
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d arch_capabilities
CPU:    AVX    found OK
CPU:    AVX2   found OK
CPU: no AVX512 found
@@@@@

  Model name: hermes-2-pro-mistral                                            

  curl http://localhost:8080/v1/chat/completions -H "Content-Type:            
  application/json" -d '{ "model": "hermes-2-pro-mistral", "messages": [{"role":
  "user", "content": "How are you doing?", "temperature": 0.1}] }'            

  Model name: hermes2                                                         

  Model name: home                                                            

  Model name: luna                                                            

 ┌───────────────────────────────────────────────────┐ 
 │                   Fiber v2.52.0                   │ 
 │               http://127.0.0.1:8080               │ 
 │       (bound on host 0.0.0.0 and port 8080)       │ 
 │                                                   │ 
 │ Handlers ........... 181  Processes ........... 1 │ 
 │ Prefork ....... Disabled  PID ................. 1 │ 
 └───────────────────────────────────────────────────┘ 

A run with DEBUG=true and LOCALAI_LOG_LEVEL=debug:

docker run \
  -d \
  --name='localai' \
  --net='bridge' \
  -e TZ="Europe/Budapest" \
  -e HOST_OS="Unraid" \
  -e HOST_HOSTNAME="MilleniumFalcon" \
  -e HOST_CONTAINERNAME="localai" \
  -e 'models-path'='/build/models' \
  -e 'LOCALAI_LOG_LEVEL'='debug' \
  -e 'DEBUG'='true' \
  -l net.unraid.docker.managed=dockerman \
  -p '8125:8080/tcp' \
  -v '/mnt/user/appdata/localai/models':'/build/models':'rw' \
  --device='/dev/dri' \
  --restart unless-stopped 'localai/localai:v2.12.4-ffmpeg-core' hermes-2-pro-mistral

Logs:

5:04PM INF Starting LocalAI using 4 threads, with models path: /build/models
5:04PM INF LocalAI version: v2.12.4 (0004ec8be3ca150ce6d8b79f2991bfe3a9dc65ad)
5:04PM DBG [startup] resolved embedded model: hermes-2-pro-mistral
5:04PM INF Preloading models from /build/models
5:04PM DBG Model: hermes-2-pro-mistral (config: {PredictionOptions:{Model:5c7cd056ecf9a4bb5b527410b97f48cb Language: N:0 TopP:0xc00023ad38 TopK:0xc00023ad40 Temperature:0xc00023ad48 Maxtokens:0xc00023ad50 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc00023ad78 TypicalP:0xc00023ad70 Seed:0xc00023ad90 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:hermes-2-pro-mistral F16:0xc00023ad18 Threads:0xc00023ad28 Debug:0xc00023ad88 Roles:map[] Embeddings:false Backend: TemplateConfig:{Chat:{{.Input -}}
<|im_start|>assistant
 ChatMessage:<|im_start|>{{if eq .RoleName "assistant"}}assistant{{else if eq .RoleName "system"}}system{{else if eq .RoleName "tool"}}tool{{else if eq .RoleName "user"}}user{{end}}
{{- if .FunctionCall }}<tool_call>{{end}}
{{- if eq .RoleName "tool" }}<tool_result>{{end }}
{{- if .Content}}
{{.Content}}
{{- end }}
{{- if .FunctionCall}}{{toJson .FunctionCall}}{{end }}
{{- if .FunctionCall }}</tool_call>{{end }}
{{- if eq .RoleName "tool" }}</tool_result>{{end }}
<|im_end|>
 Completion:{{.Input}}
 Edit: Functions:<|im_start|>system
You are a function calling AI model. You are provided with function signatures within <tools></tools> XML tags. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions. Here are the available tools:
<tools>
{{range .Functions}}
{'type': 'function', 'function': {'name': '{{.Name}}', 'description': '{{.Description}}', 'parameters': {{toJson .Parameters}} }}
{{end}}
</tools>
Use the following pydantic model json schema for each tool call you will make:
{'title': 'FunctionCall', 'type': 'object', 'properties': {'arguments': {'title': 'Arguments', 'type': 'object'}, 'name': {'title': 'Name', 'type': 'string'}}, 'required': ['arguments', 'name']}
For each function call return a json object with function name and arguments within <tool_call></tool_call> XML tags as follows:
<tool_call>
{'arguments': <args-dict>, 'name': <function-name>}
</tool_call>
<|im_end|>
{{.Input -}}
<|im_start|>assistant
<tool_call>
} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName: ParallelCalls:false} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc00023ad68 MirostatTAU:0xc00023ad60 Mirostat:0xc00023ad58 NGPULayers:0xc00023ad80 MMap:0xc00023abf8 MMlock:0xc00023ad89 LowVRAM:0xc00023ad89 Grammar: StopWords:[<|im_end|> <dummy32000> 
</tool_call> 

] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc00023ad08 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 MMProj: RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:} CUDA:false DownloadFiles:[] Description: Usage:curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
    "model": "hermes-2-pro-mistral",
    "messages": [{"role": "user", "content": "How are you doing?", "temperature": 0.1}]
}'
})
5:04PM DBG Model: hermes2 (config: {PredictionOptions:{Model:5c7cd056ecf9a4bb5b527410b97f48cb Language: N:0 TopP:0xc00023af00 TopK:0xc00023af08 Temperature:0xc00023af10 Maxtokens:0xc00023af18 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc00023af40 TypicalP:0xc00023af38 Seed:0xc00023af58 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:hermes2 F16:0xc00023aee0 Threads:0xc00023aef0 Debug:0xc00023af50 Roles:map[assistant:ASSISTANT: system:SYSTEM: user:USER:] Embeddings:false Backend: TemplateConfig:{Chat:chat ChatMessage: Completion:completion Edit: Functions:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName: ParallelCalls:false} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc00023af30 MirostatTAU:0xc00023af28 Mirostat:0xc00023af20 NGPULayers:0xc00023af48 MMap:0xc00023aec8 MMlock:0xc00023af51 LowVRAM:0xc00023af51 Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc00023aed0 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 MMProj: RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:} CUDA:false DownloadFiles:[] Description: Usage:})
5:04PM DBG Model: home (config: {PredictionOptions:{Model:408ea800cf09e378e24ef1545227867c Language: N:0 TopP:0xc00023b0c0 TopK:0xc00023b0c8 Temperature:0xc00023b0d0 Maxtokens:0xc00023b0d8 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc00023b100 TypicalP:0xc00023b0f8 Seed:0xc00023b118 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:home F16:0xc00023b0a0 Threads:0xc00023b0b0 Debug:0xc00023b110 Roles:map[assistant:ASSISTANT: system:SYSTEM: user:USER:] Embeddings:false Backend: TemplateConfig:{Chat:chat ChatMessage: Completion:completion Edit: Functions:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName: ParallelCalls:false} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc00023b0f0 MirostatTAU:0xc00023b0e8 Mirostat:0xc00023b0e0 NGPULayers:0xc00023b108 MMap:0xc00023b088 MMlock:0xc00023b111 LowVRAM:0xc00023b111 Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc00023b090 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 MMProj: RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:} CUDA:false DownloadFiles:[] Description: Usage:})
5:04PM DBG Model: luna (config: {PredictionOptions:{Model:7288e6ddace3da44616de88f7d85d33c Language: N:0 TopP:0xc00023b280 TopK:0xc00023b288 Temperature:0xc00023b290 Maxtokens:0xc00023b298 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc00023b2c0 TypicalP:0xc00023b2b8 Seed:0xc00023b2d8 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:luna F16:0xc00023b260 Threads:0xc00023b270 Debug:0xc00023b2d0 Roles:map[assistant:ASSISTANT: system:SYSTEM: user:USER:] Embeddings:false Backend: TemplateConfig:{Chat:chat ChatMessage: Completion:completion Edit: Functions:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName: ParallelCalls:false} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc00023b2b0 MirostatTAU:0xc00023b2a8 Mirostat:0xc00023b2a0 NGPULayers:0xc00023b2c8 MMap:0xc00023b248 MMlock:0xc00023b2d1 LowVRAM:0xc00023b2d1 Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc00023b250 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 MMProj: RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:} CUDA:false DownloadFiles:[] Description: Usage:})
5:04PM DBG Extracting backend assets files to /tmp/localai/backend_data
5:04PM INF core/startup process completed!
5:04PM DBG No configuration file found at /tmp/localai/upload/uploadedFiles.json
5:04PM DBG No configuration file found at /tmp/localai/config/assistants.json
5:04PM DBG No configuration file found at /tmp/localai/config/assistantsFile.json
CPU info:
model name      : Intel(R) Core(TM) i3-10100 CPU @ 3.60GHz
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d arch_capabilities
CPU:    AVX    found OK
CPU:    AVX2   found OK
CPU: no AVX512 found
@@@@@

  Model name: hermes-2-pro-mistral                                            

  curl http://localhost:8080/v1/chat/completions -H "Content-Type:            
  application/json" -d '{ "model": "hermes-2-pro-mistral", "messages": [{"role":
  "user", "content": "How are you doing?", "temperature": 0.1}] }'            

  Model name: hermes2                                                         

  Model name: home                                                            

  Model name: luna                                                            

 ┌───────────────────────────────────────────────────┐ 
 │                   Fiber v2.52.0                   │ 
 │               http://127.0.0.1:8080               │ 
 │       (bound on host 0.0.0.0 and port 8080)       │ 
 │                                                   │ 
 │ Handlers ........... 181  Processes ........... 1 │ 
 │ Prefork ....... Disabled  PID ................. 1 │ 
 └───────────────────────────────────────────────────┘ 
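
A quick way to quantify the difference between the two runs above (an illustrative check, not from the original thread) is to count debug-level lines in the container output:

  docker logs localai 2>&1 | grep -c ' DBG '

The DEBUG=false run prints no DBG lines at all, while the DEBUG=true run emits them as shown above.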
cryptk commented 6 months ago

This was fixed in #2036 and will go out in the next version.
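
Until that version is tagged, the fix can be tried by pulling a master image; the reporter verified with the sycl variant below (choose the variant that matches your hardware):

  docker pull localai/localai:master-sycl-f16-ffmpeg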

stewe93 commented 6 months ago

Tested with localai/localai:master-sycl-f16-ffmpeg now; it works like a charm. Thank you very much!