Open artistlu opened 1 month ago
Try running with SUPPORT_BF16=0, e.g. SUPPORT_BF16=0 python3 main.py. Can you let me know if that works?
Ideally we detect this automatically.
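As a rough illustration of what "detect this automatically" could look like, here is a minimal sketch. It is not exo's or tinygrad's actual code: `parse_flag`, `resolve_bf16_support`, and the `probe` callable are hypothetical names, and the probe is assumed to be something that tries to compile a tiny bf16 kernel on the local device and raises on failure. An explicit SUPPORT_BF16 setting always wins, and the probe is only consulted when the variable is unset.

```python
import os
from typing import Callable, Mapping, Optional


def parse_flag(value: Optional[str]) -> Optional[bool]:
    """Return True/False for an explicit setting, or None when unset/empty."""
    if value is None or value.strip() == "":
        return None
    return value.strip().lower() not in ("0", "false", "no", "off")


def resolve_bf16_support(
    probe: Callable[[], None],
    env: Mapping[str, str] = os.environ,
) -> bool:
    """Decide whether to use bf16 (hypothetical helper, not part of exo).

    1. Honour an explicit SUPPORT_BF16 override.
    2. Otherwise run `probe`, which is assumed to raise if the device
       cannot compile a bf16 kernel, and fall back to no bf16 on failure.
    """
    explicit = parse_flag(env.get("SUPPORT_BF16"))
    if explicit is not None:
        return explicit
    try:
        probe()
    except Exception:
        return False
    return True
```

In a multi-node setup each node would presumably need to make this decision for its own device, since the compile step runs locally on every peer.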
In order to load a local model, I have made modifications to the following two methods:
Additionally, I have also added the environment variable SUPPORT_BF16=0 when starting exo.
I am encountering the following error:
{'319cbd94-148d-4767-80af-950aa5c20-11'}}). Next partition:
Partition(node_id='319cbd94-148d-4767-80af-950aa5c20-23', start=0, end=0.07692)
Sending tensor_or_prompt to 319cbd94-148d-4767-80af-950aa5c20-23:
<|im_start|>user
What is the meaning of exo?<|im_end|>
<|im_start|>assistant
Broadcasting opaque status: request_id='cb27eced-cd1c-48d8-9ef9-0f4e680346fe'
status='{"type": "node_status", "node_id":
"319cbd94-148d-4767-80af-950aa5c20-11", "status": "start_process_prompt",
"base_shard": {"model_id": "/nasroot/models/Meta-Llama-3-8B/", "start_layer": 0,
"end_layer": 0, "n_layers": 32}, "shard": {"model_id":
"/nasroot/models/Meta-Llama-3-8B/", "start_layer": 29, "end_layer": 31,
"n_layers": 32}, "prompt": "<|im_start|>user\\nWhat is the meaning of
exo?<|im_end|>\\n<|im_start|>assistant\\n", "image_str": null,
"inference_state": null, "request_id": "cb27eced-cd1c-48d8-9ef9-0f4e680346fe"}'
Traceback (most recent call last):
File "/nasroot/code/exo_0814/exo/api/chatgpt_api.py", line 306, in
handle_post_chat_completions
await self.node.process_prompt(shard, prompt, image_str,
request_id=request_id)
File "/nasroot/code/exo_0814/exo/orchestration/standard_node.py", line 102, in
process_prompt
resp = await self._process_prompt(base_shard, prompt, image_str, request_id,
inference_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^
File "/nasroot/code/exo_0814/exo/orchestration/standard_node.py", line 137, in
_process_prompt
await self.forward_to_next_shard(shard, prompt, request_id,
image_str=image_str, inference_state=inference_state)
File "/nasroot/code/exo_0814/exo/orchestration/standard_node.py", line 280, in
forward_to_next_shard
await target_peer.send_prompt(next_shard, tensor_or_prompt,
image_str=image_str, request_id=request_id, inference_state=inference_state)
File "/nasroot/code/exo_0814/exo/networking/grpc/grpc_peer_handle.py", line
55, in send_prompt
response = await self.stub.SendPrompt(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File
"/nasroot/miniconda3/envs/exo/lib/python3.12/site-packages/grpc/aio/_call.py",
line 318, in __await__
raise _create_rpc_error(
grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
status = StatusCode.UNKNOWN
details = "Unexpected <class 'tinygrad.device.CompileError'>: OpenCL
Compile Error
<source>:1:38: error: unknown type name '__bf16'
__kernel void E_131072_32_4(__global __bf16* data0) {
^
<source>:5:20: error: use of undeclared identifier '__bf16'
data0[alu0+1] = (__bf16)(0.0);
^
<source>:5:28: warning: double precision constant requires cl_khr_fp64, casting
to single precision
data0[alu0+1] = (__bf16)(0.0);
^
<source>:6:20: error: use of undeclared identifier '__bf16'
data0[alu0+2] = (__bf16)(0.0);
^
<source>:6:28: warning: double precision constant requires cl_khr_fp64, casting
to single precision
data0[alu0+2] = (__bf16)(0.0);
^
<source>:7:20: error: use of undeclared identifier '__bf16'
data0[alu0+3] = (__bf16)(0.0);
^
<source>:7:28: warning: double precision constant requires cl_khr_fp64, casting
to single precision
data0[alu0+3] = (__bf16)(0.0);
^
<source>:8:18: error: use of undeclared identifier '__bf16'
data0[alu0] = (__bf16)(0.0);
^
<source>:8:26: warning: double precision constant requires cl_khr_fp64, casting
to single precision
data0[alu0] = (__bf16)(0.0);
^
error: Compiler frontend failed (error code 62)
"
debug_error_string = "UNKNOWN:Error received from peer
{grpc_message:"Unexpected <class \'tinygrad.device.CompileError\'>: OpenCL
Compile Error\n\n<source>:1:38: error: unknown type name \'__bf16\'\n__kernel
void E_131072_32_4(__global __bf16* data0) {\n
^\n\n<source>:5:20: error: use of undeclared identifier \'__bf16\'\n
data0[alu0+1] = (__bf16)(0.0);\n ^\n\n<source>:5:28: warning:
double precision constant requires cl_khr_fp64, casting to single precision\n
data0[alu0+1] = (__bf16)(0.0);\n ^\n\n<source>:6:20:
error: use of undeclared identifier \'__bf16\'\n data0[alu0+2] =
(__bf16)(0.0);\n ^\n\n<source>:6:28: warning: double precision
constant requires cl_khr_fp64, casting to single precision\n data0[alu0+2] =
(__bf16)(0.0);\n ^\n\n<source>:7:20: error: use of
undeclared identifier \'__bf16\'\n data0[alu0+3] = (__bf16)(0.0);\n
^\n\n<source>:7:28: warning: double precision constant requires cl_khr_fp64,
casting to single precision\n data0[alu0+3] = (__bf16)(0.0);\n
^\n\n<source>:8:18: error: use of undeclared identifier \'__bf16\'\n
data0[alu0] = (__bf16)(0.0);\n ^\n\n<source>:8:26: warning:
double precision constant requires cl_khr_fp64, casting to single precision\n
data0[alu0] = (__bf16)(0.0);\n ^\n\nerror: Compiler
frontend failed (error code 62)\n", grpc_status:2,
created_time:"2024-08-14T10:50:09.810393599+08:00"}"
>
Received SendOpaqueStatus request:
request_id='cb27eced-cd1c-48d8-9ef9-0f4e680346fe' status='{"type":
"node_status", "node_id": "319cbd94-148d-4767-80af-950aa5c20-23", "status":
"start_process_prompt", "base_shard": {"model_id":
"/nasroot/models/Meta-Llama-3-8B/", "start_layer": 0, "end_layer": 1,
"n_layers": 32}, "shard": {"model_id": "/nasroot/models/Meta-Llama-3-8B/",
"start_layer": 0, "end_layer": 1, "n_layers": 32}, "prompt":
"<|im_start|>user\\nWhat is the meaning of
exo?<|im_end|>\\n<|im_start|>assistant\\n", "image_str": "", "inference_state":
null, "request_id": "cb27eced-cd1c-48d8-9ef9-0f4e680346fe"}'
Preemptively starting download for
Shard(model_id='/nasroot/models/Meta-Llama-3-8B/', start_layer=29, end_layer=31,
n_layers=32)
Preemptively starting download for
Shard(model_id='/nasroot/models/Meta-Llama-3-8B/', start_layer=29, end_layer=31,
n_layers=32)
I'm not sure if this is a tinygrad issue. Could updating to the latest tinygrad version solve this problem? My device is not connected to the internet, so all operations are copied and executed on the node. @AlexCheema
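For anyone triaging similar logs, the compiler output above points at the `__bf16` type not being recognised by the OpenCL compiler on this device. A small sketch of how that diagnostic could be recognised in an error string follows; `is_missing_bf16_error` is a hypothetical helper written for this thread, not an exo or tinygrad function, and the patterns are taken only from the messages in this log.

```python
import re

# Diagnostics seen in the log above when the OpenCL compiler lacks __bf16.
_BF16_DIAGNOSTIC = re.compile(
    r"(unknown type name|use of undeclared identifier) '__bf16'"
)


def is_missing_bf16_error(message: str) -> bool:
    """Return True if a compile error message complains about '__bf16'.

    The gRPC debug_error_string escapes quotes as \\', so backslashes are
    stripped before matching.
    """
    return bool(_BF16_DIAGNOSTIC.search(message.replace("\\", "")))
```

Because the error arrives wrapped in an AioRpcError, it was raised on the peer that received the prompt, not on the node that served the API request.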
I'm encountering an issue with my Mali GPU. When I try to run inference, I get the following error: