exo-explore / exo

Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
GNU General Public License v3.0

cannot run in WSLs with CPUs only. #147

Open · mct2611 opened this issue 1 month ago

mct2611 commented 1 month ago

Hi, I want to know if it is possible to run exo across multiple WSL instances with CPUs only?

I ran the command `CLANG=1 python main.py` across multiple WSL instances; however, the output shows 0 TFLOPS.

In the tinychat page, I chose the Llama 3 8B model. After the model downloaded and the weights loaded, it got stuck; it seems the CPU isn't doing any work. (screenshot: exo_running)

So, does exo support running with CPUs only?

AlexCheema commented 1 month ago

The FLOPS showing 0 is fine; that's just a visual bug.

Getting stuck is a bug. Need to look into why CLANG isn't working for inference on tinygrad.

It would be helpful if you could run with DEBUG=2 and paste the output here.
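For reference, assuming the same CPU-only invocation from the first comment, that would look something like:

```sh
# CLANG=1 forces tinygrad's CPU (clang) backend; DEBUG=2 raises log
# verbosity so device operations and errors are printed to the console.
CLANG=1 DEBUG=2 python main.py
```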

mct2611 commented 4 weeks ago

Hi @AlexCheema, I ran with DEBUG=2 and here is the output; it seems two errors occurred:

```
Trying AutoTokenizer for llama3-8b-sfr
Traceback (most recent call last):
  File "/Projects/venv/lib/python3.12/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
    response.raise_for_status()
  File "/Projects/venv/lib/python3.12/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/llama3-8b-sfr/resolve/main/tokenizer_config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Projects/venv/lib/python3.12/site-packages/transformers/utils/hub.py", line 399, in cached_file
    resolved_file = hf_hub_download(
  File "/Projects/venv/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/Projects/venv/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 1221, in hf_hub_download
    return _hf_hub_download_to_cache_dir(
  File "/Projects/venv/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 1325, in _hf_hub_download_to_cache_dir
    _raise_on_head_call_error(head_call_error, force_download, local_files_only)
  File "/Projects/venv/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 1823, in _raise_on_head_call_error
    raise head_call_error
  File "/Projects/venv/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 1722, in _get_metadata_or_catch_error
    metadata = get_hf_file_metadata(url=url, proxies=proxies, timeout=etag_timeout, headers=headers)
  File "/Projects/venv/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/Projects/venv/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 1645, in get_hf_file_metadata
    r = _request_wrapper(
  File "/Projects/venv/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 372, in _request_wrapper
    response = _request_wrapper(
  File "/Projects/venv/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 396, in _request_wrapper
    hf_raise_for_status(response)
  File "/Projects/venv/lib/python3.12/site-packages/huggingface_hub/utils/_errors.py", line 352, in hf_raise_for_status
    raise RepositoryNotFoundError(message, response) from e
huggingface_hub.utils._errors.RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-66bf2532-75d1fa8e5585667629e4e209;82788632-8c92-4356-9c70-f5e70ab4aa63)

Repository Not Found for url: https://huggingface.co/llama3-8b-sfr/resolve/main/tokenizer_config.json.
Please make sure you specified the correct repo_id and repo_type.
If you are trying to access a private or gated repo, make sure you are authenticated. Invalid username or password.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Projects/exo-main/exo/api/chatgpt_api.py", line 57, in resolve_tokenizer
    return AutoTokenizer.from_pretrained(model_id)
  File "/Projects/venv/lib/python3.12/site-packages/transformers/models/auto/tokenization_auto.py", line 817, in from_pretrained
    tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
  File "/Projects/venv/lib/python3.12/site-packages/transformers/models/auto/tokenization_auto.py", line 649, in get_tokenizer_config
    resolved_config_file = cached_file(
  File "/Projects/venv/lib/python3.12/site-packages/transformers/utils/hub.py", line 422, in cached_file
    raise EnvironmentError(
OSError: llama3-8b-sfr is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with huggingface-cli login or by passing token=<your_token>
```

```
Failed to load tokenizer for llama3-8b-sfr. Falling back to tinygrad tokenizer
Trying tinygrad tokenizer for llama3-8b-sfr
```
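For reference, the 401 above is independent of exo: the repo id llama3-8b-sfr simply does not exist on the Hub, so exo falls back to its tinygrad tokenizer. This can be confirmed by probing the exact URL from the traceback, e.g. with curl:

```sh
# Probe the URL from the traceback; it should print 401, matching the
# RepositoryNotFoundError above ('llama3-8b-sfr' is not a valid Hub repo id).
curl -s -o /dev/null -w "%{http_code}\n" \
  https://huggingface.co/llama3-8b-sfr/resolve/main/tokenizer_config.json
```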

```
Sending prompt from ChatGPT api request_id='0e6657ae-d2aa-4617-98fb-ca55a3f3220b' shard=Shard(model_id='llama3-8b-sfr', start_layer=0, end_layer=0, n_layers=32) prompt='<|start_header_id|>user<|end_header_id|>\nhello<|eot_id|>\n<|start_header_id|>assistant<|end_header_id|>\n'
[0e6657ae-d2aa-4617-98fb-ca55a3f3220b] process prompt: base_shard=Shard(model_id='llama3-8b-sfr', start_layer=0, end_layer=0, n_layers=32) shard=Shard(model_id='llama3-8b-sfr', start_layer=0, end_layer=24, n_layers=32) prompt='<|start_header_id|>user<|end_header_id|>\nhello<|eot_id|>\n<|start_header_id|>assistant<|end_header_id|>\n'
opened device NPY from pid:5787
opened device DISK:/home/ubuntu/.cache/tinygrad/downloads/llama3-8b-sfr/model-00002-of-00004.safetensors from pid:5787
DISK:/h 1 empty 4999802720 dtypes.uchar arg 1 mem 0.00 GB
DISK:/h 2 view 8 @ 0 arg 2 mem 0.00 GB
opened device CLANG from pid:5787
CLANG 3 copy 8, CLANG <- DISK:/h arg 2 mem 0.00 GB tm 36.40us/ 0.04ms ( 0.00 GFLOPS, 0.00 GB/s)
DISK:/h 4 view 12120 @ 8 arg 2 mem 0.00 GB
CLANG 5 copy 12120, CLANG <- DISK:/h arg 2 mem 0.00 GB tm 12.20us/ 0.05ms ( 0.00 GFLOPS, 0.99 GB/s)
opened device DISK:/home/ubuntu/.cache/tinygrad/downloads/llama3-8b-sfr/model-00001-of-00004.safetensors from pid:5787
DISK:/h 6 empty 4976698672 dtypes.uchar arg 1 mem 0.00 GB
DISK:/h 7 view 8 @ 0 arg 2 mem 0.00 GB
CLANG 8 copy 8, CLANG <- DISK:/h arg 2 mem 0.00 GB tm 34.70us/ 0.08ms ( 0.00 GFLOPS, 0.00 GB/s)
DISK:/h 9 view 9512 @ 8 arg 2 mem 0.00 GB
CLANG 10 copy 9512, CLANG <- DISK:/h arg 2 mem 0.00 GB tm 4.10us/ 0.09ms ( 0.00 GFLOPS, 2.32 GB/s)
opened device DISK:/home/ubuntu/.cache/tinygrad/downloads/llama3-8b-sfr/model-00003-of-00004.safetensors from pid:5787
DISK:/h 11 empty 4915916176 dtypes.uchar arg 1 mem 0.00 GB
DISK:/h 12 view 8 @ 0 arg 2 mem 0.00 GB
CLANG 13 copy 8, CLANG <- DISK:/h arg 2 mem 0.00 GB tm 15.90us/ 0.10ms ( 0.00 GFLOPS, 0.00 GB/s)
DISK:/h 14 view 11656 @ 8 arg 2 mem 0.00 GB
CLANG 15 copy 11656, CLANG <- DISK:/h arg 2 mem 0.00 GB tm 12.90us/ 0.12ms ( 0.00 GFLOPS, 0.90 GB/s)
opened device DISK:/home/ubuntu/.cache/tinygrad/downloads/llama3-8b-sfr/model-00004-of-00004.safetensors from pid:5787
DISK:/h 16 empty 1168138808 dtypes.uchar arg 1 mem 0.00 GB
DISK:/h 17 view 8 @ 0 arg 2 mem 0.00 GB
CLANG 18 copy 8, CLANG <- DISK:/h arg 2 mem 0.00 GB tm 17.00us/ 0.13ms ( 0.00 GFLOPS, 0.00 GB/s)
DISK:/h 19 view 560 @ 8 arg 2 mem 0.00 GB
CLANG 20 copy 560, CLANG <- DISK:/h arg 2 mem 0.00 GB tm 3.10us/ 0.14ms ( 0.00 GFLOPS, 0.18 GB/s)
0%| | 0/229 [00:00<?, ?it/s]
DISK:/h 21 view 33554432 @ 1444963632 arg 2 mem 0.00 GB
CLANG 22 copy 33.55M, CLANG <- DISK:/h arg 2 mem 0.03 GB tm 9363.80us/ 9.50ms ( 0.00 GFLOPS, 3.58 GB/s)
error lowering MetaOps.SINK tensor operations: [cast - exo.inference.tinygrad.models.llama:224::fix_bf16]
loaded weights in 34.03 ms, 0.03 GB loaded at 0.99 GB/s
Traceback (most recent call last):
  File "/Projects/exo-main/exo/api/chatgpt_api.py", line 184, in handle_post_chat_completions
    await self.node.process_prompt(shard, prompt, request_id=request_id)
  File "/Projects/exo-main/exo/orchestration/standard_node.py", line 63, in process_prompt
    resp = await self._process_prompt(base_shard, prompt, request_id, inference_state)
  File "/Projects/exo-main/exo/orchestration/standard_node.py", line 82, in _process_prompt
    result, inference_state, is_finished = await self.inference_engine.infer_prompt(shard, prompt, inference_state=inference_state)
  File "/Projects/exo-main/exo/inference/tinygrad/inference.py", line 147, in infer_prompt
    await self.ensure_shard(shard)
  File "/Projects/exo-main/exo/inference/tinygrad/inference.py", line 208, in ensure_shard
    model = build_transformer(model_path, shard=shard, model_size=size)
  File "/Projects/exo-main/exo/inference/tinygrad/inference.py", line 124, in build_transformer
    load_state_dict(model, weights, strict=False, consume=True)
  File "/Projects/venv/lib/python3.12/site-packages/tinygrad/nn/state.py", line 129, in load_state_dict
    else: v.replace(state_dict[k].to(v.device)).realize()
  File "/Projects/venv/lib/python3.12/site-packages/tinygrad/tensor.py", line 3123, in _wrapper
    ret = fn(*args, **kwargs)
  File "/Projects/venv/lib/python3.12/site-packages/tinygrad/tensor.py", line 203, in realize
    run_schedule(*self.schedule_with_vars(*lst), do_update_stats=do_update_stats)
  File "/Projects/venv/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 222, in run_schedule
    for ei in lower_schedule(schedule):
  File "/Projects/venv/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 215, in lower_schedule
    raise e
  File "/Projects/venv/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 209, in lower_schedule
    try: yield lower_schedule_item(si)
  File "/Projects/venv/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 193, in lower_schedule_item
    runner = get_runner(si.outputs[0].device, si.ast)
  File "/Projects/venv/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 164, in get_runner
    method_cache[ckey] = method_cache[bkey] = ret = CompiledRunner(replace(prg, dname=dname))
  File "/Projects/venv/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 82, in __init__
    self.lib:bytes = precompiled if precompiled is not None else Device[p.dname].compiler.compile_cached(p.src)
  File "/Projects/venv/lib/python3.12/site-packages/tinygrad/device.py", line 177, in compile_cached
    lib = self.compile(src)
  File "/Projects/venv/lib/python3.12/site-packages/tinygrad/runtime/ops_clang.py", line 10, in compile
    subprocess.check_output(['clang', '-include', 'tgmath.h', '-shared', '-march=native', '-O2', '-Wall', '-Werror', '-x', 'c', '-fPIC', '-',
  File "/usr/local/lib/python3.12/subprocess.py", line 466, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/local/lib/python3.12/subprocess.py", line 548, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/local/lib/python3.12/subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/local/lib/python3.12/subprocess.py", line 1955, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'clang'
Starting with the following peers: [<exo.networking.grpc.grpc_peer_handle.GRPCPeerHandle object at 0xffe6a4604500>]
Connecting to new peers...
Already connected to bd49d976-503a-4c39-a369-0e9e014e9306: True
Collecting topology max_depth=4 visited={'bd49d976-503a-4c39-a369-0e9e014e9306'}
Already visited bd49d976-503a-4c39-a369-0e9e014e9306. Skipping...
Topology collection task executed.
```
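The second traceback bottoms out in tinygrad's ops_clang.py, which shells out to a system clang binary, so a quick sanity check in each WSL instance is whether clang is on PATH at all:

```sh
# FileNotFoundError: 'clang' means the compiler binary is missing from PATH;
# this prints its location, or a message if it's absent.
command -v clang || echo "clang not found on PATH"
```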

mct2611 commented 3 weeks ago

And I have tried to run exo on Android phones with termux; the same issue occurred:

```
loaded weights in 122.44 ms, 0.03 GB loaded at 0.27 GB/s
Traceback (most recent call last):
  File "/projects/exo/exo/api/chatgpt_api.py", line 306, in handle_post_chat_completions
    await self.node.process_prompt(shard, prompt, image_str, request_id=request_id)
  File "/projects/exo/exo/orchestration/standard_node.py", line 102, in process_prompt
    resp = await self._process_prompt(base_shard, prompt, image_str, request_id, inference_state)
  File "/projects/exo/exo/orchestration/standard_node.py", line 140, in _process_prompt
    result, inference_state, is_finished = await self.inference_engine.infer_prompt(request_id, shard, prompt, image_str, inference_state=inference_state)
  File "/projects/exo/exo/inference/tinygrad/inference.py", line 60, in infer_prompt
    await self.ensure_shard(shard)
  File "/projects/exo/exo/inference/tinygrad/inference.py", line 96, in ensure_shard
    self.model = build_transformer(model_path, shard, model_size="8B" if "8b" in shard.model_id.lower() else "70B")
  File "/projects/exo/exo/inference/tinygrad/inference.py", line 51, in build_transformer
    load_state_dict(model, weights, strict=False, consume=False)  # consume=True
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/nn/state.py", line 129, in load_state_dict
    else: v.replace(state_dict[k].to(v.device)).realize()
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/tensor.py", line 3166, in _wrapper
    ret = fn(*args, **kwargs)
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/tensor.py", line 203, in realize
    run_schedule(*self.schedule_with_vars(*lst), do_update_stats=do_update_stats)
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 223, in run_schedule
    for ei in lower_schedule(schedule):
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 216, in lower_schedule
    raise e
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 210, in lower_schedule
    try: yield lower_schedule_item(si)
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 194, in lower_schedule_item
    runner = get_runner(si.outputs[0].device, si.ast)
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 164, in get_runner
    method_cache[ckey] = method_cache[bkey] = ret = CompiledRunner(replace(prg, dname=dname))
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 82, in __init__
    self.lib:bytes = precompiled if precompiled is not None else Device[p.dname].compiler.compile_cached(p.src)
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/device.py", line 183, in compile_cached
    lib = self.compile(src)
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/runtime/ops_clang.py", line 10, in compile
    subprocess.check_output(['clang', '-include', 'tgmath.h', '-shared', '-march=native', '-O2', '-Wall', '-Werror', '-x', 'c', '-fPIC', '-',
  File "/usr/lib/python3.12/subprocess.py", line 466, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib/python3.12/subprocess.py", line 548, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.12/subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.12/subprocess.py", line 1955, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'clang'
```

AlexCheema commented 3 weeks ago

> And I have tried to run exo on Android phones with termux; the same issue occurred: [FileNotFoundError: 'clang' traceback quoted above]

This one is because you don’t have clang installed.
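For reference, a minimal sketch of installing it, assuming a Debian/Ubuntu-based WSL distro and stock Termux:

```sh
# Inside a Debian/Ubuntu-based WSL instance:
sudo apt update && sudo apt install -y clang

# On an Android phone under Termux:
pkg install clang
```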

The other one I’m looking into now. Thanks for the detailed bug report.

mct2611 commented 3 weeks ago

Hi @AlexCheema, thank you for your comments. I have installed clang and run exo again, and then this compile issue occurred:

```
loaded weights in 2063.49 ms, 0.03 GB loaded at 0.02 GB/s
Traceback (most recent call last):
  File "/projects/exo/exo/api/chatgpt_api.py", line 306, in handle_post_chat_completions
    await self.node.process_prompt(shard, prompt, image_str, request_id=request_id)
  File "/projects/exo/exo/orchestration/standard_node.py", line 102, in process_prompt
    resp = await self._process_prompt(base_shard, prompt, image_str, request_id, inference_state)
  File "/projects/exo/exo/orchestration/standard_node.py", line 140, in _process_prompt
    result, inference_state, is_finished = await self.inference_engine.infer_prompt(request_id, shard, prompt, image_str, inference_state=inference_state)
  File "/projects/exo/exo/inference/tinygrad/inference.py", line 60, in infer_prompt
    await self.ensure_shard(shard)
  File "/projects/exo/exo/inference/tinygrad/inference.py", line 96, in ensure_shard
    self.model = build_transformer(model_path, shard, model_size="8B" if "8b" in shard.model_id.lower() else "70B")
  File "/projects/exo/exo/inference/tinygrad/inference.py", line 51, in build_transformer
    load_state_dict(model, weights, strict=False, consume=False)  # consume=True
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/nn/state.py", line 129, in load_state_dict
    else: v.replace(state_dict[k].to(v.device)).realize()
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/tensor.py", line 3166, in _wrapper
    ret = fn(*args, **kwargs)
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/tensor.py", line 203, in realize
    run_schedule(*self.schedule_with_vars(*lst), do_update_stats=do_update_stats)
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 223, in run_schedule
    for ei in lower_schedule(schedule):
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 216, in lower_schedule
    raise e
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 210, in lower_schedule
    try: yield lower_schedule_item(si)
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 194, in lower_schedule_item
    runner = get_runner(si.outputs[0].device, si.ast)
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 164, in get_runner
    method_cache[ckey] = method_cache[bkey] = ret = CompiledRunner(replace(prg, dname=dname))
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 82, in __init__
    self.lib:bytes = precompiled if precompiled is not None else Device[p.dname].compiler.compile_cached(p.src)
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/device.py", line 183, in compile_cached
    lib = self.compile(src)
  File "/root/.local/share/virtualenvs/projects-4jZlt7HF/lib/python3.12/site-packages/tinygrad/runtime/ops_clang.py", line 10, in compile
    subprocess.check_output(['clang', '-include', 'tgmath.h', '-shared', '-march=native', '-O2', '-Wall', '-Werror', '-x', 'c', '-fPIC', '-',
  File "/usr/lib/python3.12/subprocess.py", line 466, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib/python3.12/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['clang', '-include', 'tgmath.h', '-shared', '-march=native', '-O2', '-Wall', '-Werror', '-x', 'c', '-fPIC', '-', '-o', '/tmp/tmp2bcpjcd2']' returned non-zero exit status 1.
```

And in the WSL instances, I also installed clang and ran again; the same compile issue occurred.
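One way to surface the actual compiler diagnostics that the "exit status 1" message hides is to rerun the invocation from the traceback on a trivial C program; note that -Werror promotes every warning to an error, and -march=native can fail on some clang builds. A hedged repro sketch (the /tmp/tg_test.so output path is arbitrary):

```sh
# Reproduce tinygrad's clang invocation (copied from the traceback) on a
# trivial input; the compiler's real error message will print to stderr.
echo 'int main(void) { return 0; }' | \
  clang -include tgmath.h -shared -march=native -O2 -Wall -Werror \
        -x c -fPIC - -o /tmp/tg_test.so
```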