exo-explore / exo

Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
GNU General Public License v3.0

Android device loading model reports CLANG error #152

Open fangxuezheng opened 3 months ago

fangxuezheng commented 3 months ago

Phone model: Xiaomi 14 Pro. Memory: 16 GB. After downloading the llama3 8B model, a CLANG compilation error is reported during loading. The specific log is as follows:

Traceback (most recent call last):
  File "/root/exo_master/exo/api/chatgpt_api.py", line 311, in handle_post_chat_completions
    await self.node.process_prompt(shard, prompt, image_str, request_id=request_id)
  File "/root/exo_master/exo/orchestration/standard_node.py", line 102, in process_prompt
    resp = await self._process_prompt(base_shard, prompt, image_str, request_id, inference_state)
  File "/root/exo_master/exo/orchestration/standard_node.py", line 140, in _process_prompt
    result, inference_state, is_finished = await self.inference_engine.infer_prompt(request_id, shard, prompt, image_str, inference_state=inference_state)
  File "/root/exo_master/exo/inference/tinygrad/inference.py", line 61, in infer_prompt
    await self.ensure_shard(shard)
  File "/root/exo_master/exo/inference/tinygrad/inference.py", line 95, in ensure_shard
    self.model = build_transformer(model_path, shard, model_size="8B" if "8b" in shard.model_id.lower() else "70B")
  File "/root/exo_master/exo/inference/tinygrad/inference.py", line 52, in build_transformer
    load_state_dict(model, weights, strict=False, consume=False)  # consume=True
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/tinygrad/nn/state.py", line 129, in load_state_dict
    else: v.replace(state_dict[k].to(v.device)).realize()
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/tinygrad/tensor.py", line 3186, in _wrapper
    ret = fn(*args, **kwargs)
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/tinygrad/tensor.py", line 204, in realize
    run_schedule(*self.schedule_with_vars(*lst), do_update_stats=do_update_stats)
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 220, in run_schedule
    for ei in lower_schedule(schedule):
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 213, in lower_schedule
    raise e
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 207, in lower_schedule
    try: yield lower_schedule_item(si)
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 191, in lower_schedule_item
    runner = get_runner(si.outputs[0].device, si.ast)
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 161, in get_runner
    method_cache[ckey] = method_cache[bkey] = ret = CompiledRunner(replace(prg, dname=dname))
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/tinygrad/engine/realize.py", line 83, in __init__
    self.lib:bytes = precompiled if precompiled is not None else Device[p.dname].compiler.compile_cached(p.src)
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/tinygrad/device.py", line 183, in compile_cached
    lib = self.compile(src)
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/tinygrad/runtime/ops_clang.py", line 10, in compile
    subprocess.check_output(['clang', '-include', 'tgmath.h', '-shared', '-march=native', '-O2', '-Wall', '-Werror', '-x', 'c', '-fPIC', '-',
  File "/root/miniconda3/envs/exo/lib/python3.12/subprocess.py", line 466, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/root/miniconda3/envs/exo/lib/python3.12/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['clang', '-include', 'tgmath.h', '-shared', '-march=native', '-O2', '-Wall', '-Werror', '-x', 'c', '-fPIC', '-', '-o', '/tmp/tmpg9zetteq']' returned non-zero exit status 1.

mct2611 commented 3 months ago

Hi @fangxuezheng, I had the same problem. Have you solved it?

the-praxs commented 3 months ago

Looks like clang is not installed on the device. Try installing clang and then report back with the results.

ji-cryptocafe commented 2 months ago

@fangxuezheng Did you find a solution to this? I am running into a very similar problem with CLANG.

clang version 18.1.8
Target: aarch64-unknown-linux-android24
Thread model: posix
InstalledDir: /data/data/com.termux/files/usr/bin
Traceback (most recent call last):
  File "/data/data/com.termux/files/home/exo/exo/api/chatgpt_api.py", line 252, in handle_post_chat_completions
    await self.node.process_prompt(shard, prompt, image_str, request_id=request_id)
  File "/data/data/com.termux/files/home/exo/exo/orchestration/standard_node.py", line 98, in process_prompt
    resp = await self._process_prompt(base_shard, prompt, image_str, request_id, inference_state)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/home/exo/exo/orchestration/standard_node.py", line 134, in _process_prompt
    result, inference_state, is_finished = await self.inference_engine.infer_prompt(request_id, shard, prompt, image_str, inference_state=inference_state)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/home/exo/exo/inference/tinygrad/inference.py", line 62, in infer_prompt
    await self.ensure_shard(shard)
  File "/data/data/com.termux/files/home/exo/exo/inference/tinygrad/inference.py", line 100, in ensure_shard
    self.model = await asyncio.get_event_loop().run_in_executor(self.executor, build_transformer, model_path, shard, "8B" if "8b" in shard.model_id.lower() else "70B")
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/home/exo/exo/inference/tinygrad/inference.py", line 51, in build_transformer
    load_state_dict(model, weights, strict=False, consume=False)  # consume=True
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/tinygrad/nn/state.py", line 129, in load_state_dict
    else: v.replace(state_dict[k].to(v.device)).realize()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/tinygrad/tensor.py", line 3256, in _wrapper
    ret = fn(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/tinygrad/tensor.py", line 204, in realize
    run_schedule(*self.schedule_with_vars(*lst), do_update_stats=do_update_stats)
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/tinygrad/engine/realize.py", line 221, in run_schedule
    for ei in lower_schedule(schedule):
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/tinygrad/engine/realize.py", line 214, in lower_schedule
    raise e
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/tinygrad/engine/realize.py", line 208, in lower_schedule
    try: yield lower_schedule_item(si)
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/tinygrad/engine/realize.py", line 192, in lower_schedule_item
    runner = get_runner(si.outputs[0].device, si.ast)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/tinygrad/engine/realize.py", line 161, in get_runner
    method_cache[ckey] = method_cache[bkey] = ret = CompiledRunner(replace(prg, dname=dname))
                                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/tinygrad/engine/realize.py", line 83, in __init__
    self.lib:bytes = precompiled if precompiled is not None else Device[p.dname].compiler.compile_cached(p.src)
                                                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/tinygrad/device.py", line 182, in compile_cached
    lib = self.compile(src)
          ^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/tinygrad/runtime/ops_clang.py", line 10, in compile
    subprocess.check_output(['clang', '-shared', '-march=native', '-O2', '-Wall', '-Werror', '-x', 'c', '-fPIC', '-ffreestanding', '-nostdlib',
  File "/data/data/com.termux/files/usr/lib/python3.11/subprocess.py", line 466, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/usr/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['clang', '-shared', '-march=native', '-O2', '-Wall', '-Werror', '-x', 'c', '-fPIC', '-ffreestanding', '-nostdlib', '-', '-o', '/data/data/com.termux/files/usr/tmp/tmpyj1mtlk1']' returned non-zero exit status 1.

dtnewman commented 1 month ago

I was playing around with this earlier and I don't have a solution, but I am fairly sure this has to do with running it on a consumer CPU that doesn't have bfloat16 support (my CPU is an Intel i7-1260P, for reference). I think this would probably work on the latest generation of Intel CPUs, but I don't have access to one to test.
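
That hypothesis can be probed directly: feed clang a snippet that uses the `__bf16` type with the same flags shown in the traceback. This is a hedged sketch (the flag list is copied from the log above, with `-o /dev/null` substituted for the temp file); if the plain-C probe compiles but the bfloat16 one fails, the culprit is bfloat16 codegen for the target rather than a broken clang install.

```python
# Probe whether clang accepts bfloat16 (__bf16) for the native target,
# using the same flag set tinygrad passes in the traceback above.
import shutil
import subprocess

FLAGS = ["-shared", "-march=native", "-O2", "-Wall", "-Werror",
         "-x", "c", "-fPIC", "-", "-o", "/dev/null"]

def compiles(src: str) -> bool:
    """Return True if clang accepts `src` with tinygrad's flag set."""
    if shutil.which("clang") is None:
        return False  # cannot probe without clang on PATH
    proc = subprocess.run(["clang", *FLAGS], input=src.encode(),
                          capture_output=True)
    return proc.returncode == 0

if __name__ == "__main__":
    print("plain C compiles :", compiles("float f(float x) { return x * 2.0f; }"))
    print("__bf16 compiles  :", compiles("__bf16 g(__bf16 x) { return x; }"))
```

On a CPU/toolchain without bfloat16 support the second probe should fail while the first succeeds, matching the pattern in these reports.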

helpau commented 1 month ago

Same behavior on a Xeon E5-2670 v3 (WSL 2, Ubuntu 22.04.4) and a Core i7-11700 (WSL 1, Ubuntu 24.04.1); clang is installed on both.