pytorch / torchchat

Run PyTorch LLMs locally on servers, desktop and mobile
BSD 3-Clause "New" or "Revised" License

[LAUNCH BLOCKER?] https://github.com/pytorch/ao/issues/260 libcudart cannot be loaded, but why? We're exporting executorch model #843

Closed mikekgfb closed 4 months ago

mikekgfb commented 4 months ago

https://github.com/pytorch/ao/issues/260 libcudart cannot be loaded, but why? We're exporting executorch model



https://github.com/pytorch/torchchat/actions/runs/9166937828/job/25203278945?pr=842

ImportError: libcudart.so.12: cannot open shared object file: No such file or directory

******** ET: a8w4dq INT4 group-wise quantized *******

Note: NumExpr detected 16 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
NumExpr defaulting to 8 threads.
PyTorch version 2.4.0.dev20240507+cpu available.
Using device=cpu

Loading model...
Time to load model: 0.02 seconds
Quantizing the model with: {'linear:a8w4dq': {'groupsize': 32}}
Downloading builder script: 0%| | 0.00/5.67k [00:00<?, ?B/s]
Downloading builder script: 100%|██████████| 5.67k/5.67k [00:00<00:00, 27.6MB/s]
Traceback (most recent call last):
Time to quantize model: 3.01 seconds
  File "/home/runner/work/torchchat/torchchat/export.py", line 119, in <module>
    main(args)
  File "/home/runner/work/torchchat/torchchat/export.py", line 70, in main
    model = _initialize_model(
  File "/home/runner/work/torchchat/torchchat/build/builder.py", line 433, in _initialize_model
    quantize_model(model, builder_args.device, quantize, tokenizer)
  File "/home/runner/work/torchchat/torchchat/quantize.py", line 58, in quantize_model
    ).quantized_model()
  File "/home/runner/work/torchchat/torchchat/quantize.py", line 679, in quantized_model
    return self.quantize(self.model)
  File "/opt/hostedtoolcache/Python/3.10.11/x64/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/runner/work/torchchat/torchchat/quantize.py", line 629, in quantize
    from torchao.quantization.quant_primitives import (
  File "/opt/hostedtoolcache/Python/3.10.11/x64/lib/python3.10/site-packages/torchao/__init__.py", line 14, in <module>
    from . import _C
ImportError: libcudart.so.12: cannot open shared object file: No such file or directory
Error: Process completed with exit code 1.
mikekgfb commented 4 months ago

Mitigated.

Asked for graceful handling of CUDA presence rather than a draconian choice between CUDA and no CUDA, since we have different operating modes: some call for CUDA to be absent, or at least not required (ET), and others call for it to be present (AOTI-CUDA).
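
For illustration, "graceful handling" could look roughly like the sketch below: try to import the compiled extension, and fall back to a pure-Python path if it (or a shared library it links against, such as libcudart) cannot be loaded. This is only a sketch of the general pattern, not the actual torchao fix; `optional_import`, `HAS_COMPILED_OPS` and the module name `hypothetical_pkg._C` are made-up names.

```python
import importlib
import logging
from types import ModuleType
from typing import Optional

log = logging.getLogger(__name__)


def optional_import(name: str) -> Optional[ModuleType]:
    """Import `name`, returning None instead of raising if it cannot be loaded.

    A compiled extension linked against a missing shared library
    (e.g. libcudart.so.12 on a CPU-only runner) surfaces as ImportError.
    """
    try:
        return importlib.import_module(name)
    except (ImportError, OSError) as exc:
        log.warning("optional module %r unavailable, falling back: %s", name, exc)
        return None


# Hypothetical name standing in for a compiled CUDA extension module.
_ext = optional_import("hypothetical_pkg._C")

# Callers branch on this flag instead of failing at import time.
HAS_COMPILED_OPS = _ext is not None
```

With a guard like this, a CPU-only export path (ET) can import the package without CUDA installed, while a CUDA path (AOTI-CUDA) can check the flag and raise a clear error only when the compiled ops are actually needed.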