Closed: xiezhq-hermann closed this issue 1 year ago.
I tried.
0af9051 * main Merge branch 'xiezhq-hermann/main'
|\
18482fa | * xiezhq-hermann/main update CPU and m1/m2
980ca74 | * merge latest main
| |\
332849a | * | enable CPU and M1/M2 platform
fea8321 * | | origin/main update version
896e1e0 * | | Update README.md
50ae8ad * | | Delete README.md
9d888e5 * | Move apps into flexgen package (#70)
Something seems wrong.
ppa-hirano:FlexGen hirano-s$ python3 -m flexgen.flex_opt --model facebook/opt-1.3b
model size: 2.443 GB, cache size: 0.398 GB, hidden size (prefill): 0.008 GB
init weight...
Exception in thread Thread-1 (copy_worker_func):
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/hirano-s/dev/FlexGen/flexgen/pytorch_backend.py", line 917, in copy_worker_func
    torch.cuda.set_device(device_id)
  File "/opt/homebrew/lib/python3.10/site-packages/torch/cuda/__init__.py", line 326, in set_device
    torch._C._cuda_setDevice(device)
AttributeError: module 'torch._C' has no attribute '_cuda_setDevice'
(the same traceback is printed for Thread-2, Thread-3, and Thread-4)
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/Users/hirano-s/dev/FlexGen/flexgen/flex_opt.py", line 1334, in <module>
    run_flexgen(args)
  File "/Users/hirano-s/dev/FlexGen/flexgen/flex_opt.py", line 1218, in run_flexgen
    model = OptLM(opt_config, env, args.path, policy)
  File "/Users/hirano-s/dev/FlexGen/flexgen/flex_opt.py", line 617, in __init__
    self.load_weight_stream = torch.cuda.Stream()
  File "/opt/homebrew/lib/python3.10/site-packages/torch/cuda/streams.py", line 34, in __new__
    return super(Stream, cls).__new__(cls, priority=priority, **kwargs)
TypeError: object.__new__() takes exactly one argument (the type to instantiate)
Exception ignored in: <function OptLM.__del__ at 0x11b250040>
Traceback (most recent call last):
  File "/Users/hirano-s/dev/FlexGen/flexgen/flex_opt.py", line 1148, in __del__
    self.delete_all_weights()
  File "/Users/hirano-s/dev/FlexGen/flexgen/flex_opt.py", line 803, in delete_all_weights
    self.delete_weight(j, 0)
  File "/Users/hirano-s/dev/FlexGen/flexgen/flex_opt.py", line 669, in delete_weight
    for x in self.weight_home[j].pop():
AttributeError: 'OptLM' object has no attribute 'weight_home'
ppa-hirano:FlexGen hirano-s$
@HIRANO-Satoshi Did you run the code on your Mac machine? If so, you should add --platform "mps:0"
to the command. If you tested it on a machine with an NVIDIA GPU, can you try the latest commit (I merged it for you) and rebuild FlexGen? I am not sure which code you just ran.
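For example, the earlier command would become the following (assuming the merged branch exposes the --platform flag on flexgen.flex_opt, as described above):

python3 -m flexgen.flex_opt --model facebook/opt-1.3b --platform "mps:0"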
A quick workaround:

def delete_weight(self, j, k):
    if k == 0 and getattr(self, 'weight_home', None):
        for x in self.weight_home[j].pop():
            ...  # rest of the original loop body unchanged
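The secondary AttributeError shows up because OptLM.__init__ raises before weight_home is ever assigned, so the cleanup in __del__ touches a missing attribute. A minimal alternative sketch, guarding in __del__ itself rather than in delete_weight (names taken from the traceback above; untested):

def __del__(self):
    # Skip cleanup entirely if __init__ failed before weight_home was set.
    if hasattr(self, 'weight_home'):
        self.delete_all_weights()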
But another one appears.

  File "/Users/hirano-s/dev/FlexGen/flexgen/pytorch_backend.py", line 917, in copy_worker_func
    torch.cuda.set_device(device_id)
  File "/opt/homebrew/lib/python3.10/site-packages/torch/cuda/__init__.py", line 326, in set_device
    torch._C._cuda_setDevice(device)
AttributeError: module 'torch._C' has no attribute '_cuda_setDevice'
(the same copy_worker_func traceback is printed for the other copy threads)
    run_flexgen(args)
  File "/Users/hirano-s/dev/FlexGen/flexgen/flex_opt.py", line 1218, in run_flexgen
    model = OptLM(opt_config, env, args.path, policy)
  File "/Users/hirano-s/dev/FlexGen/flexgen/flex_opt.py", line 617, in __init__
    self.load_weight_stream = torch.cuda.Stream()
  File "/opt/homebrew/lib/python3.10/site-packages/torch/cuda/streams.py", line 34, in __new__
    return super(Stream, cls).__new__(cls, priority=priority, **kwargs)
TypeError: object.__new__() takes exactly one argument (the type to instantiate)
I don't have an NVIDIA GPU. With --platform cpu, it starts working. Thanks much!
Maybe apps/completion.py also needs the --platform option.
ppa-hirano:FlexGen hirano-s$ python3 -m flexgen.apps.completion --model facebook/opt-1.3b
...
File "/opt/homebrew/lib/python3.10/site-packages/torch/cuda/__init__.py", line 326, in set_device
torch._C._cuda_setDevice(device)
AttributeError: module 'torch._C' has no attribute '_cuda_setDevice'
ppa-hirano:FlexGen hirano-s$ python3 -m flexgen.apps.completion --model facebook/opt-1.3b --platform cpu
usage: completion.py [-h] [--model MODEL] [--path PATH] [--offload-dir OFFLOAD_DIR] [--percent PERCENT [PERCENT ...]] [--pin-weight [PIN_WEIGHT]] [--compress-weight] [--compress-cache]
completion.py: error: unrecognized arguments: --platform cpu
A sensible default that works without an explicit option would be better.
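Such a default could plausibly be derived from what the installed PyTorch supports. A minimal sketch, assuming FlexGen accepts the platform strings used above ('cpu', 'mps:0') plus 'cuda:0' for NVIDIA:

import torch

def default_platform() -> str:
    # Prefer CUDA, then Apple's MPS backend, then plain CPU.
    if torch.cuda.is_available():
        return "cuda:0"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps:0"
    return "cpu"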
I'm curious how fast the Apple Neural Engine is.
This is a minimal modification to extend FlexGen to CPU and M1/M2 GPU platforms. It is not yet fully tested with various offloading settings. @Ying1123 @merrymercy