[Apple M1 Max] TypeError: object.__new__() takes exactly one argument (the type to instantiate)

I am on Apple M1 Max:

$ git clone https://github.com/FMInference/FlexGen.git
$ cd FlexGen
$ conda create -n flexgen python=3.10
$ conda activate flexgen
$ pip install .
$ python -m flexgen.flex_opt --model facebook/opt-1.3b 
Downloading (…)okenizer_config.json: 100%|██████| 685/685 [00:00<00:00, 273kB/s]
Downloading (…)lve/main/config.json: 100%|██████| 651/651 [00:00<00:00, 172kB/s]
Downloading (…)olve/main/vocab.json: 100%|███| 899k/899k [00:00<00:00, 1.91MB/s]
Downloading (…)olve/main/merges.txt: 100%|███| 456k/456k [00:00<00:00, 1.19MB/s]
Downloading (…)cial_tokens_map.json: 100%|█████| 221/221 [00:00<00:00, 60.4kB/s]
Exception in thread Thread-2 (copy_worker_func):
Traceback (most recent call last):
  File "/Users/ondrej/mambaforge/envs/flexgen/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
Exception in thread Thread-3 (copy_worker_func):
Traceback (most recent call last):
  File "/Users/ondrej/mambaforge/envs/flexgen/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
Exception in thread Thread-4 (copy_worker_func):
Traceback (most recent call last):
  File "/Users/ondrej/mambaforge/envs/flexgen/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
Exception in thread Thread-5 (copy_worker_func):
Traceback (most recent call last):
  File "/Users/ondrej/mambaforge/envs/flexgen/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/Users/ondrej/mambaforge/envs/flexgen/lib/python3.10/threading.py", line 953, in run
    self.run()
    self.run()
Traceback (most recent call last):
  File "/Users/ondrej/mambaforge/envs/flexgen/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    self._target(*self._args, **self._kwargs)
  File "/Users/ondrej/repos/FlexGen/flexgen/pytorch_backend.py", line 879, in copy_worker_func
  File "/Users/ondrej/mambaforge/envs/flexgen/lib/python3.10/threading.py", line 953, in run
  File "/Users/ondrej/mambaforge/envs/flexgen/lib/python3.10/threading.py", line 953, in run
    self.run()
  File "/Users/ondrej/mambaforge/envs/flexgen/lib/python3.10/threading.py", line 953, in run
    return _run_code(code, main_globals, None,
  File "/Users/ondrej/mambaforge/envs/flexgen/lib/python3.10/runpy.py", line 86, in _run_code
    self._target(*self._args, **self._kwargs)
  File "/Users/ondrej/repos/FlexGen/flexgen/pytorch_backend.py", line 879, in copy_worker_func
    torch.cuda.set_device(cuda_id)
  File "/Users/ondrej/mambaforge/envs/flexgen/lib/python3.10/site-packages/torch/cuda/__init__.py", line 326, in set_device
    self._target(*self._args, **self._kwargs)
  File "/Users/ondrej/repos/FlexGen/flexgen/pytorch_backend.py", line 879, in copy_worker_func
    self._target(*self._args, **self._kwargs)
  File "/Users/ondrej/repos/FlexGen/flexgen/pytorch_backend.py", line 879, in copy_worker_func
    torch.cuda.set_device(cuda_id)
  File "/Users/ondrej/mambaforge/envs/flexgen/lib/python3.10/site-packages/torch/cuda/__init__.py", line 326, in set_device
    exec(code, run_globals)
  File "/Users/ondrej/repos/FlexGen/flexgen/flex_opt.py", line 1308, in <module>
    torch.cuda.set_device(cuda_id)
    torch._C._cuda_setDevice(device)
    torch._C._cuda_setDevice(device)
  File "/Users/ondrej/mambaforge/envs/flexgen/lib/python3.10/site-packages/torch/cuda/__init__.py", line 326, in set_device
AttributeError: module 'torch._C' has no attribute '_cuda_setDevice'
    torch.cuda.set_device(cuda_id)
  File "/Users/ondrej/mambaforge/envs/flexgen/lib/python3.10/site-packages/torch/cuda/__init__.py", line 326, in set_device
AttributeError: module 'torch._C' has no attribute '_cuda_setDevice'
    torch._C._cuda_setDevice(device)
    run_flexgen(args)
AttributeError: module 'torch._C' has no attribute '_cuda_setDevice'
  File "/Users/ondrej/repos/FlexGen/flexgen/flex_opt.py", line 1190, in run_flexgen
    torch._C._cuda_setDevice(device)
    model = OptLM(opt_config, env, args.path, policy)
  File "/Users/ondrej/repos/FlexGen/flexgen/flex_opt.py", line 612, in __init__
AttributeError: module 'torch._C' has no attribute '_cuda_setDevice'
    self.load_weight_stream = torch.cuda.Stream()
  File "/Users/ondrej/mambaforge/envs/flexgen/lib/python3.10/site-packages/torch/cuda/streams.py", line 34, in __new__
    return super(Stream, cls).__new__(cls, priority=priority, **kwargs)
TypeError: object.__new__() takes exactly one argument (the type to instantiate)

I assume it uses an NVIDIA GPU by default, which I don't have, so it fails. In that case it should fail early with a user-friendly error message instead of the tracebacks above.
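
As a rough sketch of the kind of check I have in mind (not FlexGen's actual code): the traceback shows the failure coming from `torch.cuda.set_device` and `torch.cuda.Stream()`, so a guard could run before either is called. The names `require_cuda` and `CudaUnavailableError` are hypothetical, and the availability check is passed in as a callable so the snippet does not depend on how PyTorch was installed.

```python
from typing import Callable


class CudaUnavailableError(RuntimeError):
    """Hypothetical error type for this sketch; not part of FlexGen or PyTorch."""


def require_cuda(cuda_is_available: Callable[[], bool]) -> None:
    """Raise a readable error if no CUDA device can be used.

    `cuda_is_available` is expected to be something like
    `torch.cuda.is_available`; it is injected here so the check
    itself stays independent of the PyTorch build.
    """
    if not cuda_is_available():
        raise CudaUnavailableError(
            "No CUDA device was found. This code path calls torch.cuda APIs "
            "(set_device, Stream), which need an NVIDIA GPU and a CUDA-enabled "
            "PyTorch build."
        )
```

It could then be called once at startup, for example `require_cuda(torch.cuda.is_available)` before the model is constructed, so that on a machine like mine the run stops with one clear message instead of several interleaved thread tracebacks.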
