h2oai / h2ogpt

Private chat with a local GPT over documents, images, video, and more. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
http://h2o.ai
Apache License 2.0

Error on fresh install #409

Closed · wanfuse123 closed this 1 year ago

wanfuse123 commented 1 year ago
Loaded 0 sources for potentially adding to UserData
The model weights are not tied. Please use the `tie_weights` method before using the `infer_auto_device` function.
device_map: {'': 0}
bin /home/top/h2ogpt/h2ogpt/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so
/home/top/h2ogpt/h2ogpt/lib/python3.10/site-packages/bitsandbytes/cextension.py:34: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
  warn("The installed version of bitsandbytes was compiled without GPU support. "

/home/top/h2ogpt/h2ogpt/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cadam32bit_grad_fp32
Loading checkpoint shards:   0%|                          | 0/5 [00:01<?, ?it/s]
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /home/top/h2ogpt/generate.py:16 in <module>                                  │
│                                                                              │
│   13                                                                         │
│   14                                                                         │
│   15 if __name__ == "__main__":                                              │
│ ❱ 16 │   entrypoint_main()                                                   │
│   17                                                                         │
│                                                                              │
│ /home/top/h2ogpt/generate.py:12 in entrypoint_main                           │
│                                                                              │
│    9                                                                         │
│   10                                                                         │
│   11 def entrypoint_main():                                                  │
│ ❱ 12 │   fire.Fire(main)                                                     │
│   13                                                                         │
│   14                                                                         │
│   15 if __name__ == "__main__":                                              │
│                                                                              │
│ /home/top/h2ogpt/h2ogpt/lib/python3.10/site-packages/fire/core.py:141 in     │
│ Fire                                                                         │
│                                                                              │
│   138 │   context.update(caller_globals)                                     │
│   139 │   context.update(caller_locals)                                      │
│   140                                                                        │
│ ❱ 141   component_trace = _Fire(component, args, parsed_flag_args, context,  │
│   142                                                                        │
│   143   if component_trace.HasError():                                       │
│   144 │   _DisplayError(component_trace)                                     │
│                                                                              │
│ /home/top/h2ogpt/h2ogpt/lib/python3.10/site-packages/fire/core.py:475 in     │
│ _Fire                                                                        │
│                                                                              │
│   472 │     is_class = inspect.isclass(component)                            │
│   473 │                                                                      │
│   474 │     try:                                                             │
│ ❱ 475 │   │   component, remaining_args = _CallAndUpdateTrace(               │
│   476 │   │   │   component,                                                 │
│   477 │   │   │   remaining_args,                                            │
│   478 │   │   │   component_trace,                                           │
│                                                                              │
│ /home/top/h2ogpt/h2ogpt/lib/python3.10/site-packages/fire/core.py:691 in     │
│ _CallAndUpdateTrace                                                          │
│                                                                              │
│   688 │   loop = asyncio.get_event_loop()                                    │
│   689 │   component = loop.run_until_complete(fn(*varargs, **kwargs))        │
│   690   else:                                                                │
│ ❱ 691 │   component = fn(*varargs, **kwargs)                                 │
│   692                                                                        │
│   693   if treatment == 'class':                                             │
│   694 │   action = trace.INSTANTIATED_CLASS                                  │
│                                                                              │
│ /home/top/h2ogpt/src/gen.py:618 in main                                      │
│                                                                              │
│    615 │   │   │   all_kwargs.update(dict(base_model=base_model1, tokenizer_ │
│    616 │   │   │   │   │   │   │   │      lora_weights=lora_weights1, infere │
│    617 │   │   │   if base_model1 and not login_mode_if_model0:              │
│ ❱  618 │   │   │   │   model0, tokenizer0, device = get_model(reward_type=Fa │
│    619 │   │   │   │   │   │   │   │   │   │   │   │   │      **get_kwargs(g │
│    620 │   │   │   │   │   │   │   │   │   │   │   │   │   │   │   │   │   * │
│    621 │   │   │   else:                                                     │
│                                                                              │
│ /home/top/h2ogpt/src/gen.py:937 in get_model                                 │
│                                                                              │
│    934 │   │   return model, tokenizer, device                               │
│    935 │                                                                     │
│    936 │   # get local torch-HF model                                        │
│ ❱  937 │   return get_hf_model(load_8bit=load_8bit,                          │
│    938 │   │   │   │   │   │   load_4bit=load_4bit,                          │
│    939 │   │   │   │   │   │   load_half=load_half,                          │
│    940 │   │   │   │   │   │   infer_devices=infer_devices,                  │
│                                                                              │
│ /home/top/h2ogpt/src/gen.py:1045 in get_hf_model                             │
│                                                                              │
│   1042 │   │   │   │                                                         │
│   1043 │   │   │   │   if infer_devices:                                     │
│   1044 │   │   │   │   │   config, model = get_config(base_model, return_mod │
│ ❱ 1045 │   │   │   │   │   model = get_non_lora_model(base_model, model_load │
│   1046 │   │   │   │   │   │   │   │   │   │   │      config, model,         │
│   1047 │   │   │   │   │   │   │   │   │   │   │      gpu_id=gpu_id,         │
│   1048 │   │   │   │   │   │   │   │   │   │   │      )                      │
│                                                                              │
│ /home/top/h2ogpt/src/gen.py:767 in get_non_lora_model                        │
│                                                                              │
│    764 │   pop_unused_model_kwargs(model_kwargs)                             │
│    765 │                                                                     │
│    766 │   if load_in_8bit or load_in_4bit or not load_half:                 │
│ ❱  767 │   │   model = model_loader.from_pretrained(                         │
│    768 │   │   │   base_model,                                               │
│    769 │   │   │   config=config,                                            │
│    770 │   │   │   **model_kwargs,                                           │
│                                                                              │
│ /home/top/h2ogpt/h2ogpt/lib/python3.10/site-packages/transformers/models/aut │
│ o/auto_factory.py:484 in from_pretrained                                     │
│                                                                              │
│   481 │   │   │   )                                                          │
│   482 │   │   elif type(config) in cls._model_mapping.keys():                │
│   483 │   │   │   model_class = _get_model_class(config, cls._model_mapping) │
│ ❱ 484 │   │   │   return model_class.from_pretrained(                        │
│   485 │   │   │   │   pretrained_model_name_or_path, *model_args, config=con │
│   486 │   │   │   )                                                          │
│   487 │   │   raise ValueError(                                              │
│                                                                              │
│ /home/top/h2ogpt/h2ogpt/lib/python3.10/site-packages/transformers/modeling_u │
│ tils.py:2881 in from_pretrained                                              │
│                                                                              │
│   2878 │   │   │   │   mismatched_keys,                                      │
│   2879 │   │   │   │   offload_index,                                        │
│   2880 │   │   │   │   error_msgs,                                           │
│ ❱ 2881 │   │   │   ) = cls._load_pretrained_model(                           │
│   2882 │   │   │   │   model,                                                │
│   2883 │   │   │   │   state_dict,                                           │
│   2884 │   │   │   │   loaded_state_dict_keys,  # XXX: rename?               │
│                                                                              │
│ /home/top/h2ogpt/h2ogpt/lib/python3.10/site-packages/transformers/modeling_u │
│ tils.py:3228 in _load_pretrained_model                                       │
│                                                                              │
│   3225 │   │   │   │   )                                                     │
│   3226 │   │   │   │                                                         │
│   3227 │   │   │   │   if low_cpu_mem_usage:                                 │
│ ❱ 3228 │   │   │   │   │   new_error_msgs, offload_index, state_dict_index = │
│   3229 │   │   │   │   │   │   model_to_load,                                │
│   3230 │   │   │   │   │   │   state_dict,                                   │
│   3231 │   │   │   │   │   │   loaded_keys,                                  │
│                                                                              │
│ /home/top/h2ogpt/h2ogpt/lib/python3.10/site-packages/transformers/modeling_u │
│ tils.py:728 in _load_state_dict_into_meta_model                              │
│                                                                              │
│    725 │   │   │   │   fp16_statistics = None                                │
│    726 │   │   │                                                             │
│    727 │   │   │   if "SCB" not in param_name:                               │
│ ❱  728 │   │   │   │   set_module_quantized_tensor_to_device(                │
│    729 │   │   │   │   │   model, param_name, param_device, value=param, fp1 │
│    730 │   │   │   │   )                                                     │
│    731                                                                       │
│                                                                              │
│ /home/top/h2ogpt/h2ogpt/lib/python3.10/site-packages/transformers/utils/bits │
│ andbytes.py:89 in set_module_quantized_tensor_to_device                      │
│                                                                              │
│    86 │   │   │                                                              │
│    87 │   │   │   kwargs = old_value.__dict__                                │
│    88 │   │   │   if is_8bit:                                                │
│ ❱  89 │   │   │   │   new_value = bnb.nn.Int8Params(new_value, requires_grad │
│    90 │   │   │   elif is_4bit:                                              │
│    91 │   │   │   │   new_value = bnb.nn.Params4bit(new_value, requires_grad │
│    92                                                                        │
│                                                                              │
│ /home/top/h2ogpt/h2ogpt/lib/python3.10/site-packages/bitsandbytes/nn/modules │
│ .py:294 in to                                                                │
│                                                                              │
│   291 │   │   │   and device.type == "cuda"                                  │
│   292 │   │   │   and self.data.device.type == "cpu"                         │
│   293 │   │   ):                                                             │
│ ❱ 294 │   │   │   return self.cuda(device)                                   │
│   295 │   │   else:                                                          │
│   296 │   │   │   new_param = Int8Params(                                    │
│   297 │   │   │   │   super().to(                                            │
│                                                                              │
│ /home/top/h2ogpt/h2ogpt/lib/python3.10/site-packages/bitsandbytes/nn/modules │
│ .py:258 in cuda                                                              │
│                                                                              │
│   255 │   │   │   # we store the 8-bit rows-major weight                     │
│   256 │   │   │   # we convert this weight to the turning/ampere weight duri │
│   257 │   │   │   B = self.data.contiguous().half().cuda(device)             │
│ ❱ 258 │   │   │   CB, CBt, SCB, SCBt, coo_tensorB = bnb.functional.double_qu │
│   259 │   │   │   del CBt                                                    │
│   260 │   │   │   del SCBt                                                   │
│   261 │   │   │   self.data = CB                                             │
│                                                                              │
│ /home/top/h2ogpt/h2ogpt/lib/python3.10/site-packages/bitsandbytes/functional │
│ .py:1987 in double_quant                                                     │
│                                                                              │
│   1984 │   │   rows = A.shape[0]                                             │
│   1985 │                                                                     │
│   1986 │   if row_stats is None or col_stats is None:                        │
│ ❱ 1987 │   │   row_stats, col_stats, nnz_row_ptr = get_colrow_absmax(        │
│   1988 │   │   │   A, threshold=threshold                                    │
│   1989 │   │   )                                                             │
│   1990                                                                       │
│                                                                              │
│ /home/top/h2ogpt/h2ogpt/lib/python3.10/site-packages/bitsandbytes/functional │
│ .py:1876 in get_colrow_absmax                                                │
│                                                                              │
│   1873 │                                                                     │
│   1874 │   prev_device = pre_call(A.device)                                  │
│   1875 │   is_on_gpu([A, row_stats, col_stats, nnz_block_ptr])               │
│ ❱ 1876 │   lib.cget_col_row_stats(ptrA, ptrRowStats, ptrColStats, ptrNnzrows │
│   1877 │   post_call(prev_device)                                            │
│   1878 │                                                                     │
│   1879 │   if threshold > 0.0:                                               │
│                                                                              │
│ /usr/lib/python3.10/ctypes/__init__.py:387 in __getattr__                    │
│                                                                              │
│   384 │   def __getattr__(self, name):                                       │
│   385 │   │   if name.startswith('__') and name.endswith('__'):              │
│   386 │   │   │   raise AttributeError(name)                                 │
│ ❱ 387 │   │   func = self.__getitem__(name)                                  │
│   388 │   │   setattr(self, name, func)                                      │
│   389 │   │   return func                                                    │
│   390                                                                        │
│                                                                              │
│ /usr/lib/python3.10/ctypes/__init__.py:392 in __getitem__                    │
│                                                                              │
│   389 │   │   return func                                                    │
│   390 │                                                                      │
│   391 │   def __getitem__(self, name_or_ordinal):                            │
│ ❱ 392 │   │   func = self._FuncPtr((name_or_ordinal, self))                  │
│   393 │   │   if not isinstance(name_or_ordinal, int):                       │
│   394 │   │   │   func.__name__ = name_or_ordinal                            │
│   395 │   │   return func                                                    │
╰──────────────────────────────────────────────────────────────────────────────╯
AttributeError: /home/top/h2ogpt/h2ogpt/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cget_col_row_stats

pseudotensor commented 1 year ago

Looks like you are using Linux(?), or is this on Windows?

It's loading /home/top/h2ogpt/h2ogpt/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so, so bitsandbytes isn't using the GPU. Something is wrong with that installation.

You can try: https://github.com/TimDettmers/bitsandbytes/issues/156#issuecomment-1462329713

You could also rely on conda's cudatoolkit:

conda install cudatoolkit

Let me know if this solves your problem and I can add it to docs.
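To make sense of the `undefined symbol: cget_col_row_stats` failure above: ctypes resolves exported symbols from a shared library lazily, on first attribute access, and the CPU-only `libbitsandbytes_cpu.so` simply doesn't export the CUDA kernels, which is why the AttributeError only surfaces deep inside model loading. A minimal stdlib sketch of that resolution behavior (the helper name is illustrative, not from bitsandbytes):

```python
import ctypes

def library_has_symbol(lib: ctypes.CDLL, symbol: str) -> bool:
    """Return True if the loaded shared library exports `symbol`.

    ctypes looks symbols up on first attribute access and raises
    AttributeError if the export is missing -- the same failure mode as
    calling cget_col_row_stats against the CPU-only bitsandbytes binary.
    """
    try:
        getattr(lib, symbol)
        return True
    except AttributeError:
        return False

# Demonstrate against the current process's symbols (dlopen(NULL));
# with bitsandbytes installed you would load its libbitsandbytes_*.so instead.
process = ctypes.CDLL(None)
print(library_has_symbol(process, "printf"))              # C runtime symbol, present on glibc builds
print(library_has_symbol(process, "cget_col_row_stats"))  # missing, like the CPU-only .so
```

If the GPU build were correctly installed, the CUDA variant of the library would export `cget_col_row_stats` and the call in `bitsandbytes/functional.py` would succeed.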

DhavalWI commented 1 year ago

> Looks like you are using linux(?) but is that in windows?
>
> It's showing it's loading: /home/top/h2ogpt/h2ogpt/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so so bitsandbytes isn't using GPU. Something is wrong with that installation.
>
> You can try: TimDettmers/bitsandbytes#156 (comment)
>
> and also perhaps rely upon conda's cudatoolkit.
>
> conda install cudatoolkit
>
> Let me know if this solves your problem and I can add it to docs.

Can you please provide a step-by-step process? I am unable to resolve the issue from the given instructions.

pseudotensor commented 1 year ago

Did you follow the Windows instructions? The lines regarding the CUDA toolkit, etc. are there.

DhavalWI commented 1 year ago

> Did you follow the windows instructions? Those lines w.r.t. cuda toolkit etc. are there.

Yeah, I got it working now; the issue was with the CUDA version on my Linux machine. Thanks!
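CUDA-version mismatches like this usually come down to the driver's maximum supported CUDA runtime (reported by `nvidia-smi`) being older than the toolkit the wheels were built against. A minimal stdlib sketch of that comparison (function name and version values are illustrative, not taken from this thread):

```python
def cuda_version_at_least(installed: str, required: str) -> bool:
    """Compare dotted CUDA version strings, e.g. '11.8' vs '11.7'.

    Illustrative helper only: in practice `installed` would come from the
    driver (nvidia-smi's "CUDA Version" field) and `required` would be the
    toolkit version the torch/bitsandbytes wheels were built against.
    """
    def as_tuple(v: str) -> tuple:
        return tuple(int(part) for part in v.split("."))
    return as_tuple(installed) >= as_tuple(required)

# A driver supporting CUDA 11.8 can run 11.7-built wheels, but a 10.2-era
# driver cannot run CUDA 11.x builds -- failures then surface at load time.
print(cuda_version_at_least("11.8", "11.7"))  # True
print(cuda_version_at_least("10.2", "11.0"))  # False
```

When the check fails, the fix is either upgrading the NVIDIA driver or installing wheels built for an older CUDA toolkit.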

Mathanraj-Sharma commented 1 year ago

@DhavalWI I hope your problem is solved. Is it okay to close this issue?

DhavalWI commented 1 year ago

> @DhavalWI hope your problem is solved is it okay to close this issue?

Yes, the issue is resolved. Thanks!

pseudotensor commented 1 year ago

@DhavalWI Any advice for improving the docs? Should we say a specific CUDA or NVIDIA driver version is required? What version did you have?

pseudotensor commented 1 year ago

I revamped and fully tested the Windows docs; please try again: https://github.com/h2oai/h2ogpt/blob/main/docs/README_WINDOWS.md