henk717 / KoboldAI

KoboldAI is generative AI software optimized for fictional use, but capable of much more!
http://koboldai.com
GNU Affero General Public License v3.0

Error when using IPEX #456

Closed: Jacoby1218 closed this issue 11 months ago

Jacoby1218 commented 12 months ago

Traceback (most recent call last):
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/flask/app.py", line 2528, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/flask/app.py", line 1825, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/flask/app.py", line 1823, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "aiserver.py", line 906, in g
    return f(*args, **kwargs)
  File "aiserver.py", line 828, in decorated
    response = f(schema, *args, **kwargs)
  File "aiserver.py", line 811, in decorated
    raise e
  File "aiserver.py", line 802, in decorated
    return f(*args, **kwargs)
  File "aiserver.py", line 8494, in post_generate
    return _generate_text(body)
  File "aiserver.py", line 8359, in _generate_text
    genout = apiactionsubmit(body.prompt, use_memory=body.use_memory, use_story=body.use_story, use_world_info=body.use_world_info, use_authors_note=body.use_authors_note)
  File "aiserver.py", line 3576, in apiactionsubmit
    genout = apiactionsubmit_generate(tokens, minimum, maximum)
  File "aiserver.py", line 3467, in apiactionsubmit_generate
    _genout, already_generated = tpool.execute(model.core_generate, txt, set())
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/eventlet/tpool.py", line 132, in execute
    six.reraise(c, e, tb)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/six.py", line 719, in reraise
    raise value
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/eventlet/tpool.py", line 86, in tworker
    rv = meth(*args, **kwargs)
  File "/home/jacob/KoboldAI/modeling/inference_model.py", line 356, in core_generate
    result = self.raw_generate(
  File "/home/jacob/KoboldAI/modeling/inference_model.py", line 629, in raw_generate
    result = self._raw_generate(
  File "/home/jacob/KoboldAI/modeling/inference_models/hf_torch.py", line 344, in _raw_generate
    genout = self.model.generate(
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/transformers/generation/utils.py", line 1648, in generate
    return self.sample(
  File "/home/jacob/KoboldAI/modeling/inference_models/hf_torch.py", line 267, in new_sample
    return new_sample.old_sample(self, *args, **kwargs)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/transformers/generation/utils.py", line 2730, in sample
    outputs = self(
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 820, in forward
    outputs = self.model(
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 708, in forward
    layer_outputs = decoder_layer(
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 421, in forward
    hidden_states = self.input_layernorm(hidden_states)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/accelerate/hooks.py", line 160, in new_forward
    args, kwargs = module._hf_hook.pre_forward(module, *args, **kwargs)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/accelerate/hooks.py", line 282, in pre_forward
    set_module_tensor_to_device(module, name, self.execution_device, value=self.weights_map[name])
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/accelerate/utils/offload.py", line 123, in __getitem__
    return self.dataset[f"{self.prefix}{key}"]
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/accelerate/utils/offload.py", line 176, in __getitem__
    logger.info("Enabling fast loading with safetensors by setting `SAFETENSORS_FAST_GPU` to 1.")
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/logging/__init__.py", line 1806, in info
    self.log(INFO, msg, *args, **kwargs)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/accelerate/logging.py", line 51, in log
    raise RuntimeError(
RuntimeError: You must initialize the accelerate state by calling either `PartialState()` or `Accelerator()` before using the logging utility.
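The error is raised by accelerate's logging wrapper, which refuses to log anything before the distributed state has been initialized. A minimal Python sketch of that guard pattern (simplified for illustration; this is not accelerate's actual code):

```python
import logging


class StateAwareLogger:
    """Simplified sketch of a logger that demands explicit state
    initialization first, mirroring the RuntimeError in the traceback."""

    _state_initialized = False  # stand-in for PartialState()/Accelerator() having run

    def __init__(self, name):
        self._logger = logging.getLogger(name)

    @classmethod
    def initialize_state(cls):
        # Stand-in for calling PartialState() or Accelerator().
        cls._state_initialized = True

    def info(self, msg):
        if not self._state_initialized:
            raise RuntimeError(
                "You must initialize the accelerate state by calling either "
                "`PartialState()` or `Accelerator()` before using the logging utility."
            )
        self._logger.info(msg)


log = StateAwareLogger("example")
try:
    log.info("loading weights")  # raises: state not yet initialized
except RuntimeError as exc:
    print("caught:", exc)

StateAwareLogger.initialize_state()
log.info("loading weights")  # succeeds once the state exists
```

The practical upshot is that any code path hitting accelerate's logger before model setup (here, the safetensors offload hint) triggers this error instead of the log message.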
Disty0 commented 12 months ago

I can't replicate this. Also, please post the full logs.

Pushed general fixes for IPEX, can you try again?

https://github.com/henk717/KoboldAI/pull/457

Jacoby1218 commented 12 months ago

Those fixes resolved this issue; now I'm getting `RuntimeError: The size of tensor a (2048) must match the size of tensor b (3313) at non-singleton dimension 3`. (I tried Pygmalion 2.7b, since without quantization it's one of the only models I can fit into VRAM.)
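The 2048-vs-3313 mismatch is the classic symptom of feeding more tokens than the model's trained context window (2048 here): the attention mask and cached positions only cover 2048 slots. A hedged sketch of left-truncating a prompt so it fits (token IDs shown as plain ints; a real tokenizer would produce them):

```python
def clamp_prompt(token_ids, model_max_length, max_new_tokens):
    """Keep only the most recent tokens so that prompt + generation
    fits inside the model's context window."""
    budget = model_max_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for the prompt")
    # Drop tokens from the left (oldest context) when over budget.
    return token_ids[-budget:]


prompt = list(range(3313))  # 3313 tokens, like the failing request
clamped = clamp_prompt(prompt, model_max_length=2048, max_new_tokens=80)
print(len(clamped))  # 1968: fits with room left to generate
```

Frontends normally do this truncation before submitting the request, which is why the mismatch only appears when the client and server disagree about the context size.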

Jacoby1218 commented 12 months ago

onednn_verbose,info,cpu,runtime:threadpool,nthr:8
onednn_verbose,info,cpu,isa:Intel AVX2
onednn_verbose,info,gpu,runtime:DPC++
onednn_verbose,info,gpu,engine,0,backend:Level Zero,name:Intel(R) Graphics [0x56a0],driver_version:1.3.26241,binary_kernels:enabled
onednn_verbose,info,experimental features are enabled
onednn_verbose,info,use batch_normalization stats one pass is enabled
onednn_verbose,info,prim_template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
onednn_verbose,error,ocl,Error during the build of OpenCL program. Build log:

onednn_verbose,error,ocl,errcode -30,CL_INVALID_VALUE,src/gpu/ocl/ocl_gpu_engine.cpp:291
2023-09-07 08:57:37,478 - KoboldAI - ERROR - Exception on /api/v1/generate [POST]
Traceback (most recent call last):
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/flask/app.py", line 2528, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/flask/app.py", line 1825, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/flask/app.py", line 1823, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "aiserver.py", line 906, in g
    return f(*args, **kwargs)
  File "aiserver.py", line 828, in decorated
    response = f(schema, *args, **kwargs)
  File "aiserver.py", line 811, in decorated
    raise e
  File "aiserver.py", line 802, in decorated
    return f(*args, **kwargs)
  File "aiserver.py", line 8494, in post_generate
    return _generate_text(body)
  File "aiserver.py", line 8359, in _generate_text
    genout = apiactionsubmit(body.prompt, use_memory=body.use_memory, use_story=body.use_story, use_world_info=body.use_world_info, use_authors_note=body.use_authors_note)
  File "aiserver.py", line 3576, in apiactionsubmit
    genout = apiactionsubmit_generate(tokens, minimum, maximum)
  File "aiserver.py", line 3467, in apiactionsubmit_generate
    _genout, already_generated = tpool.execute(model.core_generate, txt, set())
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/eventlet/tpool.py", line 132, in execute
    six.reraise(c, e, tb)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/six.py", line 719, in reraise
    raise value
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/eventlet/tpool.py", line 86, in tworker
    rv = meth(*args, **kwargs)
  File "/home/jacob/KoboldAI/modeling/inference_model.py", line 356, in core_generate
    result = self.raw_generate(
  File "/home/jacob/KoboldAI/modeling/inference_model.py", line 629, in raw_generate
    result = self._raw_generate(
  File "/home/jacob/KoboldAI/modeling/inference_models/hf_torch.py", line 344, in _raw_generate
    genout = self.model.generate(
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/transformers/generation/utils.py", line 1648, in generate
    return self.sample(
  File "/home/jacob/KoboldAI/modeling/inference_models/hf_torch.py", line 267, in new_sample
    return new_sample.old_sample(self, *args, **kwargs)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/transformers/generation/utils.py", line 2730, in sample
    outputs = self(
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/transformers/models/opt/modeling_opt.py", line 944, in forward
    outputs = self.model.decoder(
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/transformers/models/opt/modeling_opt.py", line 710, in forward
    layer_outputs = decoder_layer(
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/transformers/models/opt/modeling_opt.py", line 353, in forward
    hidden_states = self.fc1(hidden_states)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/jacob/KoboldAI/modeling/ipex/hijacks.py", line 22, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "/home/jacob/KoboldAI/modeling/ipex/hijacks.py", line 33, in __call__
    return self.__orig_func(*args, **kwargs)
  File "/home/jacob/KoboldAI/runtime/envs/koboldai-ipex/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: could not create a primitive
Jacoby1218 commented 12 months ago

OK, I give up; either something is fundamentally broken with my WSL2 install or IPEX is just a mess. I seem to be getting output now, but it's quite broken (see attached image).

Jacoby1218 commented 12 months ago

Turns out that, for some reason, my input being as long as it is (around 3k tokens from SillyTavern) is causing crashes. I tried a character card with no context and it worked; I think I may be silently running out of VRAM. Output still isn't working correctly, but at least it's outputting something now.

Disty0 commented 11 months ago

The max token length for Pygmalion 2.7b is 2048:

 "model_max_length": 2048,

https://huggingface.co/PygmalionAI/pygmalion-2.7b/blob/main/tokenizer_config.json
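The limit can be checked programmatically before submitting a request. A sketch that reads `model_max_length` from a local `tokenizer_config.json` (the path and fallback value are illustrative):

```python
import json
import tempfile
from pathlib import Path


def read_model_max_length(config_path, default=2048):
    """Return the tokenizer's declared context limit, if present."""
    path = Path(config_path)
    if not path.exists():
        return default
    config = json.loads(path.read_text())
    return int(config.get("model_max_length", default))


# Example with an inline config matching the Pygmalion 2.7b file:
with tempfile.TemporaryDirectory() as d:
    cfg = Path(d) / "tokenizer_config.json"
    cfg.write_text(json.dumps({"model_max_length": 2048}))
    print(read_model_max_length(cfg))  # 2048
```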

Jacoby1218 commented 11 months ago

Oh, so this doesn't do RoPE scaling like llama.cpp does; noted. The problem was on my end, then. I got it working.
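For context: llama.cpp can stretch a model's context window with RoPE scaling (linear position interpolation), compressing positions beyond the trained length back into the trained range instead of crashing. A toy sketch of the idea (not llama.cpp's implementation):

```python
def scale_positions(num_tokens, trained_ctx):
    """Linear RoPE position interpolation: compress positions so the
    longest sequence still maps inside the trained context window."""
    if num_tokens <= trained_ctx:
        return [float(p) for p in range(num_tokens)]
    scale = trained_ctx / num_tokens
    return [p * scale for p in range(num_tokens)]


# Without scaling, position 3312 would exceed a 2048-slot window;
# with interpolation it maps back inside it.
positions = scale_positions(3313, trained_ctx=2048)
print(max(positions) < 2048)  # True
```

The stock transformers code path used here applies no such scaling, so inputs simply must stay under `model_max_length`.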