[Open] Timmaaahh opened this issue 3 weeks ago
Same problem. Have you found a solution?
Does anyone have any suggestions on how to fix this issue when running Omost locally?
I found a solution, though it's in Chinese :) https://github.com/lllyasviel/Omost/issues/4#issuecomment-2152292622
I applied it on my machine and everything now works. The only difference was that I had no python folder inside my Omost folder, so I worked with the venv folder instead. Otherwise, everything is exactly as our Chinese friend described :))

Translated from the Chinese:
Try the following fix; it resolved the same problem for me:

Step 1. Edit pipeline.py
Open pipeline.py and find the following line of code:
alphas_cumprod = torch.tensor(np.cumprod(alphas, axis=0), dtype=torch.float32)
Change it to:
alphas_cumprod = torch.tensor(np.cumprod(alphas, axis=0), dtype=torch.float32).clone().detach()
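For anyone curious why Step 1 helps: when the value handed to torch.tensor() is already a tensor, PyTorch emits a copy-construct warning and recommends the sourceTensor.clone().detach() spelling; the patched line also guarantees the schedule tensor is a fresh copy with no autograd history. A minimal sketch of the two forms (the schedule values here are illustrative, not Omost's):

```python
import numpy as np
import torch

# Illustrative noise schedule; Omost's real alphas differ.
alphas = 1.0 - np.linspace(0.0001, 0.02, 10)

# Original form: builds the tensor directly from the cumulative product.
a1 = torch.tensor(np.cumprod(alphas, axis=0), dtype=torch.float32)

# Patched form from Step 1: clone().detach() returns an independent
# copy that carries no autograd history.
a2 = torch.tensor(np.cumprod(alphas, axis=0), dtype=torch.float32).clone().detach()

print(torch.equal(a1, a2))  # the values are identical either way
```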
Step 2. Edit modeling_llama.py
Find modeling_llama.py; it is usually located in E:\Omost\python\lib\site-packages\transformers\models\llama\
Find the following line of code:
causal_mask = torch.triu(causal_mask, diagonal=1)
Change it to:
causal_mask = torch.triu(causal_mask.to(torch.float32), diagonal=1)
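The Step 2 edit upcasts the mask to float32 before torch.triu. A small sketch of what that line does, using a stand-in mask (the real one is built inside modeling_llama.py from the model's dtype):

```python
import torch

# Stand-in causal mask in half precision; illustrative only.
causal_mask = torch.full((4, 4), torch.finfo(torch.float16).min, dtype=torch.float16)

# Patched form from Step 2: cast to float32 first, then keep only the
# strictly upper triangle (the future positions that must stay masked).
causal_mask = torch.triu(causal_mask.to(torch.float32), diagonal=1)

print(causal_mask.dtype)  # torch.float32
```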
Hey Guys,
When I launch Omost locally and type my request, the program looks like it is loading and generating, but then it suddenly stops with the following error in the CLI:
Exception in thread Thread-9 (generate):
Traceback (most recent call last):
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\threading.py", line 1009, in _bootstrap_inner
    self.run()
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\threading.py", line 946, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\generation\utils.py", line 1758, in generate
    result = self._sample(
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\generation\utils.py", line 2397, in _sample
    outputs = self(
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\models\llama\modeling_llama.py", line 1164, in forward
    outputs = self.model(
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\models\llama\modeling_llama.py", line 968, in forward
    layer_outputs = decoder_layer(
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\models\llama\modeling_llama.py", line 713, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\models\llama\modeling_llama.py", line 649, in forward
    attn_output = torch.nn.functional.scaled_dot_product_attention(
RuntimeError: cutlassF: no kernel found to launch!
Traceback (most recent call last):
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\queueing.py", line 528, in process_events
    response = await route_utils.call_process_api(
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\route_utils.py", line 270, in call_process_api
    output = await app.get_blocks().process_api(
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\blocks.py", line 1908, in process_api
    result = await self.call_function(
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\blocks.py", line 1497, in call_function
    prediction = await utils.async_iteration(iterator)
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\utils.py", line 632, in async_iteration
    return await iterator.__anext__()
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\utils.py", line 758, in asyncgen_wrapper
    response = await iterator.__anext__()
  File "C:\Users\timde\Omost\chat_interface.py", line 554, in _stream_fn
    first_response, first_interrupter = await async_iteration(generator)
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\utils.py", line 632, in async_iteration
    return await iterator.__anext__()
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\utils.py", line 625, in __anext__
    return await anyio.to_thread.run_sync(
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\utils.py", line 608, in run_sync_iterator_async
    return next(iterator)
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "C:\Users\timde\Omost\gradio_app.py", line 164, in chat_fn
    for text in streamer:
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\generation\streamers.py", line 223, in __next__
    value = self.text_queue.get(timeout=self.timeout)
  File "C:\Users\timde\AppData\Local\Programs\Python\Python310\lib\queue.py", line 179, in get
    raise Empty
_queue.Empty

Last assistant response is not valid canvas: expected string or bytes-like object

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results. Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.

On the next attempt the same pair of tracebacks is printed again: Exception in thread Thread-10 (generate), ending in RuntimeError: cutlassF: no kernel found to launch!, followed by the same Gradio traceback ending in _queue.Empty and the same "Last assistant response is not valid canvas" message.

Does anyone have any suggestions on how to fix this issue when running Omost locally? Thanks in advance.
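For anyone tracing this further: the last frame in the generate-thread traceback is torch.nn.functional.scaled_dot_product_attention, and "cutlassF: no kernel found to launch!" means none of PyTorch's fused attention kernels accepted the given dtype/device combination. A minimal sketch of that call with illustrative shapes (on CPU it completes normally, which is why the error only shows up on some GPU setups):

```python
import torch
import torch.nn.functional as F

# Illustrative (batch, heads, seq_len, head_dim) shapes; not Omost's real sizes.
q = k = v = torch.randn(1, 8, 16, 64)

# The call that raises "cutlassF: no kernel found to launch!" in the
# traceback above when no fused CUDA kernel matches the inputs.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)

print(out.shape)  # torch.Size([1, 8, 16, 64])
```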