fbauer-kunbus opened 2 hours ago
Same here:
(omnigen_env) PS C:\OmniGen> python app.py
Fetching 10 files: 100%|███████████████████████████████████████████████████████████████████████| 10/10 [00:00<?, ?it/s]
Loading safetensors
Traceback (most recent call last):
File "C:\OmniGen\app.py", line 9, in
Hi, @fbauer-kunbus @RodgerE1 , this is a bug, and I will fix it. However, this issue arises when your machine doesn't have a GPU and can only use the CPU for inference, which results in very slow generation speeds. We do not recommend using the CPU for inference.
No, this can't be it - I have a laptop 4090 with 16 GB of VRAM that has been running all my AI apps successfully for a long time, even in the same CUDA venv environments. I've never had this issue before.
Same here. I do have a GPU (a 4070), but I still get the same error.
I updated the code. You can clone the latest code and try it.
The first error has been solved, but I'm still getting one:
Traceback (most recent call last):
File "D:\Work\OmniGen\omni\Lib\site-packages\gradio\queueing.py", line 624, in process_events
response = await route_utils.call_process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Work\OmniGen\omni\Lib\site-packages\gradio\route_utils.py", line 323, in call_process_api
output = await app.get_blocks().process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Work\OmniGen\omni\Lib\site-packages\gradio\blocks.py", line 2018, in process_api
result = await self.call_function(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Work\OmniGen\omni\Lib\site-packages\gradio\blocks.py", line 1567, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Work\OmniGen\omni\Lib\site-packages\anyio\to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Work\OmniGen\omni\Lib\site-packages\anyio\_backends\_asyncio.py", line 2441, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "D:\Work\OmniGen\omni\Lib\site-packages\anyio\_backends\_asyncio.py", line 943, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Work\OmniGen\omni\Lib\site-packages\gradio\utils.py", line 846, in wrapper
response = f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "D:\Work\OmniGen\app.py", line 22, in generate_image
output = pipe(
^^^^^
File "D:\Work\OmniGen\omni\Lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "D:\Work\OmniGen\OmniGen\pipeline.py", line 278, in __call__
samples = scheduler(latents, func, model_kwargs, use_kv_cache=use_kv_cache, offload_kv_cache=offload_kv_cache)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Work\OmniGen\OmniGen\scheduler.py", line 156, in __call__
cache = [OmniGenCache(num_tokens_for_img, offload_kv_cache) for _ in range(len(model_kwargs['input_ids']))] if use_kv_cache else None
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Work\OmniGen\OmniGen\scheduler.py", line 14, in __init__
raise RuntimeError("OffloadedCache can only be used with a GPU")
RuntimeError: OffloadedCache can only be used with a GPU
In case this might help:
nvidia-smi
Fri Nov 1 11:11:45 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 566.03 Driver Version: 566.03 CUDA Version: 12.7 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Driver-Model | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 4070 WDDM | 00000000:01:00.0 On | N/A |
| 0% 34C P0 34W / 200W | 2416MiB / 12282MiB | 2% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
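For anyone debugging this: judging by the traceback, the check in OmniGen/scheduler.py fires when PyTorch itself cannot see a CUDA device, typically because the torch wheel installed in the venv is CPU-only, even though nvidia-smi shows the GPU. A minimal check you can run inside the same environment (plain PyTorch, nothing OmniGen-specific):

import torch

# On a CPU-only wheel this prints "CUDA available: False" and "Torch CUDA build: None",
# which is exactly the condition that makes the offloaded-cache check raise.
print("CUDA available:", torch.cuda.is_available())
print("Torch CUDA build:", torch.version.cuda)
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))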
Same here ... After updating, I get past the original error, but when trying to generate an image it now returns:
OffloadedCache can only be used with a GPU
But it doesn't look as if it is trying to use the GPU at all:
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 546.80 Driver Version: 546.80 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 4090 ... WDDM | 00000000:01:00.0 On | N/A |
| N/A 50C P8 12W / 175W | 1882MiB / 16376MiB | 1% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
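Until the fix lands, a possible workaround is to bypass the offloaded KV cache entirely. This is only a sketch: it assumes pipe is the OmniGenPipeline built in app.py and that it forwards the offload_kv_cache keyword seen in the traceback; the other argument names are illustrative.

# Hypothetical edit to generate_image() in app.py: keep the existing call,
# but pass offload_kv_cache=False so the GPU-only cache path is skipped.
output = pipe(
    prompt=prompt,             # existing arguments unchanged (names are illustrative)
    height=height,
    width=width,
    guidance_scale=guidance_scale,
    offload_kv_cache=False,    # avoid the offloaded cache when torch cannot see the GPU
)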
Same here.
When starting python app.py I get this error:
NameError: name 'is_torch_npu_available' is not defined. Did you mean: 'is_torch_xla_available'?
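That NameError usually means the installed transformers version does not export is_torch_npu_available. A hedged stopgap, besides upgrading transformers, is to guard the import where OmniGen uses it; the import path below is an assumption:

try:
    # Newer transformers releases export this helper.
    from transformers.utils import is_torch_npu_available
except ImportError:
    # Fallback for older transformers: assume no Ascend NPU is present.
    def is_torch_npu_available() -> bool:
        return False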