error while running gradio_ui

janoschsimon commented 4 weeks ago

Hey there, I managed to get it installed on vast.ai with 4xH100. When I run gradio_ui.py I get this error:

2024-10-25 06:12:38,514 WARNING worker.py:1481 -- SIGTERM handler is not set because current thread is not the main thread.
Traceback (most recent call last):
  File "/workspace/models/.venv/lib/python3.10/site-packages/gradio/routes.py", line 439, in run_predict
    output = await app.get_blocks().process_api(
  File "/workspace/models/.venv/lib/python3.10/site-packages/gradio/blocks.py", line 1384, in process_api
    result = await self.call_function(
  File "/workspace/models/.venv/lib/python3.10/site-packages/gradio/blocks.py", line 1089, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/workspace/models/.venv/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/workspace/models/.venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
    return await future
  File "/workspace/models/.venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
    result = context.run(func, *args)
  File "/workspace/models/.venv/lib/python3.10/site-packages/gradio/utils.py", line 700, in wrapper
    response = f(*args, **kwargs)
  File "/workspace/models/src/mochi_preview/infer.py", line 73, in generate_video
    load_model()
  File "/workspace/models/src/mochi_preview/infer.py", line 28, in load_model
    ray.init()
  File "/workspace/models/.venv/lib/python3.10/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
  File "/workspace/models/.venv/lib/python3.10/site-packages/ray/_private/worker.py", line 1653, in init
    raise RuntimeError(
RuntimeError: Maybe you called ray.init twice by accident? This error can be suppressed by passing in 'ignore_reinit_error=True' or by calling 'ray.shutdown()' prior to 'ray.init()'.

It would also be good if you changed demo.launch() to demo.launch(share=True), as I guess not that many people have 4xH100 lying around at home hehe :D I was sooo close to my first generation, damn :D

cheers and thx janosch

ved-genmo commented 3 weeks ago

If this happens, run ray stop from your CLI.
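
The RuntimeError above also names two in-code options: passing ignore_reinit_error=True to ray.init(), or calling ray.shutdown() first. A third option is to check ray.is_initialized() before calling ray.init(). Below is a minimal sketch of that guard. It assumes the error comes from load_model() reaching ray.init() a second time in the same Python process (for example after a first attempt failed partway), which the thread does not confirm. The helper name init_ray_once is hypothetical and not part of mochi.

```python
import importlib


def init_ray_once(ray=None, **init_kwargs):
    """Call ray.init() only if this process is not already connected to Ray.

    Returns True if Ray was started by this call, False if it was already up.
    `ray` can be passed in explicitly (useful for testing); by default the
    real library is imported lazily.
    """
    if ray is None:
        ray = importlib.import_module("ray")
    if ray.is_initialized():
        # A previous call already connected; calling ray.init() again
        # would raise the "called ray.init twice" RuntimeError.
        return False
    ray.init(**init_kwargs)
    return True
```

In this sketch, load_model() would call init_ray_once() where it currently calls ray.init(). It does not replace ray stop, which shuts down Ray processes left running on the machine.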

janoschsimon commented 3 weeks ago

OK, but now with the updated code I get:

(T2VSynthMochiModel pid=4512) Timing init_process_group (can take 20-30 seconds)
(T2VSynthMochiModel pid=4513) Timing load_text_encs
(T2VSynthMochiModel pid=4513) Timing load_vae
(T2VSynthMochiModel pid=4513) Timing init_process_group (can take 20-30 seconds)
(T2VSynthMochiModel pid=4512) Timing load_text_encs
(T2VSynthMochiModel pid=4513) Timing construct_dit
(T2VSynthMochiModel pid=4513) Exception raised in creation task: The actor died because of an error raised in its creation task, ray::T2VSynthMochiModel.__init__() (pid=4513, ip=172.17.0.3, actor_id=226cd3320f9f844251edf18201000000, repr=<mochi_preview.t2v_synth_mochi.T2VSynthMochiModel object at 0x708890b118a0>)
(T2VSynthMochiModel pid=4513)   File "/workspace/models/src/mochi_preview/t2v_synth_mochi.py", line 250, in __init__
(T2VSynthMochiModel pid=4513)     with open(dit_config_path, "r") as f:
(T2VSynthMochiModel pid=4513) FileNotFoundError: [Errno 2] No such file or directory: '/workspace/models//dit-config.yaml'
Traceback (most recent call last):
  File "/workspace/models/.venv/lib/python3.10/site-packages/gradio/queueing.py", line 624, in process_events
    response = await route_utils.call_process_api(
  File "/workspace/models/.venv/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
    output = await app.get_blocks().process_api(
  File "/workspace/models/.venv/lib/python3.10/site-packages/gradio/blocks.py", line 2018, in process_api
    result = await self.call_function(
  File "/workspace/models/.venv/lib/python3.10/site-packages/gradio/blocks.py", line 1567, in call_function
    prediction = await anyio.to_thread.run_sync( # type: ignore
  File "/workspace/models/.venv/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/workspace/models/.venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
    return await future
  File "/workspace/models/.venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
    result = context.run(func, *args)
  File "/workspace/models/.venv/lib/python3.10/site-packages/gradio/utils.py", line 846, in wrapper
    response = f(*args, **kwargs)
  File "/workspace/models/src/mochi_preview/infer.py", line 73, in generate_video
    load_model()
  File "/workspace/models/src/mochi_preview/infer.py", line 38, in load_model
    model = MochiWrapper(
  File "/workspace/models/src/mochi_preview/handler.py", line 25, in __init__
    ray.get(worker.__ray_ready__.remote())
  File "/workspace/models/.venv/lib/python3.10/site-packages/ray/_private/auto_init_hook.py", line 21, in auto_init_wrapper
    return fn(*args, **kwargs)
  File "/workspace/models/.venv/lib/python3.10/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
  File "/workspace/models/.venv/lib/python3.10/site-packages/ray/_private/worker.py", line 2745, in get
    values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout)
  File "/workspace/models/.venv/lib/python3.10/site-packages/ray/_private/worker.py", line 903, in get_objects
    raise value
ray.exceptions.ActorDiedError: The actor died because of an error raised in its creation task, ray::T2VSynthMochiModel.__init__() (pid=4513, ip=172.17.0.3, actor_id=226cd3320f9f844251edf18201000000, repr=<mochi_preview.t2v_synth_mochi.T2VSynthMochiModel object at 0x708890b118a0>)
  File "/workspace/models/src/mochi_preview/t2v_synth_mochi.py", line 250, in __init__
    with open(dit_config_path, "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/workspace/models//dit-config.yaml'

vedantroy commented 3 weeks ago

Can you do git pull and try again? I pushed a bunch of changes to the repository, so that file should no longer matter.
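
For reference, the doubled slash in '/workspace/models//dit-config.yaml' is harmless on Linux, where repeated slashes resolve like a single one, so the FileNotFoundError means the file really was missing from that directory. Whichever files a given version of the code expects, a pre-flight check in the main process can report a missing file as one short message before any Ray actor starts. Below is a minimal sketch of such a check; the function name and the file list are hypothetical and not part of mochi.

```python
from pathlib import Path


def missing_files(model_dir, required_names):
    """Return the paths of required files that do not exist under model_dir."""
    root = Path(model_dir)
    return [
        str(root / name)
        for name in required_names
        if not (root / name).is_file()
    ]


if __name__ == "__main__":
    # Illustrative values only; substitute the directory and files
    # that your checkout actually expects.
    missing = missing_files("/workspace/models", ["dit-config.yaml"])
    if missing:
        print("Missing model files:", ", ".join(missing))
```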

janoschsimon commented 3 weeks ago

Not so easy, as renting 4xH100 is not cheap and I lose the install with every session. Is there a way to use a preinstalled flash-attn? The installation takes 30 to 90 min of wasted time. It would also be nice to download the T5 model in advance :)

thx

genmoai / mochi

error while running gradio_ui #26