started but nothing happening

ptits commented 4 days ago

python3 -m mochi_preview.infer --prompt "A hand with delicate fingers picks up a bright yellow lemon from a wooden bowl filled with lemons and sprigs of mint against a peach-colored background. The hand gently tosses the lemon up and catches it, showcasing its smooth texture. A beige string bag sits beside the bowl, adding a rustic touch to the scene. Additional lemons, one halved, are scattered around the base of the bowl. The even lighting enhances the vibrant colors and creates a fresh, inviting atmosphere." --seed 1710977262 --cfg_scale 4.5 --model_dir .
2024-10-22 21:37:02,815 INFO worker.py:1786 -- Started a local Ray instance.
(T2VSynthMochiModel pid=911897) Timing init_process_group
(T2VSynthMochiModel pid=911979) Timing load_text_encs
(T2VSynthMochiModel pid=911901) Timing init_process_group [repeated 7x across cluster] (Ray deduplicates logs by default. Set RAY_DEDUP_LOGS=0 to disable log deduplication, or see https://docs.ray.io/en/master/ray-observability/user-guides/configure-logging.html#log-deduplication for more options.)

and silence...

| 0   N/A  N/A    911901      C   ray::T2VSynthMochiModel.__init__   1366MiB |
| 0   N/A  N/A   3254768      C   venv/bin/python                    5334MiB |
| 1   N/A  N/A    911980      C   ray::T2VSynthMochiModel.__init__   1510MiB |
| 2   N/A  N/A    911897      C   ray::T2VSynthMochiModel.__init__   1510MiB |
| 3   N/A  N/A    911896      C   ray::T2VSynthMochiModel.__init__   1510MiB |
| 4   N/A  N/A    911891      C   ray::T2VSynthMochiModel.__init__   1510MiB |
| 5   N/A  N/A    911898      C   ray::T2VSynthMochiModel.__init__   1510MiB |
| 6   N/A  N/A    911979      C   ray::T2VSynthMochiModel.__init__   1510MiB |
| 7   N/A  N/A    911902      C   ray::T2VSynthMochiModel.__init__   1366MiB |

ptits commented 4 days ago

It is freezing here:

dist.init_process_group(
    "nccl",
    rank=local_rank,
    world_size=world_size,
    device_id=self.device,  # force non-lazy init
)

ved-genmo commented 4 days ago

This will take about 2-40 seconds to initialize. Let me know if the issue is still occurring after waiting that long.

ptits commented 4 days ago

With my debug prints it always prints rank=2 and never reaches the text encoder.
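
One way to get more detail on a hang like this is to rerun the same command with verbose logging enabled. The sketch below is only an illustration, not part of the Mochi code: `build_debug_env` and `build_command` are hypothetical helper names. `NCCL_DEBUG=INFO` is NCCL's standard verbose-logging variable, and `RAY_DEDUP_LOGS=0` is the setting suggested in the Ray log message above. It assumes the Ray workers inherit the environment of the launching process, which should hold for a local Ray instance but is worth verifying.

```python
import os
import shlex


def build_debug_env(base_env=None):
    """Return a copy of the environment with verbose NCCL and Ray logging enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["NCCL_DEBUG"] = "INFO"    # NCCL prints its setup and transport details
    env["RAY_DEDUP_LOGS"] = "0"   # show every worker's log lines, not "[repeated 7x ...]"
    return env


def build_command(prompt, model_dir="."):
    """Rebuild the inference command from this thread (hypothetical helper)."""
    return [
        "python3", "-m", "mochi_preview.infer",
        "--prompt", prompt,
        "--seed", "1710977262",
        "--cfg_scale", "4.5",
        "--model_dir", model_dir,
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so it can be inspected first.
    print("NCCL_DEBUG=INFO RAY_DEDUP_LOGS=0 " + shlex.join(build_command("A hand picks up a lemon.")))
```

To actually launch it, the pair can be passed to `subprocess.run(build_command(prompt), env=build_debug_env())`. With deduplication off, each rank's `Timing init_process_group` line shows up separately, which makes it easier to see which ranks never got past initialization.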

bramvera commented 3 days ago

yup me too

WARNING: Mochi requires at least 4xH100 GPUs, but only 2 GPU(s) are available.
Launching with 2 GPUs.
(T2VSynthMochiModel pid=41672) Timing init_process_group (can take 20-30 seconds)
(T2VSynthMochiModel pid=41590) Timing init_process_group (can take 20-30 seconds)
(T2VSynthMochiModel pid=41672) Timing load_text_encs
(T2VSynthMochiModel pid=41590) Timing load_text_encs

Nothing happened for more than 5 minutes.

EDIT: It seems the script is downloading /models--google--t5-v1_1-xxl/. It would be better if we could see the progress.
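
Until the script reports progress itself, a rough workaround is to watch the cache directory grow from a second terminal. This is a minimal sketch using only the standard library. The default path assumes the usual Hugging Face cache location (`~/.cache/huggingface/hub`), which may differ on your machine (for example if `HF_HOME` is set), and `dir_size_bytes` and `watch` are hypothetical helper names.

```python
import os
import time

# Assumed default Hugging Face cache location; adjust if your cache lives elsewhere.
DEFAULT_CACHE = os.path.expanduser(
    "~/.cache/huggingface/hub/models--google--t5-v1_1-xxl"
)


def dir_size_bytes(path):
    """Sum the sizes of regular files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.islink(file_path):
                continue  # snapshot entries are links to blobs; avoid double counting
            try:
                total += os.path.getsize(file_path)
            except OSError:
                pass  # file was renamed or removed mid-download
    return total


def watch(path=DEFAULT_CACHE, interval=5.0, rounds=10):
    """Print the directory size every `interval` seconds and return the samples."""
    samples = []
    for i in range(rounds):
        size = dir_size_bytes(path)
        samples.append(size)
        print(f"{path}: {size / 1e9:.2f} GB")
        if i < rounds - 1:
            time.sleep(interval)
    return samples
```

Calling `watch()` while the inference script sits at `Timing load_text_encs` should show the size increasing if a download is in progress. A size that stays flat suggests the stall is somewhere else.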
