williamyang1991 / FRESCO

[CVPR 2024] FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation
https://www.mmlab-ntu.com/project/fresco/

Does it support LCM models? #34

Closed G-force78 closed 8 months ago

G-force78 commented 8 months ago

....

williamyang1991 commented 8 months ago

I'm not sure...

G-force78 commented 8 months ago

I've tried modifying it, with no success yet:

```
Traceback (most recent call last):
  File "/content/FRESCO/run_fresco.py", line 322, in <module>
    keys = run_keyframe_translation(config)
  File "/content/FRESCO/run_fresco.py", line 245, in run_keyframe_translation
    latents = inference(pipe, controlnet, frescoProc,
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/content/FRESCO/src/pipe_FRESCO.py", line 227, in inference
    latents = step(pipe, noise_pred, t, latents, generator,
  File "/content/FRESCO/src/pipe_FRESCO.py", line 22, in step
    prev_timestep = scheduler.previous_timestep(timestep)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/configuration_utils.py", line 137, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'LCMScheduler' object has no attribute 'previous_timestep'
```

Could it have something to do with this? https://huggingface.co/docs/diffusers/main/en/api/schedulers/lcm

> timesteps (List[int], optional) — Custom timesteps used to support arbitrary spacing between timesteps. If None, then the default timestep spacing strategy of equal spacing between timesteps on the training/distillation timestep schedule is used. If timesteps is passed, num_inference_steps must be None.
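For what it's worth, a minimal fallback sketch (my own workaround, not FRESCO or diffusers code) for schedulers like `LCMScheduler` that don't implement `previous_timestep` could look like this, assuming the scheduler's timesteps are evenly spaced on the training schedule:

```python
# Hypothetical fallback for the call in FRESCO's step() (src/pipe_FRESCO.py):
# LCMScheduler has no previous_timestep(), so when the method is missing,
# step back by one inference-step stride on the training timestep schedule.
def previous_timestep_fallback(scheduler, timestep):
    if hasattr(scheduler, "previous_timestep"):
        return scheduler.previous_timestep(timestep)
    # Assumption: evenly spaced timesteps (1000 train steps / N inference steps)
    stride = scheduler.config.num_train_timesteps // scheduler.num_inference_steps
    return timestep - stride
```

This is only an approximation; the exact spacing of LCM's schedule depends on how `set_timesteps` was called.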

G-force78 commented 8 months ago

It works now using this, but it's not much quicker; it could even be slower, at 13.03 s/it (on a T4 Colab, batch size 3).

```python
# diffusion model
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse", torch_dtype=torch.float16)
pipe = AutoPipelineForText2Image.from_pretrained(config['sd_path'], vae=vae, torch_dtype=torch.float16)
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")
pipe.scheduler.set_timesteps(config['num_inference_steps'], device=pipe._execution_device)
```
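As a side note on the timing: a toy illustration (mine, not the actual diffusers implementation) of the "equal spacing on the training timestep schedule" strategy that `set_timesteps` defaults to when no custom `timesteps` are passed:

```python
# Toy illustration (not the real diffusers code) of equal timestep spacing:
# 1000 training timesteps sampled at 4 inference steps gives a stride of 250.
num_train_timesteps = 1000
num_inference_steps = 4
stride = num_train_timesteps // num_inference_steps
timesteps = [num_train_timesteps - 1 - i * stride for i in range(num_inference_steps)]
print(timesteps)  # [999, 749, 499, 249]
```

Note that LCM's speedup comes from running very few inference steps (typically 4-8); with `num_inference_steps` left at a DDIM-style value, the per-step cost is unchanged, which would explain why it isn't any faster here.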

G-force78 commented 8 months ago

For some reason, just before it generated the last two key frames it ran out of memory, even on a batch of 3. So I just used what it had, and it produced the video.

```
Processing sequences: 100% 4/4 [07:36<00:00, 114.22s/it]
Processing sequences: 100% 4/4 [07:43<00:00, 115.97s/it]
Processing sequences: 100% 4/4 [08:09<00:00, 122.31s/it]
Processing sequences: 100% 4/4 [08:14<00:00, 123.69s/it]
ebsynth: 502.62698101997375
Processing frames: 100% 1/1 [00:06<00:00, 6.99s/it]
others: 7.005864381790161
Processing sequences: 6% 1/16 [00:07<01:48, 7.22s/it]
(... 14 more sequences, each roughly 4-5 s/it ...)
Processing sequences: 100% 16/16 [01:15<00:00, 4.70s/it]
```

williamyang1991 commented 8 months ago

Based on your experiment, maybe our method is not directly compatible with LCM.

G-force78 commented 8 months ago

> Based on your experiment, then maybe our method is not directly compatible with LCM.

Yes, and not a very good resulting video either, although that's no doubt mostly due to the prompt and model. The consistency looks pretty good, though, especially in the background.

https://github.com/williamyang1991/FRESCO/assets/114336644/e2b15aff-3b41-4123-b1b7-325c130ef36a