chengzeyi / stable-fast

Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
MIT License
1.06k stars, 60 forks

support animatediff #87

Open ghost opened 6 months ago

ghost commented 6 months ago

Please consider supporting AnimateDiff, which is also based on diffusers.

chengzeyi commented 6 months ago

@nick008a Did you try AnimateDiff with stable-fast? Does stable-fast support it out of the box, or did you run into unexpected exceptions?

hihei commented 4 months ago

The problem below occurs when I run AnimateDiff. Is there a good solution? Thanks!

Environment: Windows

torch                  2.1.2+cu121
torchaudio             2.1.2
torchvision            0.16.2
stable-fast            1.0.3                         d:\simple_work\sd_opt\soft\stable-fast\src

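Symptom: Stable Diffusion text-to-image runs fine, but AnimateDiff fails and prints a large amount of numbers. Setting enable_jit to False makes inference work, but it is noticeably slower.

Code used for the run: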

import torch
from diffusers import AnimateDiffPipeline, DDIMScheduler, MotionAdapter
from diffusers.utils import export_to_gif

adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2")
# pipe = AnimateDiffPipeline.from_pretrained("frankjoshua/toonyou_beta6", motion_adapter=adapter)
pipe = AnimateDiffPipeline.from_pretrained("SG161222/Realistic_Vision_V5.1_noVAE", motion_adapter=adapter)
pipe.scheduler = DDIMScheduler(beta_schedule="linear", steps_offset=1, clip_sample=False)
# enable memory savings
pipe.enable_vae_slicing()
pipe.enable_model_cpu_offload()

from sfast.compilers.diffusion_pipeline_compiler import compile, CompilationConfig

config = CompilationConfig.Default()
config.enable_xformers = True
# config.enable_triton = True
config.enable_cuda_graph = True
# config.enable_jit = False
pipe = compile(pipe, config)

output = pipe(prompt="A corgi walking in the park")
frames = output.frames[0]
export_to_gif(frames, "animation.gif")

stone002 commented 3 months ago

@hihei Hello, did you manage to solve this? I ran into the same problem.

hihei commented 2 months ago

No, it is not solved yet.

Hongtao-Xu commented 2 months ago

I get this error when config.enable_jit = True:

/home/oppoer/.local/lib/python3.8/site-packages/torch/cuda/graphs.py:81: UserWarning: The CUDA Graph is empty. This usually means that the graph was attempted to be captured on wrong device or stream. (Triggered internally at ../aten/src/ATen/cuda/CUDAGraph.cpp:224.)
  super().capture_end()
/home/oppoer/.local/lib/python3.8/site-packages/transformers/modeling_utils.py:4225: FutureWarning: `_is_quantized_training_enabled` is going to be deprecated in transformers 4.39.0. Please use `model.hf_quantizer.is_trainable` instead
  warnings.warn(
/home/oppoer/.local/lib/python3.8/site-packages/sfast/jit/overrides.py:21: TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  return func(*args, **kwargs)
/home/oppoer/.local/lib/python3.8/site-packages/sfast/jit/overrides.py:21: TracerWarning: Converting a tensor to a Python list might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  return func(*args, **kwargs)
/home/oppoer/.local/lib/python3.8/site-packages/sfast/jit/overrides.py:21: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  return func(*args, **kwargs)
/home/oppoer/.local/lib/python3.8/site-packages/sfast/jit/overrides.py:21: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  return func(*args, **kwargs)
/home/oppoer/.local/lib/python3.8/site-packages/sfast/utils/flat_tensors.py:275: TracerWarning: torch.Tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  return super().__new__(cls, x, *args, **kwargs)
/home/oppoer/.local/lib/python3.8/site-packages/sfast/jit/overrides.py:21: TracerWarning: torch.as_tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  return func(*args, **kwargs)
  0%|                                                                                                                                                                                    | 0/50 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "test_AnimationDiff_SF.py", line 259, in <module>
    warmup()
  File "test_AnimationDiff_SF.py", line 245, in warmup
    inference(model, text)
  File "test_AnimationDiff_SF.py", line 222, in inference
    output_frames = model(**kwarg_inputs).frames
  File "/home/oppoer/.local/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/oppoer/.local/lib/python3.8/site-packages/diffusers/pipelines/animatediff/pipeline_animatediff.py", line 802, in __call__
    noise_pred = self.unet(
  File "/home/oppoer/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/oppoer/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/oppoer/.local/lib/python3.8/site-packages/sfast/cuda/graphs.py", line 40, in dynamic_graphed_callable
    cached_callable = simple_make_graphed_callable(
  File "/home/oppoer/.local/lib/python3.8/site-packages/sfast/cuda/graphs.py", line 61, in simple_make_graphed_callable
    return make_graphed_callable(func,
  File "/home/oppoer/.local/lib/python3.8/site-packages/sfast/cuda/graphs.py", line 90, in make_graphed_callable
    func(*tree_copy(example_inputs, detach=True),
  File "/home/oppoer/.local/lib/python3.8/site-packages/sfast/jit/trace_helper.py", line 51, in wrapper
    traced_m, call_helper = trace_with_kwargs(
  File "/home/oppoer/.local/lib/python3.8/site-packages/sfast/jit/trace_helper.py", line 25, in trace_with_kwargs
    traced_module = better_trace(TraceablePosArgOnlyModuleWrapper(func),
  File "/home/oppoer/.local/lib/python3.8/site-packages/sfast/jit/utils.py", line 35, in better_trace
    script_module = torch.jit.trace(func, *args, **kwargs)
  File "/home/oppoer/.local/lib/python3.8/site-packages/torch/jit/_trace.py", line 806, in trace
    return trace_module(
  File "/home/oppoer/.local/lib/python3.8/site-packages/torch/jit/_trace.py", line 1074, in trace_module
    module._c._create_method_from_trace(
  File "/home/oppoer/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/oppoer/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/oppoer/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/home/oppoer/.local/lib/python3.8/site-packages/sfast/jit/trace_helper.py", line 154, in forward
    outputs = self.module(*orig_args, **orig_kwargs)
  File "/home/oppoer/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/oppoer/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/oppoer/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/home/oppoer/.local/lib/python3.8/site-packages/sfast/jit/trace_helper.py", line 89, in forward
    return self.func(*args, **kwargs)
  File "/home/oppoer/.local/lib/python3.8/site-packages/diffusers/models/unets/unet_motion_model.py", line 834, in forward
    emb = emb.repeat_interleave(repeats=num_frames, dim=0)
  File "/home/oppoer/.local/lib/python3.8/site-packages/sfast/jit/overrides.py", line 21, in __torch_function__
    return func(*args, **kwargs)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)

hihei commented 2 months ago

Check whether this is caused by your inputs not having been moved to CUDA.
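One way to check the suggestion above is to walk the keyword arguments before calling the pipeline and list anything that is not on the GPU. The helper below is only a sketch: `find_device_mismatches` is a hypothetical name, and it treats any object with a `.device` attribute as tensor-like so that it does not need to import torch.

```python
from collections.abc import Mapping, Sequence


def find_device_mismatches(inputs, expected="cuda", path="inputs"):
    """Return (path, device) pairs for tensor-like values not on `expected`.

    Anything exposing a `.device` attribute is treated as tensor-like, so the
    helper works with torch tensors without importing torch here.
    """
    mismatches = []
    device = getattr(inputs, "device", None)
    if device is not None:
        # str(torch.device("cuda", 0)) is "cuda:0", so compare by prefix.
        if not str(device).startswith(expected):
            mismatches.append((path, str(device)))
    elif isinstance(inputs, Mapping):
        for key, value in inputs.items():
            mismatches += find_device_mismatches(value, expected, f"{path}[{key!r}]")
    elif isinstance(inputs, Sequence) and not isinstance(inputs, (str, bytes)):
        for i, value in enumerate(inputs):
            mismatches += find_device_mismatches(value, expected, f"{path}[{i}]")
    return mismatches
```

For the script in the traceback, this would be called as `find_device_mismatches(kwarg_inputs)` right before `model(**kwarg_inputs)`. An empty list would suggest the CPU tensor is created inside the traced UNet rather than passed in by the caller.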

Hongtao-Xu commented 2 months ago

The model has been moved to the GPU:

    # adapter
    adapter = MotionAdapter.from_pretrained(adapter_model, torch_dtype=torch.float16)
    # pipeline
    model = AnimateDiffPipeline.from_pretrained(pipeline_model, motion_adapter=adapter, torch_dtype=torch.float16)

    scheduler = DDIMScheduler.from_pretrained(
        pipeline_model,
        subfolder="scheduler",
        clip_sample=False,
        timestep_spacing="linspace",
        beta_schedule="linear",
        steps_offset=1,
    )
    model.scheduler = scheduler
    model.enable_vae_slicing()
    model.to(torch.device('cuda'))

In my case, is the error above related to config.enable_jit = True?

hihei commented 2 months ago

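You can try removing config.enable_jit = True and see whether the error still occurs; that will tell you whether it is related. From what I can see, though, it looks like an input problem: the model is on CUDA, but your model inputs are not.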
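The "switch it off and see" test can be run one flag at a time. This is a minimal sketch, not part of stable-fast: `bisect_flags`, `make_config` and `run` are hypothetical names. `make_config` is assumed to return a fresh config object (for example one built from `CompilationConfig.Default()`), and `run` is assumed to rebuild the pipeline, compile it with that config and do one inference.

```python
def bisect_flags(make_config, run, flags=("enable_jit", "enable_cuda_graph")):
    """Disable one flag at a time and record whether `run(config)` succeeds."""
    results = {}
    for flag in flags:
        config = make_config()          # fresh config with every flag enabled
        setattr(config, flag, False)    # turn off only the flag under test
        try:
            run(config)
            results[flag] = "ok when disabled"
        except Exception as exc:
            results[flag] = f"still fails: {type(exc).__name__}: {exc}"
    return results
```

If only `enable_jit` reports "ok when disabled", the failure is tied to tracing rather than to the inputs, which matches the first report in this thread.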

songh11 commented 3 weeks ago

Try turning this off: enable_model_cpu_offload
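A sketch of what that could look like, combining the loading code and compile settings posted earlier in this thread. Nobody in the thread has confirmed that it fixes the error. `build_compiled_pipeline` and its parameters are hypothetical names; the only intended change is that the pipeline is moved to the GPU with `.to("cuda")` instead of calling `enable_model_cpu_offload()` before compiling.

```python
def build_compiled_pipeline(base_model, adapter_model, use_jit=True):
    """Load AnimateDiff fully on the GPU (no CPU offload), then compile it."""
    # Imports are local so this module can be imported without a GPU stack.
    import torch
    from diffusers import AnimateDiffPipeline, DDIMScheduler, MotionAdapter
    from sfast.compilers.diffusion_pipeline_compiler import (
        CompilationConfig,
        compile,
    )

    adapter = MotionAdapter.from_pretrained(adapter_model, torch_dtype=torch.float16)
    pipe = AnimateDiffPipeline.from_pretrained(
        base_model, motion_adapter=adapter, torch_dtype=torch.float16
    )
    pipe.scheduler = DDIMScheduler.from_pretrained(
        base_model,
        subfolder="scheduler",
        clip_sample=False,
        timestep_spacing="linspace",
        beta_schedule="linear",
        steps_offset=1,
    )
    pipe.enable_vae_slicing()
    # Keep every sub-module on the GPU instead of enable_model_cpu_offload().
    pipe.to("cuda")

    config = CompilationConfig.Default()
    config.enable_xformers = True
    config.enable_cuda_graph = True
    config.enable_jit = use_jit
    return compile(pipe, config)
```

Usage would be along the lines of `pipe = build_compiled_pipeline("SG161222/Realistic_Vision_V5.1_noVAE", "guoyww/animatediff-motion-adapter-v1-5-2")`, followed by `pipe(prompt="A corgi walking in the park")`. Without offloading the whole pipeline has to fit in GPU memory.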