siliconflow / onediff

OneDiff: An out-of-the-box acceleration library for diffusion models.
https://github.com/siliconflow/onediff/wiki
Apache License 2.0

Kaggle Notebook installation #990

Closed lukiod closed 2 months ago

lukiod commented 3 months ago

Describe the bug


I have been trying to install OneFlow and to use benchmarks/text_to_image.py, but as soon as I add the nexfort or oneflow compiler, it crashes.

Your environment

OS

     Kaggle Notebook

OneDiff git commit id

OneFlow version info if you have installed oneflow

Run python -m oneflow --doctor and paste it here.

path: ['/opt/conda/lib/python3.10/site-packages/oneflow']
version: 0.9.1.dev20240515+cu122
git_commit: ec7b682
cmake_build_type: Release
rdma: True
mlir: True
enterprise: False

How To Reproduce

Steps to reproduce the behavior (code or script):
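Judging from the tracebacks below, the runs were presumably along these lines (a sketch from a Kaggle cell; the `--compiler` flag is an assumption based on how these example scripts are typically invoked):

```
# Run from /kaggle/working/onediff, as the traceback paths suggest.
!python3 ./onediff_diffusers_extensions/examples/text_to_image_sdxl.py --compiler nexfort
!python3 ./benchmarks/text_to_image.py --compiler oneflow
```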

The complete error message

While using nexfort

Unable to load nexfort.{extension} module. Is it compatible with your PyTorch installation?
Traceback (most recent call last):
  File "/kaggle/working/onediff/./onediff_diffusers_extensions/examples/text_to_image_sdxl.py", line 102, in <module>
    base = compile_pipe(
  File "/opt/conda/lib/python3.10/site-packages/onediffx/compilers/diffusion_pipeline_compiler.py", line 75, in compile_pipe
    pipe = convert_pipe_to_memory_format(
  File "/opt/conda/lib/python3.10/site-packages/onediffx/compilers/diffusion_pipeline_compiler.py", line 118, in convert_pipe_to_memory_format
    from nexfort.utils.attributes import multi_recursive_apply
  File "/opt/conda/lib/python3.10/site-packages/nexfort/__init__.py", line 20, in <module>
    exec(f"import nexfort.{extension} as {extension}")
  File "<string>", line 1, in <module>
ImportError: /opt/conda/lib/python3.10/site-packages/nexfort/_C.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c104impl3cow23materialize_cow_storageERNS_11StorageImplE
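An undefined C++ symbol like this typically means the prebuilt nexfort extension was compiled against a different PyTorch build than the one installed. A minimal check (a sketch using plain PyTorch; nothing onediff-specific):

```
# Print the installed PyTorch build, then import nexfort directly;
# an ABI mismatch re-raises the same undefined-symbol ImportError.
import torch
print("torch:", torch.__version__, "cuda:", torch.version.cuda)
import nexfort  # noqa: F401
```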

While using onediff

/kaggle/working/onediff
Traceback (most recent call last):
  File "/kaggle/working/onediff/./benchmarks/text_to_image.py", line 36, in <module>
    from diffusers.utils import load_image
  File "/opt/conda/lib/python3.10/site-packages/diffusers/__init__.py", line 38, in <module>
    from .models import (
  File "/opt/conda/lib/python3.10/site-packages/diffusers/models/__init__.py", line 35, in <module>
    from .controlnet_flax import FlaxControlNetModel
  File "/opt/conda/lib/python3.10/site-packages/diffusers/models/controlnet_flax.py", line 25, in <module>
    from .modeling_flax_utils import FlaxModelMixin
  File "/opt/conda/lib/python3.10/site-packages/diffusers/models/modeling_flax_utils.py", line 45, in <module>
    class FlaxModelMixin:
  File "/opt/conda/lib/python3.10/site-packages/diffusers/models/modeling_flax_utils.py", line 194, in FlaxModelMixin
    def init_weights(self, rng: jax.random.KeyArray) -> Dict:
  File "/opt/conda/lib/python3.10/site-packages/jax/_src/deprecations.py", line 54, in getattr
    raise AttributeError(f"module {module!r} has no attribute {name!r}")
AttributeError: module 'jax.random' has no attribute 'KeyArray'
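For context, `jax.random.KeyArray` was removed in newer JAX releases, and this diffusers version still references it when importing its Flax models. A hedged workaround (exact versions are assumptions):

```
# Either upgrade diffusers, whose newer releases no longer touch
# jax.random.KeyArray at import time, or pin an older JAX that still has it.
!pip install -U diffusers
# or: !pip install "jax<0.4.24" "jaxlib<0.4.24"
```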


ccssu commented 2 months ago


@lukiod Please try this: https://www.kaggle.com/code/ccsufengwen/comfyui-instantid-onediff/notebook#Test-OneDiff-install

lukiod commented 2 months ago

@ccssu This is the error I am getting:

```
Unable to load nexfort.{extension} module. Is it compatible with your PyTorch installation?

AssertionError                            Traceback (most recent call last)
Cell In[6], line 6
      4 import torch
      5 from onediff.utils.import_utils import is_nexfort_available
----> 6 assert is_nexfort_available() == True
      8 import onediff.infer_compiler as infer_compiler
     10 class MyModule(torch.nn.Module):

AssertionError:
```
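One quick way to surface the root cause (a sketch; it assumes `is_nexfort_available()` merely attempts the import and swallows the failure):

```
# Import nexfort directly so the underlying ImportError is not swallowed.
import importlib
try:
    importlib.import_module("nexfort")
except ImportError as e:
    print("nexfort import failed:", e)
```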

lukiod commented 2 months ago

If possible, can you guys provide the installation steps for Kaggle? Kaggle's CUDA support is currently 12.4.

strint commented 2 months ago

If possible, can you guys provide the installation steps for Kaggle? Kaggle's CUDA support is currently 12.4.

onediff:

https://github.com/siliconflow/onediff?tab=readme-ov-file#3-install-onediff

nexfort:

https://github.com/siliconflow/onediff?tab=readme-ov-file#optional-install-nexfort
https://github.com/siliconflow/onediff/tree/main/src/onediff/infer_compiler/backends/nexfort#onediff-nexfort-compiler-backendbeta-release
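In a Kaggle cell, those docs amount to roughly the following (a sketch; the `--pre` flag and the absence of version pins are assumptions):

```
# Kaggle notebook cell: install onediff and the nexfort backend from PyPI.
!pip install --pre onediff
!pip install -U nexfort
```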

lukiod commented 2 months ago

@strint Kaggle has CUDA 12.4, so is there an issue with installing a different CUDA version for PyTorch? Also, could you guys please release a sample Kaggle notebook for the installation? Installing on Kaggle is quite problematic because of the differing CUDA versions.

strint commented 2 months ago

installing a different CUDA version for PyTorch

Please use PyTorch 2.3 + CUDA 12.4/12.1.

https://github.com/siliconflow/onediff/tree/main/src/onediff/infer_compiler/backends/nexfort#dependency
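On Kaggle that would look something like this (a sketch; the exact patch version is an assumption, and the wheel index is the standard PyTorch one):

```
# Kaggle cell: pin PyTorch 2.3 against the cu121 wheel index, then verify.
# Restart the kernel after installing if torch was already imported.
!pip install torch==2.3.0 --index-url https://download.pytorch.org/whl/cu121
import torch
print(torch.__version__, torch.version.cuda)  # expect 2.3.0 / 12.1
```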

lukiod commented 2 months ago

@strint I tried running with nexfort and it always leads me to this error:


 Setting `clean_caption=True` requires the ftfy library but it was not found in your environment. Checkout the instructions on the
installation section: https://github.com/rspeer/python-ftfy/tree/master#installing and follow the ones
that match your environment. Please note that you may need to restart your runtime after installation.

Setting `clean_caption` to False...
/opt/conda/lib/python3.10/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
  self.pid = os.fork()
[2024-07-13 03:00:36,149] [WARNING] [fx_passes.py:33:apply_fx_passes] Triton is not available, skipping all inductor passes
/opt/conda/lib/python3.10/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
  self.pid = os.fork()
Traceback (most recent call last):
  File "/kaggle/working/onediff/./benchmarks/text_to_image.py", line 421, in <module>
    main()
  File "/kaggle/working/onediff/./benchmarks/text_to_image.py", line 356, in main
    pipe(**get_kwarg_inputs())
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/diffusers/pipelines/pixart_alpha/pipeline_pixart_alpha.py", line 843, in __call__
    ) = self.encode_prompt(
  File "/opt/conda/lib/python3.10/site-packages/diffusers/pipelines/pixart_alpha/pipeline_pixart_alpha.py", line 376, in encode_prompt
    prompt_embeds = self.text_encoder(text_input_ids.to(device), attention_mask=prompt_attention_mask)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/onediff/infer_compiler/backends/nexfort/deployable_module.py", line 22, in forward
    return self._deployable_module_model(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 451, in _fn
    return fn(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 921, in catch_errors
    return callback(frame, cache_entry, hooks, frame_state, skip=1)
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 786, in _convert_frame
    result = inner_convert(
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 400, in _convert_frame_assert
    return _compile(
  File "/opt/conda/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 676, in _compile
    guarded_code = compile_inner(code, one_graph, hooks, transform)
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 262, in time_wrapper
    r = func(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 535, in compile_inner
    out_code = transform_code_object(code, transform)
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/bytecode_transformation.py", line 1036, in transform_code_object
    transformations(instructions, code_options)
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 165, in _fn
    return fn(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 500, in transform
    tracer.run()
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2149, in run
    super().run()
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 810, in run
    and self.step()
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 773, in step
    getattr(self, inst.opname)(inst)
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2268, in RETURN_VALUE
    self.output.compile_subgraph(
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 991, in compile_subgraph
    self.compile_and_call_fx_graph(tx, pass2.graph_output_vars(), root)
  File "/opt/conda/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 1168, in compile_and_call_fx_graph
    compiled_fn = self.call_user_compiler(gm)
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 262, in time_wrapper
    r = func(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 1241, in call_user_compiler
    raise BackendCompilerFailed(self.compiler_fn, e).with_traceback(
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 1222, in call_user_compiler
    compiled_fn = compiler_fn(gm, self.example_inputs())
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/repro/after_dynamo.py", line 117, in debug_wrapper
    compiled_gm = compiler_fn(gm, example_inputs)
  File "/opt/conda/lib/python3.10/site-packages/torch/__init__.py", line 1768, in __call__
    return self.compiler_fn(model_, inputs_, **self.kwargs)
  File "/opt/conda/lib/python3.10/site-packages/nexfort/dynamo/backends/nexfort.py", line 16, in nexfort
    return compile_fx(gm, example_inputs, mode=mode, options=options)
  File "src/nexfort/fx_compiler/fx_compiler.py", line 49, in nexfort.fx_compiler.fx_compiler.compile_fx
  File "src/nexfort/fx_compiler/fx_compiler.py", line 60, in nexfort.fx_compiler.fx_compiler.compile_fx_inner
  File "src/nexfort/fx_compiler/fx_compiler.py", line 61, in nexfort.fx_compiler.fx_compiler.compile_fx_inner
  File "src/nexfort/fx_compiler/fx_compiler.py", line 91, in nexfort.fx_compiler.fx_compiler.compile_fx_inner
  File "src/nexfort/fx_compiler/fx_compiler.py", line 91, in nexfort.fx_compiler.fx_compiler.compile_fx_inner
  File "src/nexfort/fx_compiler/fx_compiler.py", line 114, in nexfort.fx_compiler.fx_compiler.compile_fx_inner
  File "src/nexfort/fx_compiler/overrides.py", line 72, in nexfort.fx_compiler.overrides.with_override_torch_env.decorator.wrapper
  File "src/nexfort/fx_compiler/overrides.py", line 73, in nexfort.fx_compiler.overrides.with_override_torch_env.decorator.wrapper
  File "src/nexfort/fx_compiler/fx_compiler.py", line 319, in nexfort.fx_compiler.fx_compiler.fx_compile
  File "src/nexfort/fx_compiler/fx_compiler.py", line 327, in nexfort.fx_compiler.fx_compiler.fx_compile
  File "/opt/conda/lib/python3.10/site-packages/torch/__init__.py", line 1729, in __call__
    return compile_fx(model_, inputs_, config_patches=self.config)
  File "/opt/conda/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/opt/conda/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1330, in compile_fx
    return aot_autograd(
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/backends/common.py", line 58, in compiler_fn
    cg = aot_module_simplified(gm, example_inputs, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 903, in aot_module_simplified
    compiled_fn = create_aot_dispatcher_function(
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 262, in time_wrapper
    r = func(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 628, in create_aot_dispatcher_function
    compiled_fn = compiler_fn(flat_fn, fake_flat_args, aot_config, fw_metadata=fw_metadata)
  File "/opt/conda/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/runtime_wrappers.py", line 443, in aot_wrapper_dedupe
    return compiler_fn(flat_fn, leaf_flat_args, aot_config, fw_metadata=fw_metadata)
  File "/opt/conda/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/runtime_wrappers.py", line 648, in aot_wrapper_synthetic_base
    return compiler_fn(flat_fn, flat_args, aot_config, fw_metadata=fw_metadata)
  File "/opt/conda/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/jit_compile_runtime_wrappers.py", line 119, in aot_dispatch_base
    compiled_fw = compiler(fw_module, updated_flat_args)
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 262, in time_wrapper
    r = func(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1257, in fw_compiler_base
    return inner_compile(
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/repro/after_aot.py", line 83, in debug_wrapper
    inner_compiled_fn = compiler_fn(gm, example_inputs)
  File "/opt/conda/lib/python3.10/site-packages/torch/_inductor/debug.py", line 304, in inner
    return fn(*args, **kwargs)
  File "/opt/conda/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/opt/conda/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 262, in time_wrapper
    r = func(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 438, in compile_fx_inner
    compiled_graph = fx_codegen_and_compile(
  File "/opt/conda/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 714, in fx_codegen_and_compile
    compiled_fn = graph.compile_to_fn()
  File "/opt/conda/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1307, in compile_to_fn
    return self.compile_to_module().call
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 262, in time_wrapper
    r = func(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1250, in compile_to_module
    self.codegen_with_cpp_wrapper() if self.cpp_wrapper else self.codegen()
  File "/opt/conda/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1205, in codegen
    self.scheduler = Scheduler(self.buffers)
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 262, in time_wrapper
    r = func(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/_inductor/scheduler.py", line 1267, in __init__
    self.nodes = [self.create_scheduler_node(n) for n in nodes]
  File "/opt/conda/lib/python3.10/site-packages/torch/_inductor/scheduler.py", line 1267, in <listcomp>
    self.nodes = [self.create_scheduler_node(n) for n in nodes]
  File "/opt/conda/lib/python3.10/site-packages/torch/_inductor/scheduler.py", line 1358, in create_scheduler_node
    return SchedulerNode(self, node)
  File "/opt/conda/lib/python3.10/site-packages/torch/_inductor/scheduler.py", line 687, in __init__
    self._compute_attrs()
  File "/opt/conda/lib/python3.10/site-packages/torch/_inductor/scheduler.py", line 698, in _compute_attrs
    group_fn = self.scheduler.get_backend(self.node.get_device()).group_fn
  File "/opt/conda/lib/python3.10/site-packages/torch/_inductor/scheduler.py", line 2276, in get_backend
    self.backends[device] = self.create_backend(device)
  File "/opt/conda/lib/python3.10/site-packages/torch/_inductor/scheduler.py", line 2264, in create_backend
    raise RuntimeError(
torch._dynamo.exc.BackendCompilerFailed: backend='nexfort' raised:
RuntimeError: Found Tesla P100-PCIE-16GB which is too old to be supported by the triton GPU compiler, which is used as the backend. Triton only supports devices of CUDA Capability >= 7.0, but your device is of CUDA capability 6.0

Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information

You can suppress this exception and fall back to eager by setting:
    import torch._dynamo
    torch._dynamo.config.suppress_errors = True
strint commented 2 months ago

Found Tesla P100-PCIE-16GB which is too old to be supported by the triton GPU compiler, which is used as the backend. Triton only supports devices of CUDA Capability >= 7.0, but your device is of CUDA capability 6.0

You seem to have successfully installed onediff and nexfort.

Sadly, though, you are using a Tesla P100-PCIE-16GB. We only support these NVIDIA devices: https://github.com/siliconflow/onediff?tab=readme-ov-file#0-os-and-gpu-compatibility
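A quick way to check this up front (a sketch using plain PyTorch APIs; note that Kaggle's T4 accelerator, CUDA capability 7.5, should clear Triton's bar where the P100 does not):

```
# Triton, which nexfort uses as its GPU backend here, needs capability >= 7.0.
import torch
major, minor = torch.cuda.get_device_capability()
print(torch.cuda.get_device_name(), f"-- capability {major}.{minor}")
if (major, minor) < (7, 0):
    print("This GPU is too old for Triton-backed compilation.")
```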

lukiod commented 2 months ago

@strint One last question: does onediff support image-to-image, and is there some sample code where I can test it?

strint commented 2 months ago

@strint One last question: does onediff support image-to-image, and is there some sample code where I can test it?

Yes. Here is a sample:

import argparse
from PIL import Image

import torch

from onediffx import compile_pipe
from diffusers import StableDiffusionImg2ImgPipeline

prompt = "sea,beach,the waves crashed on the sand,blue sky whit white cloud"

def parse_args():
    parser = argparse.ArgumentParser(description="Simple demo of image generation.")
    parser.add_argument(
        "--model_id", type=str, default="stabilityai/stable-diffusion-2-1",
    )
    cmd_args = parser.parse_args()
    return cmd_args

args = parse_args()

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    args.model_id, use_auth_token=True, revision="fp16", torch_dtype=torch.float16,
)

pipe = pipe.to("cuda")

options = '{"mode": "max-optimize:max-autotune:low-precision", "memory_format": "channels_last"}'
pipe = compile_pipe(pipe, backend="nexfort", options=options, fuse_qkv_projections=True)

img = Image.new("RGB", (512, 512), "#1f80f0")

with torch.autocast("cuda"):
    images = pipe(
        prompt, image=img, guidance_scale=10, num_inference_steps=100, output_type="np",
    ).images
    for i, image in enumerate(images):
        pipe.numpy_to_pil(image)[0].save(f"{prompt}-of-{i}.png")
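Note that the first call to the compiled pipeline triggers nexfort compilation, so it is much slower than subsequent calls. The solid-color `Image.new(...)` input is only a placeholder; swap in a real image (for example via `diffusers.utils.load_image`) for meaningful output.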