siliconflow / onediff

OneDiff: An out-of-the-box acceleration library for diffusion models.
https://github.com/siliconflow/onediff/wiki
Apache License 2.0

[Bug] Incorrect image generation after compiling a UNet fine-tuned with PEFT #1149

Open hezeli123 opened 2 days ago

hezeli123 commented 2 days ago

Your current environment information

libibverbs not available, ibv_fork_init skipped
Collecting environment information...
PyTorch version: 2.4.0+cu124
Is debug build: False
CUDA used to build PyTorch: 12.4
ROCM used to build PyTorch: N/A

OneFlow version: path: ['/opt/py3/lib/python3.10/site-packages/oneflow'], version: 0.9.1.dev20241121+cu122, git_commit: cbb0a3e, cmake_build_type: Release, rdma: True, mlir: True, enterprise: False
Nexfort version: none
OneDiff version: 1.2.1.dev28
OneDiffX version: 1.2.1.dev28+g424c81a8

OS: Ubuntu 22.04.4 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: Could not collect
CMake version: version 3.30.4
Libc version: glibc-2.35

Python version: 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-3.10.0-957.27.2.el7.x86_64-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA A10
Nvidia driver version: 470.82.01
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-3
Off-line CPU(s) list: 4-15
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) Platinum 8369B CPU @ 2.90GHz
CPU family: 6
Model: 106
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 1
Stepping: 6
BogoMIPS: 5800.00
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc eagerfpu pni pclmulqdq monitor ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 arat avx512vbmi avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq spec_ctrl intel_stibp arch_capabilities
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 384 KiB (8 instances)
L1i cache: 256 KiB (8 instances)
L2 cache: 10 MiB (8 instances)
L3 cache: 48 MiB (1 instance)
NUMA node(s): 1
NUMA node0 CPU(s): 0-15
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; Load fences, __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Enhanced IBRS, IBPB

Versions of relevant libraries:
[pip3] diffusers==0.31.0
[pip3] numpy==1.26.4
[pip3] onnx==1.16.1
[pip3] onnxruntime==1.20.0
[pip3] torch==2.4.0+cu124
[pip3] torchao==0.6.1
[pip3] torchaudio==2.4.0+cu124
[pip3] torchvision==0.19.0+cu124
[pip3] transformers==4.27.1
[pip3] transformers-stream-generator==0.0.5
[pip3] triton==3.0.0
[pip3] tritonclient==2.50.0
[conda] Could not collect

🐛 Describe the bug

Compiling a UNet that has been wrapped with PEFT produces incorrect images:

    unet = get_peft_model(unet, lora_config)
    unet.load_adapter(LCM_path, adapter_name="default")
    unet.merge_and_unload()
    from onediff.infer_compiler import oneflow_compile
    unet = oneflow_compile(unet)

hezeli123 commented 2 days ago

After compiling with oneflow_compile, the generated images are completely black. With torch.compile(unet) there is no problem and the generated images are normal.
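To narrow this down, it may help to check whether the compiled UNet's output contains NaN/Inf values or is entirely zero. Either is a plausible source of black images, but that is an assumption, not something confirmed in this report. Below is a minimal sketch of such a check using only the standard library. The helper names `summarize_values` and `looks_degenerate` are hypothetical and are not part of OneDiff, PEFT, or diffusers.

```python
import math
from typing import Dict, Iterable


def summarize_values(values: Iterable[float]) -> Dict[str, int]:
    """Count NaN, infinite, and exactly-zero entries in a flat sequence of floats."""
    total = nan = inf = zero = 0
    for v in values:
        total += 1
        if math.isnan(v):
            nan += 1
        elif math.isinf(v):
            inf += 1
        elif v == 0.0:
            zero += 1
    return {"total": total, "nan": nan, "inf": inf, "zero": zero}


def looks_degenerate(values: Iterable[float]) -> bool:
    """Return True if any value is non-finite, or if every value is zero."""
    s = summarize_values(values)
    all_zero = s["total"] > 0 and s["zero"] == s["total"]
    return s["nan"] > 0 or s["inf"] > 0 or all_zero
```

With a PyTorch tensor, for example a hypothetical `noise_pred` returned by the UNet, the values can be passed in as `looks_degenerate(noise_pred.float().flatten().tolist())`. Running the same check on the eager, `torch.compile`, and `oneflow_compile` variants with identical inputs would show whether only the OneFlow-compiled path produces non-finite or all-zero output.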