siliconflow / onediff

OneDiff: An out-of-the-box acceleration library for diffusion models.
https://github.com/siliconflow/onediff/wiki
Apache License 2.0

Issue with using oneflow as well as with nexfort #984

airwakz closed this issue 5 hours ago

airwakz commented 5 days ago

Describe the bug

A clear and concise description of what the bug is.


Your environment

OS: Windows

OneDiff git commit id

OneFlow version info

Run python -m oneflow --doctor and paste it here.

Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/runpy.py", line 187, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/opt/conda/lib/python3.10/runpy.py", line 146, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/opt/conda/lib/python3.10/runpy.py", line 110, in _get_module_details
    __import__(pkg_name)
  File "/opt/conda/lib/python3.10/site-packages/oneflow/__init__.py", line 26, in <module>
    import oneflow._oneflow_internal
ImportError: /opt/conda/lib/python3.10/site-packages/oneflow/../nvidia/cusparse/lib/libcusparse.so.12: undefined symbol: __nvJitLinkAddData_12_5, version libnvJitLink.so.12
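
The undefined symbol __nvJitLinkAddData_12_5 suggests that the libnvJitLink.so.12 being loaded is older than the one libcusparse.so.12 expects. As a first diagnostic step, the sketch below lists the NVIDIA wheels installed in the current Python environment so their versions can be compared. It assumes those libraries were installed with pip under distribution names starting with "nvidia-"; the helper name nvidia_wheel_versions is hypothetical and is not part of OneDiff or OneFlow.

```python
from importlib import metadata


def nvidia_wheel_versions(prefix="nvidia-"):
    """Return {distribution name: version} for installed distributions
    whose name starts with `prefix` (case-insensitive)."""
    versions = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        # Skip distributions with missing or broken metadata.
        name = meta.get("Name") if meta is not None else None
        if name and name.lower().startswith(prefix.lower()):
            versions[name] = dist.version
    return versions


if __name__ == "__main__":
    # Assumption: the CUDA libraries come from pip wheels named "nvidia-...".
    for name, version in sorted(nvidia_wheel_versions().items()):
        print(f"{name}=={version}")
```

If the output shows a cuSPARSE wheel and an nvJitLink wheel from different CUDA 12 minor releases, that mismatch is a plausible cause of the ImportError above, though this listing alone does not prove it.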

How To Reproduce

Steps to reproduce the behavior(code or script):

!python3 ./benchmarks/text_to_image.py \
    --model SG161222/RealVisXL_V4.0 \
    --scheduler none \
    --steps 10 \
    --prompt "street style, detailed, raw photo, woman, face, shot on CineStill 800T" \
    --compiler-config '{"mode": "max-optimize:max-autotune:low-precision", "memory_format": "channels_last", "dynamic": true}' \
    --run_multiple_resolutions 1

The complete error message

/kaggle/working/onediff
2024-06-26 08:24:53.452677: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-06-26 08:24:53.452738: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-06-26 08:24:53.454400: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
/opt/conda/lib/python3.10/site-packages/diffusers/models/transformers/transformer_2d.py:34: FutureWarning: Transformer2DModelOutput is deprecated and will be removed in version 1.0.0. Importing Transformer2DModelOutput from diffusers.models.transformer_2d is deprecated and this will be removed in a future version. Please use from diffusers.models.modeling_outputs import Transformer2DModelOutput, instead.
  deprecate("Transformer2DModelOutput", "1.0.0", deprecation_message)
/opt/conda/lib/python3.10/site-packages/diffusers/models/vq_model.py:20: FutureWarning: VQEncoderOutput is deprecated and will be removed in version 0.31. Importing VQEncoderOutput from diffusers.models.vq_model is deprecated and this will be removed in a future version. Please use from diffusers.models.autoencoders.vq_model import VQEncoderOutput, instead.
  deprecate("VQEncoderOutput", "0.31", deprecation_message)
/opt/conda/lib/python3.10/site-packages/diffusers/models/vq_model.py:25: FutureWarning: VQModel is deprecated and will be removed in version 0.31. Importing VQModel from diffusers.models.vq_model is deprecated and this will be removed in a future version. Please use from diffusers.models.autoencoders.vq_model import VQModel, instead.
  deprecate("VQModel", "0.31", deprecation_message)
Loading pipeline components...: 100%|█████████████| 7/7 [00:15<00:00, 2.20s/it]
Oneflow backend is now active...
W20240626 08:25:22.085691 221 cuda_device_descriptor.cpp:91] The CUDA device 'Tesla P100-PCIE-16GB' with capability 60 is not compatible with the current OneFlow installation. The current program may throw a 'no kernel image is available for execution on the device' error or hang for a long time. Please reinstall OneFlow compiled with a newer version of CUDA.
2024-06-26 08:26:00.359536: F tensorflow/python/lib/core/py_exceptionregistry.cc:28] Check failed: singleton == nullptr PyExceptionRegistry::Init() already called
Stack trace (most recent call last):

Aborted (Signal sent by tkill() 221 0)

Additional context

Add any other context about the problem here.

strint commented 5 hours ago

F tensorflow/python/lib/core/py_exceptionregistry.cc:28] Check failed: singleton == nullptr PyExceptionRegistry::Init() already called

This fatal error is raised by TensorFlow, not by OneDiff. Try uninstalling TensorFlow and running the benchmark again.
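
Before uninstalling, it can help to confirm that TensorFlow is actually present in the environment the benchmark runs in. The following is a minimal sketch that looks up top-level modules without importing them, so it does not itself trigger the crash. The helper name installed_modules is hypothetical, and the sketch assumes the package is importable under the top-level name "tensorflow".

```python
import importlib.util


def installed_modules(names):
    """Return the top-level module names from `names` that can be located,
    using find_spec so that nothing is actually imported."""
    return [n for n in names if importlib.util.find_spec(n) is not None]


if __name__ == "__main__":
    # Assumption: TensorFlow is importable as the top-level module "tensorflow".
    found = installed_modules(["tensorflow"])
    if found:
        print("Found:", ", ".join(found))
        print("To remove it, run: pip uninstall tensorflow")
    else:
        print("TensorFlow was not found in this environment.")
```

On a hosted notebook image the package may return after a session restart, in which case the uninstall step would need to be repeated at the start of each session.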