huggingface / optimum

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools
https://huggingface.co/docs/optimum/main/
Apache License 2.0

AttributeError: FLOAT8E4M3FN #1994

Open Huanghong2016 opened 2 months ago

Huanghong2016 commented 2 months ago

System Info

After installing with

pip install optimum[onnxruntime-gpu]==1.8.5

and running this import:

from optimum.onnxruntime.configuration import OptimizationConfig

I get the following error:
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ D:\升级知识点\kaodian0912-1\test.py:2 in <module>                                                │
│                                                                                                  │
│   1 from optimum.onnxruntime import ORTModelForSeq2SeqLM                                         │
│ ❱ 2 from optimum.onnxruntime.configuration import OptimizationConfig                             │
│   3 from optimum.onnxruntime.optimization import ORTOptimizer                                    │
│   4                                                                                              │
│   5                                                                                              │
│                                                                                                  │
│ C:\Users\pc\AppData\Local\Programs\Python\Python39\lib\site-packages\optimum\onnxruntime\configu │
│ ration.py:27 in <module>                                                                         │
│                                                                                                  │
│    24 from packaging.version import Version, parse                                               │
│    25                                                                                            │
│    26 from onnxruntime import __version__ as ort_version                                         │
│ ❱  27 from onnxruntime.quantization import CalibraterBase, CalibrationMethod, QuantFormat, Qua   │
│    28 from onnxruntime.quantization.calibrate import create_calibrator                           │
│    29 from onnxruntime.transformers.fusion_options import FusionOptions                          │
│    30                                                                                            │
│                                                                                                  │
│ C:\Users\pc\AppData\Local\Programs\Python\Python39\lib\site-packages\onnxruntime\quantization\__ │
│ init__.py:1 in <module>                                                                          │
│                                                                                                  │
│ ❱  1 from .calibrate import (  # noqa: F401                                                      │
│    2 │   CalibraterBase,                                                                         │
│    3 │   CalibrationDataReader,                                                                  │
│    4 │   CalibrationMethod,                                                                      │
│                                                                                                  │
│ C:\Users\pc\AppData\Local\Programs\Python\Python39\lib\site-packages\onnxruntime\quantization\ca │
│ librate.py:22 in <module>                                                                        │
│                                                                                                  │
│     19                                                                                           │
│     20 import onnxruntime                                                                        │
│     21                                                                                           │
│ ❱   22 from .quant_utils import apply_plot, load_model_with_shape_infer, smooth_distribution     │
│     23                                                                                           │
│     24                                                                                           │
│     25 def rel_entr(pk: np.ndarray, qk: np.ndarray) -> np.ndarray:                               │
│                                                                                                  │
│ C:\Users\pc\AppData\Local\Programs\Python\Python39\lib\site-packages\onnxruntime\quantization\qu │
│ ant_utils.py:144 in <module>                                                                     │
│                                                                                                  │
│   141 │   onnx_proto.TensorProto.UINT8: numpy.dtype("uint8"),                                    │
│   142 │   onnx_proto.TensorProto.INT16: numpy.dtype("int16"),                                    │
│   143 │   onnx_proto.TensorProto.UINT16: numpy.dtype("uint16"),                                  │
│ ❱ 144 │   onnx_proto.TensorProto.FLOAT8E4M3FN: float8e4m3fn,                                     │
│   145 │   onnx_proto.TensorProto.INT4: int4,  # base_dtype is np.int8                            │
│   146 │   onnx_proto.TensorProto.UINT4: uint4,  # base_dtype is np.uint8                         │
│   147 }                                                                                          │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AttributeError: FLOAT8E4M3FN

If I install onnx==1.14.1, the error changes to:
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ D:\升级知识点\kaodian0912-1\test.py:2 in <module>                                                │
│                                                                                                  │
│   1 from optimum.onnxruntime import ORTModelForSeq2SeqLM                                         │
│ ❱ 2 from optimum.onnxruntime.configuration import OptimizationConfig                             │
│   3 from optimum.onnxruntime.optimization import ORTOptimizer                                    │
│   4                                                                                              │
│   5                                                                                              │
│                                                                                                  │
│ C:\Users\pc\AppData\Local\Programs\Python\Python39\lib\site-packages\optimum\onnxruntime\configu │
│ ration.py:27 in <module>                                                                         │
│                                                                                                  │
│    24 from packaging.version import Version, parse                                               │
│    25                                                                                            │
│    26 from onnxruntime import __version__ as ort_version                                         │
│ ❱  27 from onnxruntime.quantization import CalibraterBase, CalibrationMethod, QuantFormat, Qua   │
│    28 from onnxruntime.quantization.calibrate import create_calibrator                           │
│    29 from onnxruntime.transformers.fusion_options import FusionOptions                          │
│    30                                                                                            │
│                                                                                                  │
│ C:\Users\pc\AppData\Local\Programs\Python\Python39\lib\site-packages\onnxruntime\quantization\__ │
│ init__.py:1 in <module>                                                                          │
│                                                                                                  │
│ ❱  1 from .calibrate import (  # noqa: F401                                                      │
│    2 │   CalibraterBase,                                                                         │
│    3 │   CalibrationDataReader,                                                                  │
│    4 │   CalibrationMethod,                                                                      │
│                                                                                                  │
│ C:\Users\pc\AppData\Local\Programs\Python\Python39\lib\site-packages\onnxruntime\quantization\ca │
│ librate.py:22 in <module>                                                                        │
│                                                                                                  │
│     19                                                                                           │
│     20 import onnxruntime                                                                        │
│     21                                                                                           │
│ ❱   22 from .quant_utils import apply_plot, load_model_with_shape_infer, smooth_distribution     │
│     23                                                                                           │
│     24                                                                                           │
│     25 def rel_entr(pk: np.ndarray, qk: np.ndarray) -> np.ndarray:                               │
│                                                                                                  │
│ C:\Users\pc\AppData\Local\Programs\Python\Python39\lib\site-packages\onnxruntime\quantization\qu │
│ ant_utils.py:145 in <module>                                                                     │
│                                                                                                  │
│   142 │   onnx_proto.TensorProto.INT16: numpy.dtype("int16"),                                    │
│   143 │   onnx_proto.TensorProto.UINT16: numpy.dtype("uint16"),                                  │
│   144 │   onnx_proto.TensorProto.FLOAT8E4M3FN: float8e4m3fn,                                     │
│ ❱ 145 │   onnx_proto.TensorProto.INT4: int4,  # base_dtype is np.int8                            │
│   146 │   onnx_proto.TensorProto.UINT4: uint4,  # base_dtype is np.uint8                         │
│   147 }                                                                                          │
│   148                                                                                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AttributeError: INT4

Who can help?

No response

Reproduction (minimal, reproducible, runnable)

pip install optimum[onnxruntime-gpu]==1.8.5

from optimum.onnxruntime import ORTModelForSeq2SeqLM
from optimum.onnxruntime.configuration import OptimizationConfig
from optimum.onnxruntime.optimization import ORTOptimizer

Expected behavior

The imports should run without errors.

IlyasMoutawwakil commented 2 months ago

Hello, the version you're pinning (1.8.5) is very old and seems to install onnx and onnxruntime versions that are incompatible with each other (the onnxruntime code references a dtype that doesn't exist in the installed onnx). Make sure you have compatible versions: https://onnxruntime.ai/docs/reference/compatibility.html
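As a quick way to check this, here is a minimal sketch (not part of optimum; the helper names installed_versions and missing_dtypes are made up for illustration). It prints the installed versions and tests whether the installed onnx exposes the TensorProto members named in the tracebacks above (FLOAT8E4M3FN, INT4, UINT4):

```python
from importlib import metadata


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def missing_dtypes(tensor_proto, names=("FLOAT8E4M3FN", "INT4", "UINT4")):
    """Return the dtype names from `names` that `tensor_proto` does not define."""
    return [name for name in names if not hasattr(tensor_proto, name)]


if __name__ == "__main__":
    # Distribution names as used with pip; adjust the list to your setup.
    print(installed_versions(["optimum", "onnx", "onnxruntime", "onnxruntime-gpu"]))
    try:
        import onnx
    except ImportError as exc:
        # A "DLL load failed" error also surfaces as ImportError.
        print("onnx could not be imported:", exc)
    else:
        print("dtypes missing from onnx.TensorProto:", missing_dtypes(onnx.TensorProto))
```

If the second line lists any names, the installed onnx is older than what the installed onnxruntime expects, which matches the AttributeError shown in the report.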

Huanghong2016 commented 2 months ago

I have now updated onnx, onnxruntime and optimum:
onnx 1.16.2, onnxruntime-gpu 1.18.0, optimum 1.21.4

But there is still an error:

Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\pc\AppData\Local\Temp\jieba.cache
Loading model cost 0.408 seconds.
Prefix dict has been built successfully.
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\transformers\utils\import_utils.py", line 1586, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "C:\ProgramData\Anaconda3\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "", line 1030, in _gcd_import
  File "", line 1007, in _find_and_load
  File "", line 986, in _find_and_load_unlocked
  File "", line 680, in _load_unlocked
  File "", line 850, in exec_module
  File "", line 228, in _call_with_frames_removed
  File "C:\ProgramData\Anaconda3\lib\site-packages\optimum\onnx\graph_transformations.py", line 19, in <module>
    import onnx
  File "C:\ProgramData\Anaconda3\lib\site-packages\onnx\__init__.py", line 77, in <module>
    from onnx.onnx_cpp2py_export import ONNX_ML
ImportError: DLL load failed while importing onnx_cpp2py_export: A dynamic link library (DLL) initialization routine failed.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\transformers\utils\import_utils.py", line 1586, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "C:\ProgramData\Anaconda3\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "", line 1030, in _gcd_import
  File "", line 1007, in _find_and_load
  File "", line 986, in _find_and_load_unlocked
  File "", line 680, in _load_unlocked
  File "", line 850, in exec_module
  File "", line 228, in _call_with_frames_removed
  File "C:\ProgramData\Anaconda3\lib\site-packages\optimum\onnxruntime\modeling_seq2seq.py", line 45, in <module>
    from ..exporters.onnx import main_export
  File "C:\ProgramData\Anaconda3\lib\site-packages\optimum\exporters\__init__.py", line 16, in <module>
    from .tasks import TasksManager  # noqa
  File "C:\ProgramData\Anaconda3\lib\site-packages\optimum\exporters\tasks.py", line 141, in <module>
    class TasksManager:
  File "C:\ProgramData\Anaconda3\lib\site-packages\optimum\exporters\tasks.py", line 299, in TasksManager
    "clip-text-model": supported_tasks_mapping(
  File "C:\ProgramData\Anaconda3\lib\site-packages\optimum\exporters\tasks.py", line 113, in supported_tasks_mapping
    importlib.import_module(f"optimum.exporters.{backend}.model_configs"), config_cls_name
  File "C:\ProgramData\Anaconda3\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "C:\ProgramData\Anaconda3\lib\site-packages\optimum\exporters\onnx\model_configs.py", line 23, in <module>
    from ...onnx import merge_decoders
  File "", line 1055, in _handle_fromlist
  File "C:\ProgramData\Anaconda3\lib\site-packages\transformers\utils\import_utils.py", line 1576, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "C:\ProgramData\Anaconda3\lib\site-packages\transformers\utils\import_utils.py", line 1588, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import optimum.onnx.graph_transformations because of the following error (look up to see its traceback):
DLL load failed while importing onnx_cpp2py_export: A dynamic link library (DLL) initialization routine failed.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\upgrade_knowledge\point\predict.py", line 6, in <module>
    from optimum.onnxruntime import ORTModelForSeq2SeqLM
  File "", line 1055, in _handle_fromlist
  File "C:\ProgramData\Anaconda3\lib\site-packages\transformers\utils\import_utils.py", line 1576, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "C:\ProgramData\Anaconda3\lib\site-packages\transformers\utils\import_utils.py", line 1588, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import optimum.onnxruntime.modeling_seq2seq because of the following error (look up to see its traceback):
Failed to import optimum.onnx.graph_transformations because of the following error (look up to see its traceback):
DLL load failed while importing onnx_cpp2py_export: A dynamic link library (DLL) initialization routine failed.

IlyasMoutawwakil commented 2 months ago

Now it's a different problem, one related to your environment. Unfortunately, installing onnxruntime-gpu on Windows is not as straightforward as running pip install onnxruntime-gpu. See https://onnxruntime.ai/docs/install/#install-onnx-runtime-ort
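To narrow down which package fails to load, one option is to import each one on its own in a fresh interpreter, outside the larger script. A minimal sketch, assuming a hypothetical helper named try_import; the module list is taken from the traceback above and may need adjusting:

```python
import importlib


def try_import(module_names):
    """Import each module independently and return {name: "ok" or the error text}."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = "ok"
        except Exception as exc:
            # Broad catch on purpose: DLL failures raise ImportError, while
            # lazy-import wrappers may re-raise them as RuntimeError.
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    report = try_import(["onnx", "onnxruntime", "optimum.onnxruntime"])
    for name, status in report.items():
        print(f"{name}: {status}")

    if report["onnxruntime"] == "ok":
        import onnxruntime

        # Lists the execution providers this onnxruntime build can use.
        print(onnxruntime.get_available_providers())
```

If onnx alone already fails with the DLL error, the problem sits below optimum, in the onnx installation or the system libraries it depends on.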