openvinotoolkit / openvino

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
https://docs.openvino.ai
Apache License 2.0

Getting error in distil-whisper-asr.ipynb notebook in Quantize Distil-Whisper encoder and decoder models for GPU #24479

Open tarunmcom opened 3 months ago

tarunmcom commented 3 months ago

I am getting the following error when trying to run the distil-whisper-asr.ipynb notebook. The error occurs at the 'Quantize Distil-Whisper encoder and decoder models' block, but only when the selected device is a GPU (Arc A770 or iGPU); there is no error when the device is CPU. I am running on Windows.


The failing cell (the body passed to the `%%skip` magic, shown unescaped for readability):

```python
%%skip not $to_quantize.value

import gc
import shutil
import nncf

CALIBRATION_DATASET_SIZE = 50
quantized_model_path = Path(f"{model_path}_quantized")


def quantize(ov_model: OVModelForSpeechSeq2Seq, calibration_dataset_size: int):
    if not quantized_model_path.exists():
        encoder_calibration_data, decoder_calibration_data = collect_calibration_dataset(
            ov_model, calibration_dataset_size
        )
        print("Quantizing encoder")
        quantized_encoder = nncf.quantize(
            ov_model.encoder.model,
            nncf.Dataset(encoder_calibration_data),
            subset_size=len(encoder_calibration_data),
            model_type=nncf.ModelType.TRANSFORMER,
            # Smooth Quant algorithm reduces activation quantization error; optimal alpha value was obtained through grid search
            advanced_parameters=nncf.AdvancedQuantizationParameters(smooth_quant_alpha=0.50)
        )
        ov.save_model(quantized_encoder, quantized_model_path / "openvino_encoder_model.xml")
        del quantized_encoder
        del encoder_calibration_data
        gc.collect()

        print("Quantizing decoder with past")
        quantized_decoder_with_past = nncf.quantize(
            ov_model.decoder_with_past.model,
            nncf.Dataset(decoder_calibration_data),
            subset_size=len(decoder_calibration_data),
            model_type=nncf.ModelType.TRANSFORMER,
            # Smooth Quant algorithm reduces activation quantization error; optimal alpha value was obtained through grid search
            advanced_parameters=nncf.AdvancedQuantizationParameters(smooth_quant_alpha=0.95)
        )
        ov.save_model(quantized_decoder_with_past, quantized_model_path / "openvino_decoder_with_past_model.xml")
        del quantized_decoder_with_past
        del decoder_calibration_data
        gc.collect()

        # Copy the config file and the first-step-decoder manually
        shutil.copy(model_path / "config.json", quantized_model_path / "config.json")
        shutil.copy(model_path / "openvino_decoder_model.xml", quantized_model_path / "openvino_decoder_model.xml")
        shutil.copy(model_path / "openvino_decoder_model.bin", quantized_model_path / "openvino_decoder_model.bin")

    quantized_ov_model = OVModelForSpeechSeq2Seq.from_pretrained(quantized_model_path, ov_config=ov_config, compile=False)
    quantized_ov_model.to(device.value)
    quantized_ov_model.compile()
    return quantized_ov_model


ov_quantized_model = quantize(ov_model, CALIBRATION_DATASET_SIZE)
```

The resulting traceback:

```
RuntimeError                              Traceback (most recent call last)
Cell In[21], line 1
----> 1 get_ipython().run_cell_magic('skip', 'not $to_quantize.value', <cell body above>)

File ~\miniconda3\envs\openvino_env\lib\site-packages\IPython\core\interactiveshell.py:2541, in InteractiveShell.run_cell_magic(self, magic_name, line, cell)
   2539 with self.builtin_trap:
   2540     args = (magic_arg_s, cell)
-> 2541     result = fn(*args, **kwargs)
   2543 # The code below prevents the output from being displayed
   2544 # when using magics with decorator @output_can_be_silenced
   2545 # when the last Python token in the expression is a ';'.
   2546 if getattr(fn, magic.MAGIC_OUTPUT_CAN_BE_SILENCED, False):

File ~\Downloads\openvino_notebooks\notebooks\distil-whisper-asr\skip_kernel_extension.py:17, in skip(line, cell)
     11 if eval(line):
     13     return
---> 17 get_ipython().ex(cell)

File ~\miniconda3\envs\openvino_env\lib\site-packages\IPython\core\interactiveshell.py:2878, in InteractiveShell.ex(self, cmd)
   2876 """Execute a normal python statement in user namespace."""
   2877 with self.builtin_trap:
-> 2878     exec(cmd, self.user_global_ns, self.user_ns)

File :54

File :50, in quantize(ov_model, calibration_dataset_size)

File ~\miniconda3\envs\openvino_env\lib\site-packages\optimum\intel\openvino\modeling_seq2seq.py:461, in OVModelForSeq2SeqLM.compile(self)
    460 def compile(self):
--> 461     self.encoder._compile()
    462     self.decoder._compile()
    463     if self.use_cache:

File ~\miniconda3\envs\openvino_env\lib\site-packages\optimum\intel\openvino\modeling_seq2seq.py:523, in OVEncoder._compile(self)
    521 if self.request is None:
    522     logger.info(f"Compiling the encoder to {self._device} ...")
--> 523     self.request = core.compile_model(self.model, self._device, ov_config)
    524     # OPENVINO_LOG_LEVEL can be found in https://docs.openvino.ai/2023.2/openvino_docs_OV_UG_supported_plugins_AUTO_debugging.html
    525     if "OPENVINO_LOG_LEVEL" in os.environ and int(os.environ["OPENVINO_LOG_LEVEL"]) > 2:

File ~\miniconda3\envs\openvino_env\lib\site-packages\openvino\runtime\ie_api.py:521, in Core.compile_model(self, model, device_name, config, weights)
    516 if device_name is None:
    517     return CompiledModel(
    518         super().compile_model(model, {} if config is None else config),
    519     )
    520 return CompiledModel(
--> 521     super().compile_model(model, device_name, {} if config is None else config),
    522 )
    523 else:
    524     if device_name is None:

RuntimeError: Exception from src\inference\src\cpp\core.cpp:109:
Exception from src\inference\src\dev\plugin.cpp:54:
Exception from src\core\src\dimension.cpp:227:
Cannot get length of dynamic dimension

RandyWilkinson commented 3 months ago

I am seeing the same error. Any updates?