NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
Apache License 2.0
11.95k stars, 2.49k forks

Unable to export diar_msdd_telephonic (neural diarizer) model to onnx format #8765

Closed chidalgoRR closed 5 months ago

chidalgoRR commented 7 months ago

Describe the bug

I am trying to export the diarization MSDD telephonic model to ONNX format, but I get the following error:

  File "/home/chidalgo/.local/lib/python3.10/site-packages/nemo/core/classes/exportable.py", line 113, in export
    out, descr, out_example = model._export(
  File "/home/chidalgo/.local/lib/python3.10/site-packages/nemo/core/classes/exportable.py", line 177, in _export
    input_example = self.input_module.input_example()
  File "/home/chidalgo/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1695, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'MyExportableModel' object has no attribute 'input_module'. Did you mean: 'output_module'?
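The message in this traceback can be misleading. When a class defines `__getattr__` (as `torch.nn.Module` does) and a property getter raises `AttributeError` internally, Python falls back to `__getattr__` and reports the property name as missing, not the attribute that was actually absent. The sketch below reproduces that mechanism in plain Python. `Base`, `ToyModel`, and `encoder` are hypothetical names for illustration only; whether `input_module` in NeMo fails for this reason is an assumption, not something the traceback confirms.

```python
class Base:
    def __getattr__(self, name):
        # Simplified stand-in for the fallback in torch.nn.Module.__getattr__:
        # called only when normal attribute lookup has failed.
        raise AttributeError(
            f"'{type(self).__name__}' object has no attribute '{name}'"
        )


class ToyModel(Base):
    @property
    def input_module(self):
        # Hypothetical getter that relies on an attribute that was never set.
        return self.encoder


def describe_failure(obj, attr):
    """Return the AttributeError message raised when reading obj.attr."""
    try:
        getattr(obj, attr)
    except AttributeError as exc:
        return str(exc)
    return "ok"


message = describe_failure(ToyModel(), "input_module")
print(message)
```

The printed message names `input_module`, even though the attribute that is really missing is `encoder`. When debugging an error like the one above, it can help to inspect what the property getter itself accesses.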

Steps/Code to reproduce bug

Here is the code I used:

import nemo
from nemo.collections.asr.models import EncDecDiarLabelModel
from nemo.core.classes import ModelPT, Exportable

class MyExportableModel(EncDecDiarLabelModel, ModelPT, Exportable):
    pass

sd_model = MyExportableModel.from_pretrained(model_name="diar_msdd_telephonic")

sd_model.eval()
sd_model.to('cuda')
sd_model.export('sd_model.onnx')

Expected behavior

A file named sd_model.onnx containing the model in ONNX format.

Additional context

I also tried the export.py script, but I get the following:

  File "/home/chidalgo/.local/lib/python3.10/site-packages/nemo/core/classes/common.py", line 507, in from_config_dict
    raise e
  File "/home/chidalgo/.local/lib/python3.10/site-packages/nemo/core/classes/common.py", line 499, in from_config_dict
    instance = cls(cfg=config, trainer=trainer)
TypeError: Can't instantiate abstract class ModelPT with abstract methods list_available_models, setup_training_data, setup_validation_data
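This `TypeError` is standard Python behavior for abstract base classes: the traceback shows `cls(cfg=config, trainer=trainer)` being called with `cls` set to `ModelPT` itself, which still has unimplemented abstract methods. The sketch below shows the same mechanism with the standard `abc` module. `BaseModel` and `ConcreteModel` are hypothetical stand-ins, not NeMo classes, and the exact wording of the message varies between Python versions.

```python
from abc import ABC, abstractmethod


class BaseModel(ABC):
    """Hypothetical stand-in for an abstract base such as ModelPT."""

    @abstractmethod
    def setup_training_data(self, cfg):
        ...

    @abstractmethod
    def setup_validation_data(self, cfg):
        ...


class ConcreteModel(BaseModel):
    """Subclass that implements every abstract method, so it can be built."""

    def setup_training_data(self, cfg):
        return f"train:{cfg}"

    def setup_validation_data(self, cfg):
        return f"val:{cfg}"


def try_instantiate(cls):
    """Return (instance, None) on success or (None, error message) on TypeError."""
    try:
        return cls(), None
    except TypeError as exc:
        return None, str(exc)


base_instance, base_error = try_instantiate(BaseModel)
concrete_instance, concrete_error = try_instantiate(ConcreteModel)
print(base_error)
print(type(concrete_instance).__name__)
```

Instantiating the abstract base fails, while the concrete subclass succeeds. An error like the one above therefore suggests that the loader resolved the target class to the abstract base and not to a concrete model class.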

Thank you. I have seen other issues reporting the same bug, but it has not been solved.

chidalgoRR commented 6 months ago

I saw that someone had the same issue with the ECAPA-TDNN model a while back, and it has not been solved.

nithinraok commented 6 months ago

It might not be possible to convert MSDD straight to ONNX, as it depends on other models. @tango4j, can you please provide steps to achieve this?

chidalgoRR commented 6 months ago

Thank you for responding, it would be very helpful for me if this can be solved.

tango4j commented 6 months ago

The NeMo neural diarization class (clustering + MSDD) is not ONNX exportable, because the PyTorch linear algebra library is not fully ONNX exportable. We are working on fully neural diarization, and only after that is released will the diarization pipeline be ONNX exportable.

chidalgoRR commented 6 months ago

Thank you for the quick response. Is there a way to export it to any other format that can be uploaded to Triton? I am trying to run a diarization pipeline on Triton and I need to upload this model.

github-actions[bot] commented 5 months ago

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] commented 5 months ago

This issue was closed because it has been inactive for 7 days since being marked as stale.