When I run the command:
optimum-cli export onnx --model meta-llama/Llama-3.2-11B-Vision-Instruct ./
it fails with this error:
KeyError: "Unknown task: image-text-to-text. Possible values are: audio-classification for AutoModelForAudioClassification, audio-frame-classification for AutoModelForAudioFrameClassification, audio-xvector for AutoModelForAudioXVector, automatic-speech-recognition for ('AutoModelForSpeechSeq2Seq', 'AutoModelForCTC'), depth-estimation for AutoModelForDepthEstimation, feature-extraction for AutoModel, fill-mask for AutoModelForMaskedLM, image-classification for AutoModelForImageClassification, image-segmentation for ('AutoModelForImageSegmentation', 'AutoModelForSemanticSegmentation'), image-to-image for AutoModelForImageToImage, image-to-text for AutoModelForVision2Seq, mask-generation for AutoModel, masked-im for AutoModelForMaskedImageModeling, multiple-choice for AutoModelForMultipleChoice, object-detection for AutoModelForObjectDetection, question-answering for AutoModelForQuestionAnswering, semantic-segmentation for AutoModelForSemanticSegmentation, text-to-audio for ('AutoModelForTextToSpectrogram', 'AutoModelForTextToWaveform'), text-generation for AutoModelForCausalLM, text2text-generation for AutoModelForSeq2SeqLM, text-classification for AutoModelForSequenceClassification, token-classification for AutoModelForTokenClassification, zero-shot-image-classification for AutoModelForZeroShotImageClassification, zero-shot-object-detection for AutoModelForZeroShotObjectDetection"
Are there any plans to support this model? Or is there a way I can modify the code to add support for new tasks?