huggingface / optimum-quanto

A PyTorch quantization backend for Optimum
Apache License 2.0

Task not supported: image-text-to-text #352

Open fangyangci opened 2 weeks ago

fangyangci commented 2 weeks ago

When I run the command: optimum-cli export onnx --model meta-llama/Llama-3.2-11B-Vision-Instruct ./

I get this error: KeyError: "Unknown task: image-text-to-text. Possible values are: audio-classification for AutoModelForAudioClassification, audio-frame-classification for AutoModelForAudioFrameClassification, audio-xvector for AutoModelForAudioXVector, automatic-speech-recognition for ('AutoModelForSpeechSeq2Seq', 'AutoModelForCTC'), depth-estimation for AutoModelForDepthEstimation, feature-extraction for AutoModel, fill-mask for AutoModelForMaskedLM, image-classification for AutoModelForImageClassification, image-segmentation for ('AutoModelForImageSegmentation', 'AutoModelForSemanticSegmentation'), image-to-image for AutoModelForImageToImage, image-to-text for AutoModelForVision2Seq, mask-generation for AutoModel, masked-im for AutoModelForMaskedImageModeling, multiple-choice for AutoModelForMultipleChoice, object-detection for AutoModelForObjectDetection, question-answering for AutoModelForQuestionAnswering, semantic-segmentation for AutoModelForSemanticSegmentation, text-to-audio for ('AutoModelForTextToSpectrogram', 'AutoModelForTextToWaveform'), text-generation for AutoModelForCausalLM, text2text-generation for AutoModelForSeq2SeqLM, text-classification for AutoModelForSequenceClassification, token-classification for AutoModelForTokenClassification, zero-shot-image-classification for AutoModelForZeroShotImageClassification, zero-shot-object-detection for AutoModelForZeroShotObjectDetection"

Are there any plans to support this model? Or is there a way I can modify the code to support new tasks?
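
To show what I think is going on, here is a minimal sketch of a task lookup that fails the same way. It is only an illustration under my own assumptions: TASK_TO_AUTO_CLASS and resolve_auto_class are hypothetical names, not the library's actual code, and the mapping is a small subset of the entries listed in the error message above.

```python
# Hypothetical stand-in for a task registry (NOT the library's real code).
# The entries are a subset of the ones printed in the error message.
TASK_TO_AUTO_CLASS = {
    "feature-extraction": "AutoModel",
    "image-to-text": "AutoModelForVision2Seq",
    "text-generation": "AutoModelForCausalLM",
}


def resolve_auto_class(task: str) -> str:
    """Return the auto class name registered for `task`, or raise KeyError."""
    if task not in TASK_TO_AUTO_CLASS:
        possible = ", ".join(
            f"{name} for {cls}" for name, cls in sorted(TASK_TO_AUTO_CLASS.items())
        )
        raise KeyError(f"Unknown task: {task}. Possible values are: {possible}")
    return TASK_TO_AUTO_CLASS[task]


if __name__ == "__main__":
    # A registered task resolves normally.
    print(resolve_auto_class("image-to-text"))

    # An unregistered task fails with the same kind of message as above.
    try:
        resolve_auto_class("image-text-to-text")
    except KeyError as err:
        print(err)
```

If the real lookup works roughly like this, then supporting a new task would presumably start with registering it in the mapping, though I assume the export itself would also need support for the model architecture.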