I’m new to this. I had fun experimenting with image-to-text today, but found the gpt2 model was not great. Can we add support for the Salesforce BLIP model somehow?
I’ll look into exporting it to ONNX myself, but I didn’t see it on the list for the second checkbox under prerequisites 🙃.
If I’m off base, are there any recommendations for how to improve the image-to-text pipeline?
Prerequisites
[X] The model is supported in Transformers (i.e., listed here)
[ ] The model can be exported to ONNX with Optimum (i.e., listed here)
Additional information
No response
Your contribution
I can submit a PR but I might need a little guidance on the model training piece!
Thank you!