This repository contains code for generating captions for images using a Transformer-based model. The model used is the `VisionEncoderDecoderModel` from the Hugging Face Transformers library, specifically the `nlpconnect/vit-gpt2-image-captioning` model.
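As a rough illustration of how this model can be called, here is a minimal sketch, not necessarily the code used in this repository. It assumes `transformers`, `torch`, and `Pillow` are installed. The helper names `load_captioner`, `generate_caption`, and `caption_file`, and the `max_length` and `num_beams` defaults, are assumptions made for this example.

```python
MODEL_NAME = "nlpconnect/vit-gpt2-image-captioning"


def load_captioner(model_name=MODEL_NAME):
    """Load the model, image processor, and tokenizer from the Hugging Face Hub."""
    # Imported lazily so generate_caption can be used without loading transformers.
    from transformers import AutoTokenizer, ViTImageProcessor, VisionEncoderDecoderModel

    model = VisionEncoderDecoderModel.from_pretrained(model_name)
    processor = ViTImageProcessor.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model.eval()  # inference only, no training
    return model, processor, tokenizer


def generate_caption(image, model, processor, tokenizer, max_length=16, num_beams=4):
    """Return one caption string for a single RGB image."""
    # The processor resizes and normalizes the image into a pixel tensor.
    pixel_values = processor(images=[image], return_tensors="pt").pixel_values
    # Beam search over the decoder; these defaults are illustrative, not tuned.
    output_ids = model.generate(pixel_values, max_length=max_length, num_beams=num_beams)
    captions = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
    return captions[0].strip()


def caption_file(path):
    """Convenience wrapper (hypothetical): open an image file and caption it."""
    from PIL import Image

    image = Image.open(path).convert("RGB")
    model, processor, tokenizer = load_captioner()
    return generate_caption(image, model, processor, tokenizer)
```

For example, `print(caption_file("example.jpg"))` would print a caption for a local file, where `example.jpg` is a placeholder path. The first call downloads the model weights, so it may take a while.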
We need to check locally whether this PR works with https://sap-photo-cap.streamlit.app/ and fix it if it does not. After that, anyone with the link can use this project!