This repository contains code for generating captions for images using a Transformer-based model. The model used is the VisionEncoderDecoderModel from the Hugging Face Transformers library, specifically the nlpconnect/vit-gpt2-image-captioning model.
To run this code, install the required packages by following the steps below.
First, clone this repository to your local machine using Git:
git clone https://github.com/your-username/Image_Caption_Generator_With_Transformers.git
Replace your-username with your GitHub username.
Navigate to the cloned repository and install the required Python packages using pip:
cd Image_Caption_Generator_With_Transformers
pip install -r requirements.txt
Download the pre-trained nlpconnect/vit-gpt2-image-captioning model and tokenizer from the Hugging Face model hub using the transformers library:
python download_model.py
This will download the necessary model files and save them to the models directory.
To verify that the installation was successful, you can run the provided example usage code:
python example_usage.py
This will generate captions for a sample image (sample.jpg) and print them to the console.
from transformers import VisionEncoderDecoderModel, ViTFeatureExtractor, AutoTokenizer
import torch
from PIL import Image
Call the predict_step function with a list of image paths to generate captions:
captions = predict_step(['sample2.jpg'])
print(captions)
This will output the generated captions for the given image(s).
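The body of predict_step is not shown in this README, so the following is only a minimal sketch of how such a function could look. It assumes the checkpoint is loaded directly from the Hugging Face hub, uses ViTImageProcessor (the replacement for ViTFeatureExtractor in recent Transformers releases), and picks max_length=16 and num_beams=4 as illustrative generation settings. The helper load_captioner and the constant MODEL_NAME are hypothetical names introduced for this example.

```python
from functools import lru_cache

# Checkpoint named in this README.
MODEL_NAME = "nlpconnect/vit-gpt2-image-captioning"


@lru_cache(maxsize=1)
def load_captioner():
    """Load the model, image processor, and tokenizer once and cache them."""
    # Heavy imports are deferred so that importing this module stays cheap.
    import torch
    from transformers import AutoTokenizer, VisionEncoderDecoderModel, ViTImageProcessor

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = VisionEncoderDecoderModel.from_pretrained(MODEL_NAME).to(device)
    processor = ViTImageProcessor.from_pretrained(MODEL_NAME)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    return model, processor, tokenizer, device


def predict_step(image_paths, max_length=16, num_beams=4):
    """Return one caption per image path, in input order."""
    if not image_paths:
        return []  # nothing to caption, so skip loading the model

    from PIL import Image

    model, processor, tokenizer, device = load_captioner()

    # The ViT encoder expects 3-channel input, so convert every image to RGB.
    images = [Image.open(path).convert("RGB") for path in image_paths]
    pixel_values = processor(images=images, return_tensors="pt").pixel_values.to(device)

    # Beam search decoding; max_length and num_beams are assumed values.
    output_ids = model.generate(pixel_values, max_length=max_length, num_beams=num_beams)
    captions = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
    return [caption.strip() for caption in captions]
```

Caching the loaded objects means repeated calls to predict_step reuse the same model instead of reloading it, which matters when the function is called once per request.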
The provided code includes an example usage:
predict_step(['sample2.jpg'])
Make sure your Streamlit app (app.py) is ready for deployment. Ensure that it includes all necessary dependencies and functionality.
Create a requirements.txt file in your project directory listing all the dependencies needed by your Streamlit app. You can generate this file using pip freeze > requirements.txt if you're using a virtual environment.
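Before pushing, it can help to confirm that every entry in requirements.txt resolves to a package installed in the current environment. The sketch below uses only the standard library; the function names parse_requirement_names and missing_packages are hypothetical, and the parser is deliberately simple (it skips comments and pip option lines, and does not handle URL or VCS requirements).

```python
import re
from importlib import metadata


def parse_requirement_names(text):
    """Extract distribution names from the contents of a requirements file."""
    names = []
    for raw_line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        # The name ends where extras, version specifiers, or markers begin.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that have no installed distribution."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

A typical use would be to read requirements.txt, pass its text to parse_requirement_names, and print whatever missing_packages returns.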
If you haven't already, set up a GitHub repository for your Streamlit app. Push your app.py and requirements.txt files to this repository.
In Streamlit Sharing, select your repository and the main app file (app.py). Streamlit will start building and deploying your app. You can monitor the deployment process in the Streamlit Sharing dashboard.
Once deployed, you can access your Streamlit app using the provided URL. Share this URL with others to showcase your app.
If you make changes to your app, simply push the changes to your GitHub repository. Streamlit Sharing will automatically redeploy your app with the new changes.
You can manage your deployed app in the Streamlit Sharing dashboard. From here, you can view logs, change settings, and monitor usage.
# Use the official Python image as the base image
FROM python:3.9
# Set the working directory in the container
WORKDIR /app
# Copy the requirements file
COPY requirements.txt .
# Install the required packages
RUN pip install --no-cache-dir -r requirements.txt
# Copy the rest of the application code
COPY . .
# Expose the port for the Streamlit app (default is 8501)
EXPOSE 8501
# Run the Streamlit app
CMD ["streamlit", "run", "app.py"]
Here's what the different parts of the Dockerfile do:
FROM python:3.9: This line specifies the base image for your Docker container. In this case, we're using the official Python 3.9 image.
WORKDIR /app: This sets the working directory inside the container to /app.
COPY requirements.txt .: This copies the requirements.txt file from your local machine to the container's working directory.
RUN pip install --no-cache-dir -r requirements.txt: This line installs the Python packages listed in the requirements.txt file.
COPY . .: This copies the entire contents of your local project directory (including the Streamlit app code) to the container's working directory.
EXPOSE 8501: This exposes port 8501 in the container, which is the default port that Streamlit runs on.
CMD ["streamlit", "run", "app.py"]: This is the command that will be executed when the container starts. It runs the streamlit run app.py command to start the Streamlit app.
To build the Docker image, navigate to the directory containing the Dockerfile and run the following command:
docker build -t image-caption-generator .
This will build a Docker image with the tag image-caption-generator.
To run the container and start the Streamlit app, use the following command:
docker run -p 8501:8501 image-caption-generator
This command maps the container's port 8501 to the host's port 8501, so you can access the Streamlit app in your web browser at http://localhost:8501.
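To confirm from a script that the container is actually serving on the mapped port, a small reachability check along these lines may be useful. It is a sketch using only the standard library; the function name app_is_up is hypothetical, and it simply treats any 2xx response from the root URL as success rather than relying on a Streamlit-specific endpoint.

```python
import urllib.error
import urllib.request


def app_is_up(url="http://localhost:8501", timeout=3.0):
    """Return True if an HTTP GET to `url` answers with a 2xx status."""
    # Bypass any system proxy settings, since the target is a local container.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx HTTP error.
        return False
```

Because the container can take a few seconds to start, calling app_is_up in a short retry loop after docker run is usually more reliable than a single check.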
Make sure to replace app.py with the name of your Streamlit app file if it's different.
With this Dockerfile, you can easily build and run your Streamlit image caption generator app in a Docker container, ensuring a consistent and isolated environment for your application.