Identify a visual relationship in a given image
This is based on the Kaggle Visual Relationship Track.
This project is an addendum to a larger work done in collaboration with others. However, the published code is entirely mine, and nothing shared in this repository compromises the integrity of that research. Any proposal discussed here is already public-domain knowledge, and the actual model implementing it has been withheld.
Further, no model weights have been published, so that no harm comes to this research.
Once the research is complete, and with the team's consent, I will publish the models and weights as well, because the deep learning community grows fastest when research is shared in the public domain.
The training dataset was derived from Open Images Dataset V5 and contains 329 relationship triplets with 375k training samples. These include human-object relationships (e.g. "woman playing guitar", "man holding microphone"), object-object relationships (e.g. "beer on table", "dog inside car"), and object-attribute relationships (e.g. "handbag is made of leather" and "bench is wooden").
The features of this dataset are as follows -
The following types of relationships can be inferred from any image -
Given the nature of the training data, each relationship can be decomposed into the following types
Subject Label (Ls) -> Relation (Ro) -> Object (Lo) : Bag Pack at Table
Label (Lo) -> is -> Attribute (Al) : Table is Wooden
Hence when compounding a description, following structure is achieved
((Ls) (Al1)) -> (Ro) -> ((Lo) (Al2))
e.g. Bag Pack made of Fabric at Table which is Transparent
or a simplified version
((Al1) (Ls)) -> (Ro) -> ((Al2) (Lo))
e.g. Transparent Bottle on Wooden Table
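The simplified structure above can be sketched in a few lines of Python. This is only an illustration of the composition rule, not the withheld model; the function name `describe` and its parameter names are hypothetical.

```python
from typing import Optional


def describe(subject: str, relation: str, obj: str,
             subject_attr: Optional[str] = None,
             object_attr: Optional[str] = None) -> str:
    """Compose ((Al1) (Ls)) -> (Ro) -> ((Al2) (Lo)) into a phrase.

    Attributes are optional; a missing attribute is simply left out.
    """
    def phrase(attr: Optional[str], label: str) -> str:
        # Put the attribute in front of the label when one is available.
        return f"{attr} {label}" if attr else label

    return f"{phrase(subject_attr, subject)} {relation} {phrase(object_attr, obj)}"


print(describe("Bottle", "on", "Table", "Transparent", "Wooden"))
# prints: Transparent Bottle on Wooden Table
```

Without attributes, the same call falls back to the plain triplet, e.g. `describe("Bag Pack", "at", "Table")` gives `Bag Pack at Table`.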
Three separate models have been proposed.
An out-of-the-box object detection model (YOLOv3) is used, retrained via transfer learning for 57 labels.
Features generated from labels, bounding boxes, and label ROIs are used to predict attributes.
Features generated from the embeddings of a pair of labels and their bounding boxes are used to predict relation triplets.
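The actual feature set is withheld, so the following is only a generic sketch of the kind of geometric features that can be derived from two bounding boxes. It assumes boxes are given as (xmin, ymin, xmax, ymax); the function names and the choice of features (IoU, centre offset, area ratio) are assumptions for illustration, not the features used in this project.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


def box_area(box: Box) -> float:
    """Area of a box; degenerate boxes have area 0."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def pair_features(subject_box: Box, object_box: Box) -> List[float]:
    """Hypothetical spatial features for a (subject, object) pair."""
    sx = (subject_box[0] + subject_box[2]) / 2
    sy = (subject_box[1] + subject_box[3]) / 2
    ox = (object_box[0] + object_box[2]) / 2
    oy = (object_box[1] + object_box[3]) / 2
    obj_area = box_area(object_box)
    ratio = box_area(subject_box) / obj_area if obj_area > 0 else 0.0
    # [overlap, horizontal centre offset, vertical centre offset, size ratio]
    return [iou(subject_box, object_box), ox - sx, oy - sy, ratio]
```

In a pipeline like the one described, such a vector would typically be concatenated with the label-pair embeddings before being passed to the relation classifier.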
This is a Python project and should run fine on Python 3 or later.
Install python 3.x
Create a virtual environment for python
pip3 install virtualenv
mkdir ~/.virtualenvs
pip3 install virtualenvwrapper
# Add following to bash_profile
export WORKON_HOME=$HOME/.virtualenvs
export VIRTUALENVWRAPPER_PYTHON=/usr/local/bin/python3
export VIRTUALENVWRAPPER_VIRTUALENV=/usr/local/bin/virtualenv
source ~/.bash_profile
source /usr/local/bin/virtualenvwrapper.sh
workon
mkvirtualenv visual_relations
This sets up a new virtualenv called visual_relations.
Install the required libraries for this project
pip3 install -r requirements.txt
Install MongoDB and configure it in conf/config.yaml
Set the correct MongoDB URL in config.yaml, or provide it through environment variables in .env
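The "environment variable first, config file second" lookup can be sketched as below. The variable name `MONGO_URL` and the `mongo.url` config key are hypothetical; check conf/config.yaml and the project's .env handling for the names actually used.

```python
import os
from typing import Mapping


def resolve_mongo_url(config: Mapping, env: Mapping = os.environ) -> str:
    """Return the MongoDB URL, preferring the environment over the config.

    `config` is the already-parsed content of config.yaml (a nested mapping).
    Both the MONGO_URL variable and the mongo.url key are assumed names.
    """
    url = env.get("MONGO_URL") or config.get("mongo", {}).get("url")
    if not url:
        raise ValueError("MongoDB URL not found in environment or config")
    return url


# Example with an in-memory config standing in for config.yaml
config = {"mongo": {"url": "mongodb://localhost:27017/visual_relations"}}
print(resolve_mongo_url(config, env={}))
# prints: mongodb://localhost:27017/visual_relations
```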
To work on this further, clone the repository -
git clone git@github.com:usriva2405/visual-relationship-detection-api.git
cd visual-relationship-detection-api/
There are 3 ways to run this directly (locally)
Use python to run controller directly
python app/controller/flask_controller.py
curl http://127.0.0.1:5002 # prints Welcome to Visual Relationship Prediction!
If the project has been set up correctly, this prints Welcome to Visual Relationship Prediction! on the console
Using WSGI Server for running app (without config)
You can also use following for running the app :
gunicorn -b localhost:5002 -w 1 app.controller.flask_controller:app
curl http://127.0.0.1:5002 # prints Welcome to Visual Relationship Prediction!
The app will be accessible on http://127.0.0.1:5002, the address passed to gunicorn with -b
Using WSGI Server for running app (with config)
Use following for running the app :
gunicorn -c conf/gunicorn.conf.py --log-level=debug app.controller.flask_controller:app
gunicorn -c conf/heroku-gunicorn.conf.py --log-level=debug app.controller.flask_controller:app
curl http://127.0.0.1:5002 # prints Welcome to Visual Relationship Prediction!
App would be accessible on http://0.0.0.0:5002
For building the project run
docker build --no-cache -t visual-relationship:latest .
For deploying the project run
DEV
docker run -d -p 5002:5002 --name visual-relationship -e ENVIRONMENT_VAR=DEV visual-relationship:latest
Open localhost:5002 in a browser to access the project
Optional : Have MongoDB running and accessible at the URL given in config.yaml
We can pass images as form-data (local folder uploads) for verification
URL localhost:5002/detectobjects
TYPE POST (form_data)
HEADER Content-Type : multipart/form-data
SAMPLE request key-value pairs
base_image : <<multipart form based image>>
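A form-data request like the one above can be built with the Python standard library alone. This is a minimal sketch: the helper name `build_multipart_request`, the default filename, and the generic `application/octet-stream` part type are assumptions, and the shape of the server's response is not shown because it is not documented here.

```python
import urllib.request
import uuid


def build_multipart_request(url: str, image_bytes: bytes,
                            filename: str = "image.jpg",
                            field: str = "base_image") -> urllib.request.Request:
    """Build a POST request carrying one file in a multipart/form-data body."""
    boundary = uuid.uuid4().hex  # random separator between form parts
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        (f'Content-Disposition: form-data; name="{field}"; '
         f'filename="{filename}"\r\n').encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        image_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Usage (requires the app to be running locally):
# with open("sample.jpg", "rb") as f:
#     req = build_multipart_request("http://localhost:5002/detectobjects", f.read())
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```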
We can also pass images as URLs (s3-bucket URLs) for verification
URL localhost:5002/detectobjectsjson
TYPE POST
HEADER Content-Type : "application/json"
SAMPLE request json
{
"base_image" : "<<image_url>>"
}
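The JSON variant can be called in the same way. The sketch below only builds the request; the helper name `build_json_request` is hypothetical, and the image URL in the usage comment is a placeholder.

```python
import json
import urllib.request


def build_json_request(image_url: str,
                       endpoint: str = "http://localhost:5002/detectobjectsjson"
                       ) -> urllib.request.Request:
    """Build a POST request whose JSON body carries the image URL."""
    payload = json.dumps({"base_image": image_url}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(endpoint, data=payload,
                                  headers=headers, method="POST")


# Usage (requires the app to be running locally):
# req = build_json_request("<<image_url>>")
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```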
For deploying the project to Heroku run
heroku container:login
heroku create visualrelation-api
heroku container:push web --app visualrelation-api
heroku open --app visualrelation-api
If you'd like to contribute, please fork the repository and use a feature branch. Pull requests are warmly welcome.