ijyliu / computer-vision-project

Using classical and neural image embeddings and fine-tuned end-to-end networks to achieve top-tier performance on a vehicle type classification task. Containerized and deployed the model as a web app.
https://cv-web-app-3m4f2rmfzq-uc.a.run.app/

computer-vision-project

Isaac Liu, Mayank Sethi, Gaurav Sharan Srivastava, Ashutosh Tiwari

This project tackles an image classification task derived from the Stanford Cars Dataset.

In Phase 1, we preprocessed and resized >8,000 images (blurring with convolutions when appropriate) and constructed classical (gradient, color, texture) and neural (convolutional neural net, vision transformer) embeddings, which we also visualized with t-SNE. We then trained 4 different classifiers (Logistic Regression, SVM, XGBoost, Random Forest) on several variations of the data, with either no dimensionality reduction or varying amounts applied via PCA, and ran a comprehensive grid search over hyperparameters with 5-fold cross-validation. Our best classifier achieved >91% test accuracy (with further improvements possible with minimal cost to accuracy). For details, see this report and presentation.
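The Phase 1 search over PCA settings and classifier hyperparameters could be sketched roughly as follows with scikit-learn. This is a minimal illustration, not the project's actual code: the synthetic data stands in for the image embeddings, only Logistic Regression is shown, and the function names, grid values, and PCA settings are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def make_demo_data():
    """Synthetic stand-in for precomputed image embeddings (hypothetical)."""
    return make_classification(
        n_samples=300, n_features=40, n_informative=10,
        n_classes=3, random_state=0,
    )


def build_search(n_splits=5):
    """Grid search over 'no PCA' vs. PCA variants and the regularization strength."""
    pipeline = Pipeline([
        ("scale", StandardScaler()),
        ("pca", PCA()),  # placeholder step, overridden by the grid below
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    param_grid = {
        # "passthrough" skips the step entirely, i.e. no dimensionality reduction.
        "pca": ["passthrough", PCA(n_components=0.95), PCA(n_components=10)],
        "clf__C": [0.1, 1.0, 10.0],  # illustrative values only
    }
    # An integer cv with a classifier uses stratified k-fold splits.
    return GridSearchCV(pipeline, param_grid, cv=n_splits, scoring="accuracy")


if __name__ == "__main__":
    X, y = make_demo_data()
    search = build_search().fit(X, y)
    print(search.best_params_, round(search.best_score_, 3))
```

Swapping the `clf` step for an SVM, Random Forest, or XGBoost estimator (with its own grid entries) would extend the same pattern to the other classifiers.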

In Phase 2, Isaac fine-tuned a mid-sized ResNet convolutional neural network on the data in PyTorch (>93% accuracy) and deployed a web app with Flask, Docker, and Google Cloud, where you can upload your own image.

Technologies (not exhaustive!)