aakashjhawar / dress-pattern-recognition-using-CNN

An image recognition model which is capable of identifying the pattern on a dress image
MIT License
18 stars 4 forks

Issues #3

Open kalashjindal opened 1 year ago

kalashjindal commented 1 year ago

The data CSV lists around 15k images, but the notebook uses only about 10k in total. The model was trained as a binary problem, although the real problem is a multi-class one. The only folder created in create dataset is dataset category, so how is dataset category test used in the notebooks? Reporting an accuracy of over 95% without any other metrics to support it statistically is not convincing.
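To illustrate the point about metrics, here is a minimal sketch of per-class precision and recall for a multi-class classifier, using only the standard library. The labels and predictions are made-up toy values, not results from this repository, and `per_class_report` is a hypothetical helper name. It shows how a model can reach high accuracy while never predicting a rare class.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return overall accuracy and per-class precision/recall for two label lists."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    true_pos = Counter()    # samples predicted as c that really are c
    pred_count = Counter()  # samples predicted as c
    true_count = Counter()  # samples that really are c
    for t, p in zip(y_true, y_pred):
        true_count[t] += 1
        pred_count[p] += 1
        if t == p:
            true_pos[t] += 1

    accuracy = sum(true_pos.values()) / len(y_true)
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # Report 0.0 when a class was never predicted (or never present).
        precision = true_pos[label] / pred_count[label] if pred_count[label] else 0.0
        recall = true_pos[label] / true_count[label] if true_count[label] else 0.0
        report[label] = {"precision": precision, "recall": recall}
    return accuracy, report


# Toy, made-up labels: one dominant class and one rare class.
y_true = ["floral"] * 18 + ["striped"] * 2
y_pred = ["floral"] * 20  # a model that always predicts the dominant class

accuracy, report = per_class_report(y_true, y_pred)
print("accuracy:", accuracy)
for label, scores in report.items():
    print(label, scores)
```

In this toy case the accuracy is 90%, yet recall for the rare class is zero, which accuracy alone would hide.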

kalashjindal commented 1 year ago

Added multithreading to download the images much faster:

    import os
    import threading

    import numpy as np
    import pandas as pd
    import requests

    dress_patterns_df = pd.read_csv('dress_patterns.csv')
    dress_patterns = dress_patterns_df.values

    # category
    category = set(dress_patterns_df['category'])
    print(category)

    # create a folder dataset and nested folder of category
    print(os.listdir())
    os.mkdir('dataset_category')

    for cat in category:
        print(cat)
        os.mkdir('dataset_category/' + cat)

    print(os.listdir('dataset_category'))


    def download_image(url, category, unit_id, i):
        # download one image and write it to its category folder
        try:
            r = requests.get(url, allow_redirects=True)
            with open('dataset_category/' + category + '/' + str(unit_id) + '.jpg', 'wb') as f:
                f.write(r.content)
        except Exception:
            print('ERROR at: ', i)


    # save image in respective category folder.
    threads = []
    for i in range(len(dress_patterns)):
        if i % 5 == 0:
            print(i, '/', len(dress_patterns))
        pattern = dress_patterns[i]
        url = pattern[3]
        unit_id = pattern[0]
        category = pattern[1]
        thread = threading.Thread(target=download_image, args=(url, category, unit_id, i))
        threads.append(thread)
        thread.start()

        # limit the number of threads to 5
        if len(threads) == 5:
            for thread in threads:
                thread.join()
            threads = []

    # wait for any remaining threads to complete
    for thread in threads:
        thread.join()
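One possible alternative to joining threads in batches of five is a bounded thread pool, which keeps up to five downloads in flight instead of waiting for the slowest one in each batch. The sketch below is an assumption-laden variant, not the code from this comment: `download_all`, `fetch_bytes`, and `fake_fetch` are hypothetical names, the default fetcher uses `urllib.request` from the standard library rather than `requests`, and the demo rows and URLs are made up so the example runs offline. It also returns the ids that failed, which could help explain a gap between the rows in the CSV and the images on disk.

```python
import os
import tempfile
import urllib.request
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_bytes(url, timeout=10):
    """Default fetcher (assumption): return the response body using the standard library."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(rows, out_dir, fetch=fetch_bytes, max_workers=5):
    """Save (unit_id, category, url) rows as out_dir/<category>/<unit_id>.jpg.

    Returns the list of unit_ids whose download or write failed.
    """
    def save_one(unit_id, category, url):
        folder = os.path.join(out_dir, str(category))
        os.makedirs(folder, exist_ok=True)  # safe if another thread created it first
        data = fetch(url)
        with open(os.path.join(folder, f"{unit_id}.jpg"), "wb") as f:
            f.write(data)

    failed = []
    # The pool runs at most max_workers downloads at the same time.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(save_one, *row): row[0] for row in rows}
        for future in as_completed(futures):
            try:
                future.result()  # re-raises any exception from save_one
            except Exception as exc:
                failed.append(futures[future])
                print("ERROR at:", futures[future], exc)
    return failed


# Offline demonstration with a stand-in fetcher, so no network access is needed.
def fake_fetch(url):
    if "bad" in url:
        raise OSError("simulated download failure")
    return b"fake image bytes"


demo_rows = [
    (1, "floral", "http://example.invalid/1.jpg"),
    (2, "striped", "http://example.invalid/2.jpg"),
    (3, "floral", "http://example.invalid/bad.jpg"),
]
demo_dir = tempfile.mkdtemp()
demo_failed = download_all(demo_rows, demo_dir, fetch=fake_fetch)
print("failed unit ids:", demo_failed)
```

With the CSV from the script above, the rows could be built as `[(p[0], p[1], p[3]) for p in dress_patterns]`, assuming the same column order the original loop relies on.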