AdityaShaha / CBIR-Using-CDH

The project is an attempt to implement, in Python, the paper "Content Based Image Retrieval using Color Difference Histogram" by Guang-Hai Liu et al.

Image Retrieval using Color Difference Histogram

A Python implementation of the content-based image retrieval process based on the Color Difference Histogram, as described in the paper by Guang-Hai Liu et al. The project was done under the guidance of Prof. Naveen Kumar N.

Color Difference Histogram

The Color Difference Histogram (CDH) method counts the perceptually uniform color difference between two points under different backgrounds with regard to colors and edge orientations in the L*a*b* colorspace. L*a*b* is used because the perceptual difference between two colors in this space is related to their Euclidean distance, whereas the R, G and B components are highly correlated, so chromatic information in RGB is not directly fit for use. CDH also takes the spatial layout into account without any image segmentation, learning process or clustering.
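
To make the role of L*a*b* concrete, here is a minimal sketch of converting an RGB pixel to L*a*b* and measuring the color difference between two pixels as a Euclidean distance. It assumes sRGB input and a D65 white point, which may not match the conversion used in the paper or in this repository; the function names `rgb_to_lab` and `color_difference` are illustrative only.

```python
import math

# D65 reference white; sRGB/D65 is an assumption of this sketch.
WHITE = (0.95047, 1.0, 1.08883)


def _linearize(channel):
    """Undo the sRGB gamma curve for one 8-bit channel value (0-255)."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _f(t):
    """Piecewise cube-root function from the CIE L*a*b* definition."""
    delta = 6 / 29
    return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29


def rgb_to_lab(rgb):
    """Convert an (R, G, B) tuple of 8-bit values to an (L, a, b) tuple."""
    r, g, b = (_linearize(v) for v in rgb)
    # Linear sRGB -> CIE XYZ.
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    # XYZ -> L*a*b*, relative to the reference white.
    fx, fy, fz = (_f(v / w) for v, w in zip((x, y, z), WHITE))
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def color_difference(rgb1, rgb2):
    """Euclidean distance between two pixels in L*a*b* space."""
    return math.dist(rgb_to_lab(rgb1), rgb_to_lab(rgb2))
```

With this conversion, pure white maps to roughly L = 100 and pure black to L = 0, so the difference between them is about 100.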

Algorithm

The steps involved in the CDH are:

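Canberra distance is used as the distance measure instead of Euclidean or Manhattan distance. In Euclidean distance, the difference in each dimension is squared before summation, which places great emphasis on the features that are most dissimilar. Canberra distance instead divides each per-dimension difference by the magnitudes of the two values being compared, so a single large histogram bin does not dominate the result.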
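
The effect of the distance choice can be illustrated with a small sketch. It uses the standard Canberra definition, sum of |p_i - q_i| / (|p_i| + |q_i|); the exact variant used in the paper or in this repository may differ. The `rank` helper and the toy feature vectors are hypothetical and only serve the illustration.

```python
import math


def canberra(p, q):
    """Standard Canberra distance; terms where both values are 0 are skipped."""
    if len(p) != len(q):
        raise ValueError("feature vectors must have the same length")
    total = 0.0
    for a, b in zip(p, q):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def euclidean(p, q):
    """Euclidean distance, shown only for comparison."""
    return math.dist(p, q)


def rank(query, database, k=6):
    """Return the k (name, distance) pairs closest to the query under Canberra."""
    scored = [(name, canberra(query, feat)) for name, feat in database.items()]
    return sorted(scored, key=lambda item: item[1])[:k]


# Toy histograms: "a" differs from the query only in one large bin,
# "b" differs in two small bins.
query = [100, 1, 1]
database = {"a": [130, 1, 1], "b": [100, 5, 5]}
```

Here Euclidean distance ranks `b` closer to the query, because the difference of 30 in the large bin of `a` is squared and dominates. Canberra ranks `a` closer, because that difference is small relative to the bin's magnitude.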

The Color Difference Histogram algorithm can be regarded as an improved Multi-Texton Histogram (MTH): it treats neighboring pixels with the same color or the same edge orientation as texton types, rather than being limited to four special texton types.
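
The accumulation idea can be sketched as follows: for every pixel, the L*a*b* color difference to each neighbor is added to that pixel's histogram bin whenever the neighbor shares the same quantized value in a second map. This is a simplified reading of the method, not the repository's code. The function name `difference_histogram`, its parameters, the 8-neighbor pattern and the quantized input maps are all assumptions of this sketch.

```python
import math


def difference_histogram(bin_map, match_map, lab, n_bins, d=1):
    """Accumulate L*a*b* color differences into a histogram.

    bin_map[y][x]   : quantized index selecting the bin for pixel (x, y)
    match_map[y][x] : quantized index a neighbor must share to be counted
    lab[y][x]       : (L, a, b) tuple for pixel (x, y)
    """
    height, width = len(lab), len(lab[0])
    hist = [0.0] * n_bins
    # Neighbors at offset d along the 8 principal directions (a simplification).
    offsets = [(dx, dy) for dx in (-d, 0, d) for dy in (-d, 0, d)
               if (dx, dy) != (0, 0)]
    for y in range(height):
        for x in range(width):
            for dx, dy in offsets:
                nx, ny = x + dx, y + dy
                if 0 <= nx < width and 0 <= ny < height:
                    if match_map[y][x] == match_map[ny][nx]:
                        hist[bin_map[y][x]] += math.dist(lab[y][x], lab[ny][nx])
    return hist
```

Under these assumptions, calling the function once with the color index map as `bin_map` and the orientation index map as `match_map`, and once with the roles swapped, gives two histograms that can be concatenated into a single feature vector.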

Repository Structure

Dataset used

The project uses the Corel-10k dataset, which contains 10,000 images in 100 categories covering diverse content such as sunset, beach, flower, building, car, horses, mountains, fish, food, door, etc. Each category contains 100 images of size 192×128 or 128×192 in JPEG format. The dataset can be downloaded from Corel-10K.

Working

Sample Input:

input

The sample input is processed according to the algorithm described above, after which the top results most similar to the query image are returned.

Sample Output:

output1 output2 output3 output4 output5 output6

The outputs are broadly similar to the query image, except for one outlier (the helicopter photo). Even so, the accuracy is about 83% (5 of 6) for the top 6 images.
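
The figure above corresponds to precision over the top 6 results. A minimal sketch of that calculation follows; the function name `precision_at_k` and the category labels are hypothetical, since the category of the query image is not stated here.

```python
def precision_at_k(retrieved_labels, query_label, k):
    """Fraction of the top-k retrieved images that share the query's category."""
    top = retrieved_labels[:k]
    if not top:
        return 0.0
    return sum(1 for label in top if label == query_label) / len(top)


# Hypothetical labels: five results match the query category, one does not.
labels = ["query_class", "query_class", "helicopter",
          "query_class", "query_class", "query_class"]
score = precision_at_k(labels, "query_class", k=6)  # 5 / 6, about 0.83
```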