akarshzingade / image-similarity-deep-ranking


Attempting to run in grayscale mode #13

Open cxfire16 opened 6 years ago

cxfire16 commented 6 years ago

Hi @akarshzingade, I've been trying to convert the model to run only in grayscale mode, but I keep running into an error stating:

ValueError: Error when checking input: expected input_2 to have shape (224, 224, 1) but got array with shape (224, 224, 3)

Here's what I've done so far:

The input images are in color, but I assume PIL converts them to grayscale through img.convert('L').
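For context, roughly what I'm doing is this (just a sketch, not my exact code; the file name is a placeholder):

```python
# Illustrative only: how a PIL grayscale conversion comes out shape-wise.
from PIL import Image
import numpy as np

img = Image.open('example.jpg').resize((224, 224)).convert('L')  # 'L' = single channel
arr = np.asarray(img)               # (224, 224) -- PIL drops the channel axis
arr = np.expand_dims(arr, axis=-1)  # (224, 224, 1), what input_2 expects

# If the generator still yields (224, 224, 3) arrays, the model raises the
# ValueError above even though the images on disk were converted.
print(arr.shape)
```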

I might have missed something? Thanks!

cxfire16 commented 6 years ago

Or am I just being dumb for not realizing that VGG was built and compiled only for 3 channels? 😂😂😂

cxfire16 commented 6 years ago

I guess I'll just resort to literally converting the images to grayscale, then feeding them in as RGB; this way I have uniform inputs. Can't wait to see the outcome.
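Roughly what I mean is this (a sketch only, not the exact code I'll use):

```python
# Convert the image to grayscale, then replicate the single channel three
# times so the stock 3-channel VGG input still accepts it.
from PIL import Image
import numpy as np

gray = Image.open('example.jpg').resize((224, 224)).convert('L')
rgb_like = np.stack([np.asarray(gray)] * 3, axis=-1)
print(rgb_like.shape)  # (224, 224, 3), but visually grayscale
```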

Leaving this thread open for comments and thoughts. Kindly close if desired. 😁

cxfire16 commented 6 years ago

Tweaked the iterator to override load_img so that images are loaded as grayscale, then verified the change by printing the shape of the loaded data. Reverted the rest of the changes back to the original. Currently training 😁
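The override looks roughly like this (a sketch only; the exact wiring into the iterator depends on the repo's code, and the color_mode argument needs a recent-enough Keras):

```python
# Load every image single-channel and sanity-check the resulting shape.
from keras.preprocessing import image

def load_img_gray(path, target_size=(224, 224)):
    return image.load_img(path, target_size=target_size, color_mode='grayscale')

arr = image.img_to_array(load_img_gray('example.jpg'))
print(arr.shape)  # expected: (224, 224, 1)
```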

cxfire16 commented 6 years ago

Training completed. Although it works, I'm not getting the results I want: color was still a heavy factor in the predictions. I'm currently looking for ways to make the model disregard the colors in the image.

akarshzingade commented 6 years ago

So, the model is relying on the color too much?

cxfire16 commented 6 years ago

It seems so. I'm going really slowly at the moment: the sampling process easily exhausts my resources, and I only have 16GB of RAM. I'm still researching the feasibility of VGG for texture recognition. If all else fails, I'll open the door to changing the network to something better suited to grayscale, and I'll also have to deal with my resource constraints 😝 I've made a few mods to the code: instead of processing the files before writing to file, I made it write to file directly so that it won't consume as much RAM. This was still ineffective, though: the indexing takes just as many resources, and after indexing it just freezes ❄️
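The write-to-file change is roughly this (illustrative names, not the actual code):

```python
# Stream each sampled triplet straight to disk instead of accumulating
# everything in memory before writing.
def write_triplets_streaming(triplet_iter, out_path='triplet_list.txt'):
    with open(out_path, 'w') as f:
        for query, positive, negative in triplet_iter:
            f.write('{},{},{}\n'.format(query, positive, negative))
```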

cxfire16 commented 6 years ago

I also want to ask: what's the ideal number of images per class? I have more than 10 classes, each with a mix of 40-50 images. Not quite enough for classification, I guess, given the number of classes.

akarshzingade commented 6 years ago

I think Inception is less sensitive to colour. You could try that.

I haven't found any article/paper that shows the colour sensitivity of VGG. The closest I have found is this: https://arxiv.org/pdf/1710.00756.pdf. They say: "Low-level features (e.g., the relu1_1 layer in VGG19) are sensitive to color appearance and thus fail to match objects with semantic similarity but different colors, e.g., matching result of blue-to-dark sky image pair at the finest level (L = 1)"

akarshzingade commented 6 years ago

50 per class is fine I think. It's the number of triplets per query image that matters. I would say 50 triplets per query image and positive image pair.

cxfire16 commented 6 years ago

Interesting! I'll give that a shot after I exhaust myself on VGG. Thankfully Keras makes things easy with its bundled application networks. Thanks Akarsh!
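Something like this, swapping the backbone via keras.applications (a sketch only; the ranking head and preprocessing would still need adjusting):

```python
from keras.applications.vgg16 import VGG16
from keras.applications.inception_v3 import InceptionV3

# Current backbone:
# base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Candidate replacement (note the larger default input size):
base = InceptionV3(weights='imagenet', include_top=False, input_shape=(299, 299, 3))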

cxfire16 commented 6 years ago

When you say "50 triplets per query image", does that mean increasing num_pos_images and num_neg_images in tripletSampler.py, so that the parameters look like the following? python tripletSampler.py --input_directory data --output_directory output --num_pos_images 10 --num_neg_images 40

akarshzingade commented 6 years ago

Using "--num_pos_images 10 --num_neg_images 40" will create 10*40 = 400 triplets per query image. What I meant to say was 50 negative images per query image and positive image pair. But this will create a lot of triplets, depending on your dataset, so you'll have to choose according to the resources available to you.
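For example (numbers purely illustrative), python tripletSampler.py --input_directory data --output_directory output --num_pos_images 5 --num_neg_images 50 would give 5*50 = 250 triplets per query image.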

cxfire16 commented 6 years ago

That's noted. Thanks Akarsh! I'll be updating soon.

akarshzingade commented 6 years ago

:)

akarshzingade commented 6 years ago

Any interesting updates, cxfire16? :)

longzeyilang commented 6 years ago

@akarshzingade @cxfire16 From my experiments, the results of this model are sensitive to colour. I want to measure object-shape similarity regardless of color. Your goal is very similar to mine; have you solved this problem?

IAmAbdusKhan commented 6 years ago

@akarshzingade @cxfire16 @longzeyilang I have implemented the model on the Street2Shop dataset. Yes, from my experience too, the model is very sensitive to colour and less selective about shape.

cxfire16 commented 6 years ago

Hey guys! I believe I have exhausted the stock configuration combined with various data preprocessing steps like image thresholding, grayscaling, etc. It seems we're on the same path of wanting to weigh attributes other than just the color. My next step is to change the network: instead of VGG, I'd use Inception perhaps, but I'll have to do further research on the color sensitivity of other networks. Nevertheless, we'll never know unless we try things. Thanks! I'll keep you posted and will be updating soon.

akarshzingade commented 6 years ago

@cxfire16 @IAmAbdusKhan There are two things that @longzeyilang pointed out in another issue: I missed taking the max of the loss and 0 for each triplet, and this code doesn't include the squared L2 regularisation (I think it is squared L2, but I need to confirm) mentioned in the paper. You could try those.
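The hinge-clipped triplet loss being referred to is roughly this (a sketch only; the query/positive/negative interleaving assumed here follows this repo's batch ordering only loosely, and the gap value is illustrative):

```python
from keras import backend as K

def triplet_hinge_loss(gap=1.0):
    def loss(y_true, y_pred):
        # Rows are assumed to be ordered query, positive, negative, query, ...
        q, p, n = y_pred[0::3], y_pred[1::3], y_pred[2::3]
        d_qp = K.sum(K.square(q - p), axis=1)  # squared L2 distance query-positive
        d_qn = K.sum(K.square(q - n), axis=1)  # squared L2 distance query-negative
        # max(0, .) stops already-satisfied triplets from contributing negative loss
        return K.mean(K.maximum(0.0, gap + d_qp - d_qn))
    return loss
```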

cxfire16 commented 6 years ago

Is InceptionV3 more resource-hungry than VGG16? I immediately get OOMs when attempting to train. I ended up reducing my batch size to just 1, because halving it would yield the old error we encountered before: IndexError: index n is out of bounds for axis y with size z

Nevertheless, it worked and I was able to train, although for quite a bit longer than with a larger batch size. Interestingly, the results are still the same: it is still color sensitive 😬

IAmAbdusKhan commented 6 years ago

@cxfire16 Yes, Inception is a much deeper network than VGG16, so using it should be more resource-intensive. Btw, shouldn't the batch size always be a multiple of 3, as per the implementation, for proper learning?


cxfire16 commented 6 years ago

@IAmAbdusKhan yes, it's always a multiple of 3, because the code does the following (so each batch holds whole query/positive/negative triplets):

batch_size = 1 then batch_size *= 3

cxfire16 commented 6 years ago

I've parked this problem for now, as I want to try storing the means for faster image search. I'm leaving this thread open for discussion for as long as @akarshzingade allows it. I might open new issues about my path toward creating a lightweight search function. Also, I'll soon be forking the repo for my implementation of the file-first method of loading images for lower-spec computers, and might as well submit PRs if I find something worth suggesting. Thanks, everyone!
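If "storing the means" ends up meaning precomputed per-image embedding vectors, a rough sketch of that kind of index would be (names and file paths purely illustrative):

```python
import numpy as np

def build_index(model, gallery_arrays, out_path='gallery_embeddings.npy'):
    # Run every gallery image through the trained network once and cache the vectors.
    embeddings = model.predict(np.stack(gallery_arrays))
    np.save(out_path, embeddings)
    return embeddings

def search(query_embedding, embeddings, top_k=5):
    # Brute-force nearest-neighbour lookup over the cached vectors.
    dists = np.linalg.norm(embeddings - query_embedding, axis=1)
    return np.argsort(dists)[:top_k]
```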