Neuroglycerin / neukrill-net-tools

Tools coded as part of the NDSB competition.
MIT License

Stereopsis/redshift transformation #117

Closed gngdb closed 9 years ago

gngdb commented 9 years ago

In these images, size matters because they're taken at a constant scale. We also know that it's possible to improve your score using just the size of the images. Currently, the network resizes everything to the same size and loses any information about the original size of the image. We considered keeping the sizes the same with the shapefix transformation, padding every image into the space of the largest. That's not ideal because it increases the size of our images unnecessarily (most of the images are small).

To keep the images small and still maintain information about size, we could leverage another channel. One way we could do this is by appealing to the idea of how people look at things with stereopsis: supply two channels with distortions based on the different viewpoints of each eye, dependent on image size.

An even easier way to code it would be by appealing to redshift (I know it's based on speed, but why not distance?), and scaling the colour of the images between blue and red (just use two channels) depending on how large the image was before resizing.

Of course, we'd probably also do well by placing extra information in later fully connected layers; but how to implement that is still pending.

scottclowe commented 9 years ago

Since we have grey-scale images, wouldn't redshifting them be a 6-fold increase in the size of the input space?

Maybe we could do a "wavelet" style thing where we use shapefix with different zoom factors? A channel with the image at 1:1, and another at 2:1 or 4:1?

Or input the resized version and a 4:1 scaled shapefixed version?
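The multi-scale idea could be sketched like this (a sketch only: the `shapefix` here is a hypothetical stand-in that pads an image into the centre of a fixed canvas, and the nearest-neighbour downsampling by index striding is illustrative, not the repo's actual implementation):

```python
import numpy as np

def shapefix(img, out_shape=(48, 48)):
    """Pad a grayscale image into the centre of a fixed-size canvas
    (sketch of the 'shapefix' padding discussed above)."""
    canvas = np.zeros(out_shape, dtype=img.dtype)
    h, w = img.shape
    top = (out_shape[0] - h) // 2
    left = (out_shape[1] - w) // 2
    canvas[top:top + h, left:left + w] = img
    return canvas

def multiscale_channels(img, out_shape=(48, 48), zooms=(1, 2)):
    """Stack shapefixed copies of the image at different zoom factors
    as channels, e.g. 1:1 and 2:1."""
    channels = []
    for z in zooms:
        # k:1 zoom -> keep every k-th pixel (nearest-neighbour) before padding
        small = img[::z, ::z]
        channels.append(shapefix(small, out_shape))
    return np.stack(channels, axis=-1)
```

This keeps each channel at 48x48 while the relative footprint of the plankton in the two channels encodes its original size.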

gngdb commented 9 years ago

6-fold? Isn't it just two-fold? We don't need to actually make them RGB, just change the 48 by 48 by 1 tensor to a 48 by 48 by 2 tensor. That only doubles the input space.

Yeah, two channels, one resize and one shapefixed might also work.

scottclowe commented 9 years ago

You can't redshift something which is grey. It has to be RGB values to be redder/bluer, surely?

The parallax I get - one is a translation of the other in the 2nd dimension [dimension 1 in python]. But I don't see how you envisage the two images in the 48x48x2 tensor will differ when one is redshifted.

gngdb commented 9 years ago

It's like pseudo-redshift, move between two channels (call them blue and red, but it doesn't matter) depending on the original size (smallest image is all in one channel, largest is all in the second).


scottclowe commented 9 years ago

Ah ok, so the two channels are a linear decomposition of the image:

x = (img.size - smallest_size) / (biggest_size - smallest_size)
Ch1 = img * (1 - x)
Ch2 = img * x

That's nothing like physical redshift!
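That decomposition could be sketched as follows (hypothetical names; the normalisation assumes, per the description above, that the smallest original image maps entirely to the first channel and the largest entirely to the second):

```python
import numpy as np

def pseudo_redshift(resized, orig_size, smallest, biggest):
    """Split a resized grayscale image across two channels according to
    its original pixel count: x=0 for the smallest image (all channel 1,
    'blue'), x=1 for the biggest (all channel 2, 'red')."""
    x = (orig_size - smallest) / (biggest - smallest)
    ch1 = resized * (1.0 - x)
    ch2 = resized * x
    return np.stack([ch1, ch2], axis=-1)
```

Note that `x` is computed from the pre-resize size, then applied to the already-resized 48x48 image, so the 48x48x1 input becomes 48x48x2.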


gngdb commented 9 years ago

This now exists in the ParallelDataset style models. We supply two versions of each image to independent convolutional pipelines: one is resized, the other is cropped (a bit more complicated than that, but essentially correct).
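A rough sketch of the two views that could feed such parallel pipelines (nearest-neighbour resize plus a zero-padded centre crop; as noted, the actual pipeline is a bit more complicated than this):

```python
import numpy as np

def two_views(img, out=48):
    """Produce two views of a grayscale image: a resized view (loses
    scale information) and a centre crop (keeps scale, loses extent)."""
    h, w = img.shape
    # nearest-neighbour resize by index sampling
    rows = np.arange(out) * h // out
    cols = np.arange(out) * w // out
    resized = img[np.ix_(rows, cols)]
    # centre crop, zero-padding first if the image is smaller than `out`
    ph, pw = max(out - h, 0), max(out - w, 0)
    padded = np.pad(img, ((ph // 2, ph - ph // 2), (pw // 2, pw - pw // 2)))
    H, W = padded.shape
    top, left = (H - out) // 2, (W - out) // 2
    cropped = padded[top:top + out, left:left + out]
    return resized, cropped
```

Each view then goes to its own convolutional pipeline, and the crop preserves the absolute scale the resize discards.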