thunder-project / thunder

scalable analysis of images and time series
http://thunder-project.org
Apache License 2.0

Negative Value Errors for Images #316

Closed kr-hansen closed 8 years ago

kr-hansen commented 8 years ago

When using the images.minus() function, sometimes the values of some pixels may become negative.

To correct for this, I would like to shift the whole image by a scalar value (the minimum of the difference between the images). However, after the minus call, any time I try to access the new images object, I get this error:

Traceback (most recent call last):
  File "", line 1, in
  File "build/bdist.linux-x86_64/egg/thunder/images/images.py", line 191, in map
  File "build/bdist.linux-x86_64/egg/thunder/base.py", line 460, in _map
  File "build/bdist.linux-x86_64/egg/bolt/spark/array.py", line 141, in map
  File "build/bdist.linux-x86_64/egg/bolt/spark/array.py", line 94, in _align
TypeError: unsupported operand type(s) for -: 'int' and 'NoneType'

Thus, I'm not able to calculate the minimum value across the images to then adjust.

A current workaround is to convert the images object to an RDD, calculate the minimum, apply the shift on the RDD, then call td.images.fromrdd() to get back to an images object.
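For reference, the shift being described can be sketched with plain NumPy arrays standing in for the image stacks. This is only an illustration of the arithmetic, not Thunder's API; the array names and shapes are made up for the example.

```python
import numpy as np

# Stand-in data (an assumption): two small stacks of float images,
# shaped (frames, rows, cols), in place of Thunder images objects.
rng = np.random.default_rng(0)
stack_a = rng.random((3, 4, 5))
stack_b = rng.random((3, 4, 5))

# Frame-by-frame subtraction; some pixels end up negative.
diff = stack_a - stack_b

# Shift the whole stack by its global minimum so the smallest pixel becomes 0.
global_min = diff.min()
shifted = diff - global_min
```

The same scalar has to be used for every frame; shifting each frame by its own minimum would change the relative intensities between frames.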

boazmohar commented 8 years ago

@kkcthans any chance that your images object dtype is uint? Could you give a minimal example?

kr-hansen commented 8 years ago

@boazmohar Minimal example of images, or of use case?

My dtype is initially uint8, but the negative values arise from applying a homomorphic filter.
I scale to between 0 and 1, which changes the dtype to 'float64', take the natural log of the pixel intensities, apply the gaussian_filter function, then subtract the gaussian_filter result frame by frame using images.minus() to remove the low-frequency components of the image. I then intend to exponentiate these values with exp(), unscale/shift them, and return them to the original image space/dtype.
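A single-frame version of that pipeline might look roughly like the sketch below, using NumPy and scipy.ndimage.gaussian_filter. The sigma value, the eps offset that guards against log(0), and the final min-max stretch back to uint8 are all assumptions for illustration, not details taken from the actual code.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def homomorphic_filter(frame, sigma=5.0, eps=1e-6):
    """Remove low-frequency illumination from one uint8 frame (sketch)."""
    # Scale to [0, 1]; this promotes uint8 to float64.
    scaled = frame.astype(np.float64) / 255.0
    # Natural log; eps (an assumption) avoids log(0) on black pixels.
    log_img = np.log(scaled + eps)
    # Estimate the low-frequency component with a Gaussian blur.
    low_freq = gaussian_filter(log_img, sigma=sigma)
    # Subtracting it leaves the high-frequency part, which can be negative.
    high_freq = log_img - low_freq
    # Back to intensity space, then stretch to the full uint8 range.
    restored = np.exp(high_freq)
    restored -= restored.min()
    peak = restored.max()
    if peak > 0:
        restored /= peak
    return np.round(restored * 255.0).astype(np.uint8)

# Random test frame standing in for real image data.
rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
filtered = homomorphic_filter(frame)
```

In log space the negative values are expected and harmless, since exp() maps them back to positive intensities.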

However, after subtracting off the low-frequency components, some values become negative and that images object essentially becomes frozen: I can't run map() or any other function call on it.

I did notice that when I do images.minus() with dtype uint8, the result wraps around, as it does for any uint8 ndarray (a pixel goes from 9 to 245 or something after the subtraction). However, since I'm artificially scaling the values to take the natural log, the data is no longer uint8 and I'm getting this issue.
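The wraparound is easy to reproduce in plain NumPy. In this small sketch the subtrahend of 20 is an assumed value, picked only because it turns 9 into 245.

```python
import numpy as np

a = np.array([9], dtype=np.uint8)
b = np.array([20], dtype=np.uint8)

# uint8 arithmetic is modulo 256, so 9 - 20 wraps to 245 instead of -11.
wrapped = a - b

# After scaling to float64 the same subtraction simply goes negative.
signed = a / 255.0 - b / 255.0
```

Here `wrapped` stays uint8 with the value 245, while `signed` is float64 and below zero.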

boazmohar commented 8 years ago

@kkcthans Sorry, I was not clear. I meant a minimal example of code that I could run and get the error. Does the same thing happen if you do:

import thunder as td
data1 = td.images.fromrandom(engine=sc, seed=0)
data2 = td.images.fromrandom(engine=sc, seed=1)
data3 = data1.minus(data2)
data3.first()

This works for me.

kr-hansen commented 8 years ago

@boazmohar Here is some code:

import thunder as td
data1 = td.images.fromrandom(engine=sc, seed=0)
data2 = td.images.fromrandom(engine=sc, seed=1)
data3 = data1.minus(data2)
data3.min()  # or data3.max(), or basically any other function in Thunder/base.py

It doesn't occur in Local Mode, however.

boazmohar commented 8 years ago

It looks like the underlying BoltArray has not been initialized properly: its split property is None. I will try to see where that happens.
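The failure mode can be illustrated without Spark. The class below is a hypothetical stand-in, not Bolt's actual code; it only assumes that `split` counts the leading key axes and gets subtracted from an integer somewhere, which is enough to produce the same TypeError as in the traceback.

```python
class FakeArray:
    """Hypothetical stand-in; not Bolt's actual class."""

    def __init__(self, shape, split=None):
        self.shape = shape
        self.split = split  # assumed meaning: number of leading "key" axes

    def value_ndim(self):
        # Fails when split was never set: int - None is undefined.
        return len(self.shape) - self.split

ok = FakeArray((10, 50, 50), split=1)
broken = FakeArray((10, 50, 50))

try:
    broken.value_ndim()
    message = None
except TypeError as err:
    message = str(err)
```

With `split` set, the method returns a number of value axes; with `split` left as None, `message` holds the "unsupported operand type(s)" text.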

boazmohar commented 8 years ago

@kkcthans That should do it.

kr-hansen commented 8 years ago

Cool. After that branch matches the commit I'll update and see if it fixed it for me.

freeman-lab commented 8 years ago

nice job figuring this out! merging the PR now, @kkcthans definitely post if there's still a problem