Open vivarose opened 10 years ago
Are the absolute values important to you, or could you normalize the values before passing them to feature?
Sure, I can normalize the values before passing them to feature, but is it really necessary? Is there anything weird that might be happening that I don't see?
It shouldn't be necessary to normalize. By default, invert=False, and so raw_image = 1 - raw_image is never called. If you need to invert black and white on your images, do it yourself before you pass them to locate. Never use invert=True on unnormalized float images.
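As a rough illustration of doing the inversion by hand, here is a minimal sketch using only NumPy. The helper name normalize_and_invert is made up for this example and is not part of the library; it rescales a float image to the range zero to one and then flips it, so that dark features become bright before the image is handed to locate.

```python
import numpy as np

def normalize_and_invert(image):
    """Rescale a float image to [0, 1], then invert it (hypothetical helper)."""
    image = np.asarray(image, dtype=float)
    lo, hi = image.min(), image.max()
    if hi == lo:
        # A flat image has no contrast to invert; return zeros.
        return np.zeros_like(image)
    normalized = (image - lo) / (hi - lo)
    return 1.0 - normalized

# Example: made-up float values well outside the zero-to-one range.
raw = np.array([[0.000001, 1.2], [2.5, 3.7]])
inverted = normalize_and_invert(raw)
# The result can then be passed to locate with the default invert=False.
```

The darkest input pixel ends up as the brightest output pixel, and the output stays within zero to one regardless of the input range.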
Doing so would not raise an error, but I am pretty sure the output would be nonsense. We could do more validation to catch this, but we have judged it not worth the cost in performance. This is a fairly unusual use case.
I am calling locate with invert=True with my unnormalized float images. I didn't realize it could cause a problem until I read the locate() source code because it seemed to be functioning correctly.
I actually can't think of a place where non-normalized or even negative grayscale values would cause a serious problem, because (a) so much of the math is linear, and (b) as far as I know, there are no hard-coded grayscale values in the source. I have been careless about that before and it never seemed to matter. @danielballan, is there a particular spot you had in mind?
Altogether, it's not a case we currently test for or have given much thought to. I think @danielballan is right that for most use cases, checking for normalized images would be a waste of time. That said, we already spend a lot of time computing a grayscale threshold as a percentile of all values. The cost of checking that all pixels are non-negative should be considerably smaller.
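For what such a check might look like, here is a sketch. check_nonnegative is a hypothetical helper, not an existing function in the library; it makes one pass over the image to find the minimum, which should be cheaper than computing a percentile.

```python
import numpy as np

def check_nonnegative(image):
    """Raise ValueError if any pixel is negative (hypothetical helper)."""
    smallest = np.min(image)
    if smallest < 0:
        raise ValueError(
            "image contains negative values (min = {})".format(smallest))
    return image

# A non-negative image passes through unchanged.
ok = check_nonnegative(np.array([[0.0, 0.5], [3.7, 1.0]]))
```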
In other words, @vivarose, we're interested in anything you learn about the matter, should you choose to continue living dangerously :).
I would worry that the negative values would foul up the total mass or eccentricity measurements, but I am not sure any more.
The negative values might also get cleaned up by the band-pass method.
Yes, I second Nathan's comment: we are interested in what you learn, and we are open to making changes.
I was worried about the threshold in the bandpass and the uncertainty estimation, but after taking a closer look I think it is actually OK. The results for "signal" and "mass" could be a little confusing to interpret, but as long as preprocessing is on and the threshold is positive (as by default) then the image seen by the location code will be nonnegative.
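To make the last point concrete, here is a toy sketch of a thresholding step. It assumes that preprocessing sets every pixel below the threshold to zero; that is an assumption for illustration, not a copy of the library code, and clip_below is a hypothetical name.

```python
import numpy as np

def clip_below(image, threshold):
    """Zero out every pixel below threshold (hypothetical helper)."""
    return np.where(image >= threshold, image, 0)

# Made-up values, some negative, standing in for a filtered image.
filtered = np.array([-2.7, -0.3, 0.5, 1.0])
cleaned = clip_below(filtered, threshold=0.1)
# With a positive threshold, no negative value can survive this step.
```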
Lines 499 to 501 of feature.py are:
To avoid degrading performance, assume gamut is zero to one.
This section of the code is inverting an image so that the dark features will become bright features that the function locate() can then find.
My concern is that I am background-dividing my images, which means that yes, I am passing an image of unnormalized floats to locate(). So the assumption is not correct for my usage.
At the moment, I'm passing an image that ranges from about .000001 to about 3.7 to locate(), and there doesn't seem to be any problem with the output, so this isn't a major concern, but the assumption could lead to issues. Should there be a mention in the documentation about this assumption?
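To see what the zero-to-one assumption does to an image like this one, here is a small NumPy illustration. The array values are made up to span roughly the range mentioned above. Because 1 - raw_image only subtracts the image from a constant, the ordering of the pixels is still reversed, but the result dips below zero.

```python
import numpy as np

# Made-up unnormalized floats spanning roughly the range described.
raw_image = np.array([0.000001, 0.8, 2.0, 3.7])

# The inversion that assumes a zero-to-one gamut.
assumed = 1 - raw_image

# An alternative that inverts about the image's own maximum instead.
about_max = raw_image.max() - raw_image

# The two results differ only by a constant offset (max - 1),
# so the relative contrast between pixels is the same in both.
offset = about_max - assumed
```

Here assumed contains negative values while about_max does not, which is the only difference between them.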