Open timholy opened 1 year ago
I tried to find a reason but I couldn't, I don't particularly think it even plagues Gray as such but by principle, it might have been better as Gray in output. It's contained as long as people just use it on Gray n Bool. It would be great if you can chime in sir @zygmuntszpak ?
This issue was noticed before: https://github.com/JuliaImages/Images.jl/pull/897#discussion_r443175312. No resolution, apparently. I think we're going to release with it the way it is, but the deprecation warning will tell users to convert the output to eltype(img)
. That way, the new behavior is identical to the old behavior, and if we change it here to return that type, that conversion will be a no-op. So at least I think we're safe.
When I first worked on the ImageBinarization
package I deliberately wanted to separate the find_threshold
function out of ImageBinarization
into HistogramThresholding
because I imagined that the function would be more widely useful beyond the image processing context (the histogram could have come from anything). Hence, in the context of the HistogramThresholding
I wasn't thinking of the various ImageCore types and it didn't even occur to me at the time to check the type of the input array and return the same type as the input array when returning the threshold.
Revisiting the implementations, it looks like we always return an index from the edges
array as the threshold, where the edges
are computed as a result of partition_interval
which was defined as:
function partition_interval(nbins::Integer, minval::Real, maxval::Real)
return range(minval, step=(maxval - minval) / nbins, length=nbins)
end
So in a nutshell, this was the origin of returning a raw number and it wasn't specifically deliberate. Looking back at the conversation Tim linked to, I noticed that Johnny pointed out the key decision that needs to be made:
This difference raises a decision to be made: should we make the result of find_threshold always be eltype(img) type?
I feel it would be relatively good to restrict it as raw numbers (the status quo of HistogramThresholding) since it would always be a scalar in practice; there's no need to display it.
I don't mind either way. We could always convert the result to eltype(histogram)
which would then automatically give the expected result for images. So basically, as far as I can see we would just need to change:
function find_threshold(data::AbstractArray, f::AbstractThresholdAlgorithm ; nbins::Union{Int,Nothing} = 256)
edges, counts = build_histogram(data, nbins)
#=
The `counts` array stores at index 0 the frequencies that were below the
first bin edge. Since we are seeking a threshold over the interval
partitioned by `edges` we need to discard the first bin in `counts`
so that the dimensions of `edges` and `counts` match.
=#
return f(view(counts, 1:lastindex(counts)), edges)
end
to something like:
function find_threshold(data::AbstractArray, f::AbstractThresholdAlgorithm ; nbins::Union{Int,Nothing} = 256)
edges, counts = build_histogram(data, nbins)
#=
The `counts` array stores at index 0 the frequencies that were below the
first bin edge. Since we are seeking a threshold over the interval
partitioned by `edges` we need to discard the first bin in `counts`
so that the dimensions of `edges` and `counts` match.
=#
T = eltype(data)
return T.(f(view(counts, 1:lastindex(counts)), edges))
That works. The only case I can imagine a potential problem is for methods that might, e.g., give meaning to non-eltype values. For example, in the case where all the data points are either 0 or 1 (aka, Bool
), a threshold of 0.5 seems like it wouldn't be a crazy thing to return.
In any event, we made "safe" choices with the Images 0.26 release: it was backwards compatible, but that doesn't prevent us from changing it in the future. So I'd be comfortable with either outcome; we should just do whatever is best.
Is the choice of returning a
Float32
rather than aGray{N0f8}
deliberate?We can do either one, but this arises in the context of replacing
otsu_threshold
with this package in the Images 0.26 release; the implementation in Images.jl returns an element of the same type.We just need a decision of whether this is an oversight or deliberate. If it's deliberate, in the depwarn we can add an explicit
eltype(img)(find_threshold...)
.