LBL-EESA / TECA

TECA, the Toolkit for Extreme Climate Analysis, contains a collection of climate analysis algorithms targeted at extreme event detection and analysis.

teca_binary_segmentation choose type-dependent high/low defaults at runtime #421

Open taobrienlbl opened 4 years ago

taobrienlbl commented 4 years ago

teca_binary_segmentation.cxx should be modified to issue a warning or an error in execute() if the low_threshold_value or high_threshold_value properties are out of bounds for the type of variable to which they are being applied. Otherwise, wrap-around issues can cause unexpected and hard-to-detect behavior.

I was just using teca_binary_segmentation on netCDF data that has the byte type. I naively set the high and low values to 1 and 1e20 in order to isolate gridcells in ARTMIP that have been flagged as ARs. This produced unexpected behavior. When I set the low/high values to 1 and 2 respectively, the segmentation behaved as expected. I believe this is caused by wrap-around when the high value was typecast to byte.
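A range check along the lines requested above could look roughly like the following. This is only a sketch: `threshold_in_range` is a hypothetical helper, not an existing TECA function, and it assumes the thresholds are stored as `double` and compared against the limits of the array's element type before any cast happens.

```cpp
#include <limits>
#include <type_traits>

// Hypothetical helper (not part of TECA): returns true when `value` lies
// within the representable range of T, so that casting it to T is safe.
template <typename T>
bool threshold_in_range(double value)
{
    static_assert(std::is_arithmetic<T>::value, "T must be a numeric type");
    // NaN fails both comparisons and is therefore reported as out of range.
    // Caveat: for 64-bit integer types the limits are rounded when converted
    // to double, so values right at the boundary need extra care.
    return value >= static_cast<double>(std::numeric_limits<T>::lowest())
        && value <= static_cast<double>(std::numeric_limits<T>::max());
}
```

With a check like this, `execute()` could warn or fail when, for example, `threshold_in_range<signed char>(1e20)` returns false, instead of casting the value and segmenting against whatever the cast produces.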

taobrienlbl commented 4 years ago

Actually, I'm going to reclassify this as a medium-priority bug: the default high value also produces the out-of-bounds behavior, because the defaults are the highest/lowest values for the double type. This means that the defaults won't currently work on non-float data. I think the defaults in teca_binary_segmentation.cxx may need to be modified to be type-sensitive. This would imply that the default values need to be chosen within execute(), where the type of the input array is known.
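One way to picture type-sensitive defaults is the sketch below. It is not the TECA implementation: `segment` is a hypothetical free function, the inclusive comparison is an assumption, and the point is only that the defaults come from `std::numeric_limits<T>` for the element type rather than from `double`.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical sketch (not TECA code): binary segmentation whose default
// thresholds are the lowest and highest values of the element type T.
// Assumes inclusive bounds: low <= x <= high maps to 1, everything else to 0.
template <typename T>
std::vector<char> segment(const std::vector<T>& input,
                          T low = std::numeric_limits<T>::lowest(),
                          T high = std::numeric_limits<T>::max())
{
    std::vector<char> mask(input.size(), 0);
    for (std::size_t i = 0; i < input.size(); ++i)
    {
        // Both thresholds already have type T, so no narrowing cast occurs.
        mask[i] = (input[i] >= low && input[i] <= high) ? 1 : 0;
    }
    return mask;
}
```

For byte-typed AR flags, calling `segment<signed char>(flags, 1)` leaves the high threshold at the largest `signed char` value, which is the behavior the `1` and `1e20` settings were meant to achieve.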

burlen commented 4 years ago

sounds good. the assumption was that we'd only need to deal with floating point data, but clearly that was wrong.