Hi,
I have had a little play with the preprocessor (see code block below). It now gives the option of only masking with NaN those areas that are connected and large enough (pick a threshold, any threshold).
Difference when plotting the DEM (values) with/without identifying the connected groups.
The result: fewer (spurious) line breaks in the final ridgeplot.
cheers,
from scipy.ndimage import label  # requires additional import
# np, rank, square and img_as_ubyte are assumed to be imported already in the module

def preprocess(
    self,
    values=None,
    water_ntile=10,
    lake_flatness=2,
    vertical_ratio=40,
    filter_groups=True,
    sq_size=3,
    grp_size=9,
):
    if values is None:
        values = self.get_elevation_data()
    nan_vals = np.isnan(values)
    # fill nans with the minimum so scaling and the rank filter can run
    values[nan_vals] = np.nanmin(values)
    scaled_values = (values - np.min(values)) / (np.max(values) - np.min(values))
    is_water = scaled_values < np.percentile(scaled_values, water_ntile)
    is_lake = rank.gradient(img_as_ubyte(scaled_values), square(sq_size)) < lake_flatness
    masked = nan_vals | is_water | is_lake
    # identify connected regions of nans / water / lake
    if filter_groups:
        structure = np.ones((3, 3), dtype=int)  # i.e. fully connected (diagonals included)
        labelled, _ = label(masked, structure)
        unique, counts = np.unique(labelled, return_counts=True)
        # only set to nan connected regions that
        # are bigger than grp_size cells
        connected = np.isin(labelled, unique[counts > grp_size])
        # label 0 (the unmasked background) may also pass the size test,
        # so intersect with the mask again
        scaled_values[connected & masked] = np.nan
    else:
        scaled_values[masked] = np.nan
    scaled_values = vertical_ratio * scaled_values[-1::-1]  # switch north and south
    return scaled_values
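
To see what the group filter does on its own, here is a minimal sketch on a toy boolean mask. The helper name keep_large_groups and the 8x8 toy array are made up for illustration and are not part of the library. It assumes the same fully connected 3x3 structure and the same "bigger than grp_size" rule as the code above.

```python
import numpy as np
from scipy.ndimage import label


def keep_large_groups(mask, grp_size=9):
    """Keep only connected True regions with more than grp_size cells.

    Hypothetical helper, written for illustration only.
    """
    structure = np.ones((3, 3), dtype=int)  # diagonal neighbours count as connected
    labelled, _ = label(mask, structure)
    # cells per label; index 0 is the background (cells where mask is False)
    counts = np.bincount(labelled.ravel())
    big_labels = np.flatnonzero(counts > grp_size)
    big_labels = big_labels[big_labels != 0]  # never keep the background
    return np.isin(labelled, big_labels)


mask = np.zeros((8, 8), dtype=bool)
mask[0:4, 0:4] = True  # one 16-cell block, large enough to keep
mask[6, 6] = True      # one isolated cell, too small to keep

filtered = keep_large_groups(mask, grp_size=9)
print(filtered.astype(int))
```

The isolated cell drops out of the mask while the 4x4 block stays, which is why small flat patches no longer punch holes in the ridge lines.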