Description of the desired feature:
The window_size in gradient-boosted equivalent sources currently defaults to 5 km. This would completely break for problems that cover very large or very small areas. We used it because we needed a default, but this is not ideal.
A better default would be to estimate the size of a square window that contains about 5k data points on average. 5k data points fit in the RAM of most computers, so it seems like a sensible default. Being conservative here means that we won't get memory errors from numpy in the majority of cases. In this case, the default would be window_size=None, and in .fit we would estimate a default value with:
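    if self.window_size is None:
        # Area of the data region (region_ is west, east, south, north)
        area = (self.region_[1] - self.region_[0]) * (self.region_[3] - self.region_[2])
        ndata = data.size
        # Average data density over the region
        points_per_m2 = ndata / area
        # Area of a window that holds about 5k points on average
        window_area = 5e3 / points_per_m2
        # Side length of the square window
        self.window_size_ = np.sqrt(window_area)
    else:
        self.window_size_ = self.window_size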
And we use self.window_size_ internally.
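To see how this default would scale with the survey, here is a minimal standalone sketch of the same calculation. The function name `estimate_window_size` and the `points_per_window` argument are hypothetical (they are not part of the current API), and the region is assumed to be given as (west, east, south, north) in meters.

```python
import numpy as np


def estimate_window_size(region, n_data, points_per_window=5e3):
    """
    Side length (in meters) of a square window that would contain roughly
    ``points_per_window`` data points on average.

    Hypothetical helper for illustration only. Assumes ``region`` is
    (west, east, south, north) in meters and that the data are spread
    roughly evenly over it.
    """
    west, east, south, north = region
    area = (east - west) * (north - south)
    if area <= 0 or n_data <= 0:
        raise ValueError("The region area and number of data must be positive.")
    # Average data density over the whole region
    points_per_m2 = n_data / area
    # Area of a window holding about points_per_window points
    window_area = points_per_window / points_per_m2
    return float(np.sqrt(window_area))


# One million points spread over a 100 km x 100 km region
window = estimate_window_size((0, 100e3, 0, 100e3), n_data=1_000_000)
print(f"{window:.1f} m")  # 7071.1 m
```

Because the window side depends only on the data density, shrinking the region by a factor of 10 in each direction (with the same number of points) shrinks the window by the same factor, which is the behaviour the fixed 5 km default lacks.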
As with #424, I think it's OK to break compatibility here without going through the hassle of a warning/deprecation cycle, but I will do it if others think it's needed.
Are you willing to help implement and maintain this feature?
Yes, but happy to let others do it since my time is limited.