Open martinfleis opened 2 years ago
I think I got it. See #5
Personally, I wanted to focus on comparing the functions available in packages from a user's perspective, rather than writing the most efficient alternatives. I also think we should compare similar functions in terms of features (with {sf} as a reference?). I know it's possible to write efficient code using e.g. {Rcpp}, {GEOS} and {data.table}, but I think that's beyond the reach of the vast majority of users.
distance: you are trying to get an NxN matrix with pairwise distances between all points (both ways?), right?
Exactly!
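For readers following along, here is a minimal, dependency-free sketch of what such an NxN pairwise distance matrix looks like. It is only an illustration of the operation, not the benchmark code from the repository or #5; the function name `pairwise_distances` is made up for this example, and it assumes planar (projected) coordinates with plain Euclidean distance.

```python
import math


def pairwise_distances(points):
    """Return an N x N matrix of Euclidean distances between 2D points.

    Assumes planar coordinates; `points` is a list of (x, y) tuples.
    """
    n = len(points)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        xi, yi = points[i]
        # Distance is symmetric, so compute each pair once and mirror it
        # ("both ways" in the resulting matrix).
        for j in range(i + 1, n):
            xj, yj = points[j]
            d = math.hypot(xi - xj, yi - yj)
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
dist = pairwise_distances(points)
```

The matrix is symmetric with zeros on the diagonal, so `dist[i][j]` and `dist[j][i]` hold the same value.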
sample: I truly don't understand what this is trying to do :D. Are you trying to get n random points that are within the polygon? Sort-of Monte Carlo simulation?
Not quite a Monte Carlo simulation. I think sampling points in polygons is standard practice in GIS :P Later, the coordinates can be retrieved from these geometries, or they can be used to extract values from a raster. Please check out sf::st_sample() as a reference. Ideally, you would implement this as a function in {geopandas}.
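To make the idea concrete, here is a rough sketch of sampling n random points inside a polygon by rejection sampling: draw points uniformly in the bounding box and keep those that fall inside. This is an assumption about one simple way to do it, not a description of how sf::st_sample() or any {geopandas} function is implemented; the helper names `point_in_polygon` and `sample_points` are hypothetical, and the polygon is assumed to be a single ring without holes.

```python
import random


def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            # Each crossing to the right of the point flips inside/outside.
            if x < x_cross:
                inside = not inside
    return inside


def sample_points(ring, n, seed=None):
    """Return n uniformly distributed random points inside the ring."""
    rng = random.Random(seed)
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    min_x, max_x = min(xs), max(xs)
    min_y, max_y = min(ys), max(ys)
    points = []
    while len(points) < n:
        # Draw a candidate in the bounding box, keep it only if it is inside.
        x = rng.uniform(min_x, max_x)
        y = rng.uniform(min_y, max_y)
        if point_in_polygon(x, y, ring):
            points.append((x, y))
    return points


triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
samples = sample_points(triangle, n=100, seed=42)
```

The resulting coordinates could then be used as described above, for example to look up raster values at those locations.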
Personally, I wanted to focus on comparing the functions available in packages from a user's perspective, rather than writing the most efficient alternatives.
Yup, I've used only functions that are available. As you can see from the discussion on intersects, there could be even faster options.
compare similar functions in terms of features ({sf} as a reference?)
As far as I know, intersects in sf uses a spatial index under the hood, which is why I opted to use one as well. But I understand if you ignore that solution :).
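As a rough illustration of why a spatial index speeds up intersects, here is a toy sketch that buckets bounding boxes into a uniform grid so a query only checks nearby candidates instead of every geometry. A uniform grid is swapped in here for simplicity; it is not the index structure that sf or geopandas actually use, the class name `GridIndex` is hypothetical, and it works on bounding boxes only (a real workflow would follow up with an exact geometry test on the candidates).

```python
from collections import defaultdict


def boxes_intersect(a, b):
    """True if two (min_x, min_y, max_x, max_y) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class GridIndex:
    """Toy spatial index: maps grid cells to the boxes that cover them."""

    def __init__(self, boxes, cell_size):
        self.boxes = boxes
        self.cell_size = cell_size
        self.cells = defaultdict(list)
        for idx, box in enumerate(boxes):
            for cell in self._cells_for(box):
                self.cells[cell].append(idx)

    def _cells_for(self, box):
        # Yield every grid cell that the box covers.
        c = self.cell_size
        x0, y0 = int(box[0] // c), int(box[1] // c)
        x1, y1 = int(box[2] // c), int(box[3] // c)
        for gx in range(x0, x1 + 1):
            for gy in range(y0, y1 + 1):
                yield (gx, gy)

    def query(self, box):
        """Return sorted indices of stored boxes intersecting `box`."""
        candidates = set()
        for cell in self._cells_for(box):
            candidates.update(self.cells.get(cell, ()))
        # Cheap prefilter done; confirm each candidate with the exact box test.
        return sorted(i for i in candidates if boxes_intersect(self.boxes[i], box))


boxes = [(0, 0, 1, 1), (5, 5, 6, 6), (0.5, 0.5, 2, 2), (10, 10, 11, 11)]
index = GridIndex(boxes, cell_size=2.0)
query_box = (0.8, 0.8, 1.5, 1.5)
hits = index.query(query_box)

# Brute-force check against every box, for comparison.
brute = [i for i, b in enumerate(boxes) if boxes_intersect(b, query_box)]
```

Both approaches return the same matches; the index simply avoids testing boxes that live in far-away cells, which is where the speed difference between indexed and non-indexed implementations comes from.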
Ideally, you would implement this as a function in {geopandas}.
We don't have anything like this right now, but the code I used in #5, replacing your custom loop, is likely quite close to what it would look like if we had it (I'll open an issue to add it in the future).
As far as I know, intersects in sf uses a spatial index under the hood, which is why I opted to use one as well. But I understand if you ignore that solution :).
My mistake, in that case {geopandas} should also use spatial indexes. Not sure if {terra} works the same way, but I believe it does. Edit: {terra} doesn't use spatial indexes.
By "compare similar functions in terms of features", I meant that the functions in {terra} and {sf} have more options (arguments), so I suspect there will be some overhead (but probably negligible) due to conditions/transformations.
Hi,
I'll make a PR changing some of the geopandas benchmarks to more performant versions, but before that I'd like to ask for some clarifications. I understand that the benchmarks are artificial, but before I start coding I want to make sure I understand what the main goal is.
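distance: you are trying to get an NxN matrix with pairwise distances between all points (both ways?), right?
sample: I truly don't understand what this is trying to do :D. Are you trying to get n random points that are within the polygon? Sort-of Monte Carlo simulation?
I think I understand the rest.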