GPlates / gplately

GPlately is a Python package to interrogate tectonic plate reconstructions.
https://gplates.github.io/gplately/
GNU General Public License v2.0
55 stars 12 forks source link

Raster reconstruction #15

Closed brmather closed 2 years ago

brmather commented 2 years ago

It would be good to have a raster reconstruction object that is highly efficient. @cpalfonso I have implemented a very basic effort by, converting rasters to points and using the in-built pygplates plate partitioner to reconstruct them,

https://github.com/GPlates/gplately/blob/56be2a2f51015901c9d59353a1d2e77624f92098/gplately/grids.py#L654-L663

But from what we have discussed this is very slow for large numbers of points. Can you make a contribution here?

brmather commented 2 years ago

Once this is done, @laurilano or @lauren-ilano will update the Notebook 6 example.

jcannon-gplates commented 2 years ago

I can see from the highlighted code above that each point is wrapped in a pygplates feature and then partitioned using the static polygons. And then these point features are reconstructed to get the reconstructed grid locations for the reconstructed raster.

If you want to speed this up you can use the point_in_polygons module in PlateTectonicTools (which uses a spatial tree to speed up the process of finding which polygons overlap which points that is beneficial when the points are roughly uniformly distributed). Just using that can make a fairly big difference in speed in this type of scenario. There's an example here which finds the plate ID associated with each point. Then you could directly use this plate ID with RotationModel.get_rotation(time, plate_id, from_time) that returns a FiniteRotation that can directly reconstruct the point as reconstructed_point = finite_rotation * point. For each point you can also get the static polygon time of appearance in addition to plate ID (to stop reconstructing a point prior to its associated static polygon's time of appearance). And the conversions to/from lat/lon and pygplates.PointOnSphere can be done as point = pygplates.PointOnSphere(lat, lon) and recon_lat, recon_lon = reconstructed_point.to_lat_lon() (noting that lat is before lon, the opposite of GMT, since it's been that way in GPlates since the dawn of time). A minor addition might be to group points with the same plate ID into a MultiPoinOnSphere in order to rotate them with one call reconstructed_multi_point = finite_rotation * multi_point and convert back to a lat/lon array in one call recon_lat_lon_array = reconstructed_multi_point.to_lat_lon_array() (but I'm not sure if this would speed things up much, if at all, and might just complicate things).

So it might be worth special-casing point reconstruction this different way just for reconstructing rasters to speed things up (perhaps as a private method in class Raster).

brmather commented 2 years ago

Thanks @jcannon-gplates, I think the point_in_polygons routine in PTT use a k-d tree for speedy lookups if memory serves me correctly. @cpalfonso mentioned he has developed a routine that grids the static polygons and reconstructs them back in time which is another way of looking at this. In any case we should take advantage of the grid structure to partition points correctly and quickly.

jcannon-gplates commented 2 years ago

Yeah it's essentially a k-d tree of the points (actually 8 separate quad trees in lat/lon space each covering 90x90 degrees where you can specify the tree depth). There's no tree for the polygons (so only searched in linear order) but they are sorted from largest to smallest area since a batch of points (ie, a quad tree node) is more likely to intersect the larger polygons. It's not the best approach but we needed something workable at the time. In future pyGPlates will have trees built into it.

I'm not quite sure what you mean by gridding the static polygons and reconstructing them back in time, but it sounds along the same lines, so looking forward to seeing what @cpalfonso comes up with.