PermafrostDiscoveryGateway / viz-3dtiles

PDG visualization pipeline for 3D Tile processing
Apache License 2.0

Speed up point iteration process #4

Closed laurenwalker closed 2 years ago

laurenwalker commented 2 years ago

There are two places in the builder script where each point in the geometry is iterated over. I want to see whether these can be combined into one loop, or whether both are necessary.

mbjones commented 2 years ago

I think both of these can be eliminated. Here's some sample code.

In the first loop, the iteration is intended mainly to add a constant z dimension to the coordinate points in the polygons. That can be done as a set operation rather than an explicit loop, using a lambda function. Here's an example of how all of the points in a polygon can be updated with a z value in one line of code, using an example polygon named poly:

import shapely.ops
from shapely.geometry import Point, box
poly = box(*Point(1.23, 9.87).buffer(1).bounds)
polyz = shapely.ops.transform(lambda x, y: (x, y, 0.1), poly)

The second loop seems to be mainly to combine all of the Polygon records from the GeoDataFrame into a single multipolygon. This can also be done in a single line of code with a list comprehension. Also, you don't need to keep track of the bounds, because you can get that directly from the Polygon and MultiPolygon objects in a single call.

# Combine all of the individual Polygon rows into a MultiPolygon and extract the bbox
multipolygon_z_4978 = MultiPolygon([geom for geom in gdf_4978.geometry])
box_degrees_4978 = multipolygon_z_4978.bounds # note this does not have the z dimension, which might need to be added

laurenwalker commented 2 years ago

So I reworked the point iteration process, but it didn't speed up the execution time at all, and we still need two loops: one over the original projection to add the z values, and a second to calculate the geometry triangles as the first step of the glTF conversion.

I implemented Matt's suggestion for the first loop:

polyz = shapely.ops.transform(lambda x, y: (x, y, 0.1), poly)

https://github.com/PermafrostDiscoveryGateway/viz-3dtiles/blob/d35dad0b34798880e55a014b665d13570f7c2662/build-3d-tile.py#L55-L63

But this actually slows things down because, as you can see in the main branch, we only add the z values to the convex_hull ring of the polygon, whereas the transform touches every vertex. So I added back the simplify() process on the develop branch.

https://github.com/PermafrostDiscoveryGateway/viz-3dtiles/blob/d35dad0b34798880e55a014b665d13570f7c2662/build-3d-tile.py#L52
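
To illustrate why adding z to the full polygon can do more work than adding it to the convex hull only, here is a minimal sketch. The star-shaped test polygon and the add_z helper are made up for this example and are not taken from build-3d-tile.py; the sketch also assumes the input geometry is 2D.

```python
import math

import shapely.ops
from shapely.geometry import Polygon


def add_z(geom, z=0.1):
    """Return a copy of geom with a constant z on every vertex (assumes 2D input)."""
    return shapely.ops.transform(lambda x, y: (x, y, z), geom)


# Hypothetical concave test polygon: a star with 100 vertices,
# alternating between radius 1.0 and radius 0.5.
n = 100
star = Polygon([
    ((1.0 if i % 2 == 0 else 0.5) * math.cos(2 * math.pi * i / n),
     (1.0 if i % 2 == 0 else 0.5) * math.sin(2 * math.pi * i / n))
    for i in range(n)
])

full_z = add_z(star)              # visits every vertex of the polygon
hull_z = add_z(star.convex_hull)  # visits only the hull's exterior ring

# The hull has fewer vertices, so there is less to transform.
print(len(full_z.exterior.coords), len(hull_z.exterior.coords))
```

For geometries that are already convex the two vertex counts are the same, so any difference in run time depends on how concave and detailed the input polygons are.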

> The second loop seems to be mainly to combine all of the Polygon records from the GeoDataFrame into a single multipolygon. This can also be done in a single line of code with a list comprehension. Also, you don't need to keep track of the bounds, because you can get that directly from the Polygon and MultiPolygon objects in a single call.

The second loop does a lot more than that. It creates a MultiPolygon for each Polygon and calculates the geometry triangles for that polygon. The entire dataset cannot be lumped into one MultiPolygon, or the glTF will be created as if those many polygons are all part of the same model/feature. Cesium will also read it as one model/feature. This was one of the primary issues I fixed on the main branch last week.
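
The difference between one MultiPolygon per feature and one lumped MultiPolygon can be sketched roughly as follows. The three boxes are a hypothetical stand-in for the GeoDataFrame's geometry column, and the triangle calculation and glTF export from the actual script are left out.

```python
from shapely.geometry import MultiPolygon, box

# Hypothetical stand-in for the Polygon rows of the GeoDataFrame.
polygons = [box(0, 0, 1, 1), box(2, 2, 3, 3), box(5, 5, 6, 7)]

# One MultiPolygon per Polygon: each can become its own model/feature.
per_feature = [MultiPolygon([p]) for p in polygons]

# Everything lumped together: a single geometry, hence a single model/feature.
lumped = MultiPolygon(polygons)

print(len(per_feature), len(lumped.geoms))

# Bounds are still available per feature without manual bookkeeping.
for mp in per_feature:
    print(mp.bounds)
```

A loop over the features is still needed in the per-feature case, but the bounds for each one can come straight from the geometry object.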

So all in all, I was not able to compact the code into one loop like I wanted to, and I didn't find any ways to speed up the run time.