NVIDIA / modulus-sym

Framework providing pythonic APIs, algorithms and utilities to be used with Modulus core to physics inform model training as well as higher level abstraction for domain experts
https://developer.nvidia.com/modulus
Apache License 2.0
138 stars 56 forks source link

🚀[FEA]: Weighted areas for boundary samples on Tesselated geometry. #80

Open domyam97 opened 8 months ago

domyam97 commented 8 months ago

Is this a new feature, an improvement, or a change to existing functionality?

Improvement

How would you describe the priority of this feature request

Medium

Please provide a clear description of problem you would like to solve.

Currently, when boundary sampling tesselated geometry created from an .stl, all the sample points are assigned the same area value. This causes issues with Monte Carlo integration over these surfaces because the sample points are given the same weight regardless of the density of samples in the area around them. Even with quasirandom sampling this can be an issue. I implemented weighting the areas with a metric that describes how dense the sampling is around that point so that points in densely sampled areas have smaller area values.

After looking at the code for _sample() in modulus-sym/modulus/sym/geometry/tesselation.py I believe the way that the probabilites are set up for the points_per_triangle, with the law of large numbers you end up getting a number of points per triangle directly proportional to the area of the triangle. Which causes the area value for each sample to be the same across different triangles. At a smaller scale, the density within the triangle is not constant and that can cause issues when the physical values have a large gradient across the triangle (ie. Stagnation pressure at the front of a body in flow). So while the PINN may predict these values very well, the Monte Carlo integral will struggle to get a good approximation.

Describe any alternatives you have considered

I rescale the area values using NearestNeighbors search from sklearn.neighbors. Generally with PINN's we're sampling a lot of points with low dimensionality (dims <= 3), so the KDTree search is fairly efficient. The sample points are also generated as dicts of numpy arrays so doing this in the sampling method could be straightforward.

I get the distance to the k-1 nearest neighbors, because the sample point is always returned as the first nearest neighbor and take the mean distance. Then I use this distance as the diameter of a circular area around each sample point. This is a better representation of the region of influence of this sample than using the same area value for every point. However this area won't properly add up to the total area, so instead I divide this area by the mean area across all sample points and use it as a weight for the sampled area value.

geometry = Tessellation.from_stl(filename)
sample = geometry.sample_boundary(n_points)
points = np.array([sample['x'][:].flatten(),sample['y'][:].flatten(),sample['z'][:].flatten()]).T

nbrs = NearestNeighbors(n_neighbors=k, algorithm='kd_tree').fit(points)
distances, indices = nbrs.kneighbors(points)

mean_dist = np.mean(distances[:,1:], axis=1)

area_weights = np.pi*np.power(mean_dist,2)/4
area_weights = (area_weights/np.mean(area_weights)).reshape((points.shape[0],1))

sample['area'] = sample['area']*area_weights

Additional context

I encountered this issue when I was integrating the pressure on an aerodynamic surface and some areas have a higher density of samples, so the resulting drag and lift values have larger error than the learned pressure predictions.

This fix could be implemented in the _sample() function in modulus-sym/modulus/sym/geometry/tessellation.py or in the sample_boundary() function in modulus-sym/modulus/sym/geometry/geometry.py. I haven't used the other geometry classes to know if this is a general issue or specific to tesselated geometry. However I did have the same issue after scaling a tesselated geometry because scaling returns a generic Geometry object which changes the _sample() function.

Karl-JT commented 2 months ago

In #92 , I encountered the same problem when integrating pressure for drag and lift.