pyvista / pyvista-support


Resampling/decimating high resolution surfaces #17

Closed · banesullivan closed this 5 years ago

banesullivan commented 5 years ago

Description

I have a very high-resolution surface mesh created from a large-scale lidar scan, which I'd like to decimate to make it easier to work with. Right now, the mesh has 180,748,027 cells and 180,989,696 nodes. At ~180 million points/cells, this isn't larger than datasets I'm used to dealing with, but it eats up resources when rendering/processing, so I'd like to decimate it to something that renders much faster.

(Screenshot: rendering of the full-resolution surface)

I currently have ~81 separate StructuredGrid meshes in a MultiBlock dataset, which renders a lot faster than a single merged UnstructuredGrid, but I'd really like one lower-resolution PolyData object that renders easily alongside other datasets.

(Screenshot: the 81 blocks rendered as a MultiBlock)

I'm struggling to figure out how to decimate a dataset this large. I merged all 81 grids into a single UnstructuredGrid and tried to run both decimate and decimate_pro, which I let run for about an hour before killing the process. I've also tried to load and decimate each patch individually, but the decimation filters again hang and take a very long time even on a single patch; almost all of them have 2,253,001 cells and 2,256,004 points.

So I have one MultiBlock dataset containing 81 StructuredGrid meshes, each with about 2,253,001 cells and 2,256,004 points. I'd like to decimate/reduce all of these meshes and combine them into a single triangulated PolyData mesh that can be loaded/rendered easily, so that I can display it next to many other datasets for this scene. However, I don't want this to take hours to run... (a rough sketch of what I tried is below).
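For reference, the per-block decimation attempt looked roughly like this (a minimal sketch; the 0.9 reduction target is illustrative, and all_surfaces is the MultiBlock built by the reader script below):

# Rough sketch of the per-block attempt; the 0.9 reduction is illustrative
decimated = pv.MultiBlock()
for i in range(all_surfaces.n_blocks):
    # the decimation filters need a triangulated PolyData
    tri = all_surfaces[i].extract_surface().tri_filter()
    # this is the step that hangs for a very long time on each block
    decimated.append(tri.decimate_pro(0.9))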

Example Data

The dataset is ~1.3 GB, so I cannot upload it here, but I can share it via Google Drive on request, along with this code to read it all into a MultiBlock dataset:

import pyvista as pv
import xarray as xr
import numpy as np
import glob
import os
# DO NOT USE PANEL WITH DATA THIS BIG
pv.rcParams['use_panel'] = False

def read_lidar_img(filename):
    """Read lidar point cloud from `.img` file to a single mesh

    Helpful: http://xarray.pydata.org/en/stable/auto_gallery/plot_rasterio.html
    """
    # Read in the data
    data = xr.open_rasterio(filename)
    values = np.asarray(data)
    nans = values == data.nodatavals
    if np.any(nans):
        # values = np.ma.masked_where(nans, values)
        values[nans] = np.nan
    # Make a mesh
    xx, yy = np.meshgrid(data['x'], data['y'])
    zz = values.reshape(xx.shape)
    # zz = np.zeros_like(xx)
    mesh = pv.StructuredGrid(xx, yy, zz)
    mesh['lidar'] = values.ravel(order='F')
    return mesh

files = glob.glob(os.path.join('./2017DTM_lidar_GP_MSH_SpiritLake/', '*.img'))
print('Number of files {}'.format(len(files)))

all_surfaces = pv.MultiBlock()
for i, f in enumerate(files):
    name = os.path.basename(f).replace('.img', '')
    print('{}/{}: {}'.format(i, len(files), name), end='\r')
    surface = read_lidar_img(f)
    all_surfaces[-1, name] = surface
print('Done.')

all_surfaces.plot(notebook=False, multi_colors=True, eye_dome_lighting=True)
banesullivan commented 5 years ago

@daavoo, is this something PyntCloud could handle?

banesullivan commented 5 years ago

@akaszynski - have you ever needed to decimate large meshes like this? Any ideas on why the decimate and decimate_pro filters might be taking a very long time?

banesullivan commented 5 years ago

One solution I have discovered is to leave the meshes in 2D (no z variation; there's a commented-out line in read_lidar_img to control this) with the elevation values as a scalar array, then use the .sample filter to resample the data onto a planar surface with the desired resolution:

...
# Merge into a single mesh - this takes ~1 minute to execute
geom = all_surfaces.extract_geometry()

grid = pv.create_grid(geom, (1000, 1000, 1))
low_res = grid.sample(geom)

# Address bug with the resample filter for no-data points
low_res['lidar'][np.argwhere(low_res['vtkValidPointMask'] == 0)] = np.nan
low_res['lidar'][low_res['lidar'] == -1.000e+38] = np.nan

# Make a 3D surface and render
surf = low_res.threshold().warp_by_scalar()
surf.plot(scalars='lidar', notebook=False,
          cmap='gist_earth', eye_dome_lighting=True)

This renders in milliseconds with 596,451 cells and 598,810 points:

(Screenshot: the resampled low-resolution surface)
akaszynski commented 5 years ago

Sorry I didn’t get back to you earlier.

Send me the link via Slack and I’ll see if I can make it work with some of my personal tools.

akaszynski commented 5 years ago

I cheated a little bit since the data is structured.

You'll have to change the indexing a little, since there are now some holes in the data.


# this is new:
from tqdm import tqdm

import pyvista as pv
import xarray as xr
import numpy as np
import glob
import os
# DO NOT USE PANEL WITH DATA THIS BIG
pv.rcParams['use_panel'] = False

# Simplification factor
sfac = 10
assert isinstance(sfac, int), 'sfac must be an int'

def read_lidar_img(filename):
    """Read lidar point cloud from `.img` file to a single mesh

    Helpful: http://xarray.pydata.org/en/stable/auto_gallery/plot_rasterio.html
    """
    # Read in the data
    data = xr.open_rasterio(filename)
    values = np.asarray(data)
    nans = values == data.nodatavals
    if np.any(nans):
        # values = np.ma.masked_where(nans, values)
        values[nans] = np.nan
    # Make a mesh
    xx, yy = np.meshgrid(data['x'], data['y'])
    zz = values.reshape(xx.shape)
    # zz = np.zeros_like(xx)
    mesh = pv.StructuredGrid(xx, yy, zz)
    mesh['lidar'] = values.ravel(order='F')
    return mesh

files = glob.glob(os.path.join('./2017DTM_lidar_GP_MSH_SpiritLake/', '*.img'))
print('Number of files {}'.format(len(files)))

# remeshed
remeshed_files = glob.glob(os.path.join('./remeshed/', '*.ply'))

all_surfaces = pv.MultiBlock()
for i, f in tqdm(enumerate(files), total=len(files)):
    name = os.path.basename(f).replace('.img', '')
    print('{}/{}: {}'.format(i, len(files), name), end='\r')
    surface = read_lidar_img(f)

    # could use pyacvd
    # import pyacvd
    # tri_surf = surface.extract_surface().tri_filter()
    # clus = pyacvd.Clustering(tri_surf)
    # clus.cluster(surface.n_points // sfac)
    # clus.create_mesh()
    # all_surfaces[-1, name] = clus.remesh
    # clus.remesh.save('./remeshed/%s.ply' % name)

    # or since it's already structured...
    # or since it's already structured, just stride every sfac-th point
    surf = pv.StructuredGrid(surface.x[::sfac, ::sfac, ::sfac],
                             surface.y[::sfac, ::sfac, ::sfac],
                             surface.z[::sfac, ::sfac, ::sfac])

    scalars = surface.point_arrays['lidar'].reshape(surface.x.shape)
    surf.point_arrays['lidar'] = scalars[::sfac, ::sfac, ::sfac].ravel()

    all_surfaces[-1, name] = surf

print('Done.')

# all_surfaces.plot(notebook=False, multi_colors=True, eye_dome_lighting=True)
# all_surfaces.plot(notebook=False, multi_colors=False, eye_dome_lighting=True)
# combined = all_surfaces.combine()
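To get from here to the single PolyData mesh the original post asks for, one option (a sketch only, untested on this dataset) would be to combine the decimated blocks, drop the NaN no-data cells with a no-argument threshold as in the .sample approach above, and extract a triangulated surface:

# Sketch: merge the decimated blocks into one triangulated PolyData
combined = all_surfaces.combine()                # single UnstructuredGrid
cleaned = combined.threshold(scalars='lidar')    # drops the NaN no-data cells
single = cleaned.extract_surface().tri_filter()  # single triangulated PolyData
single.plot(scalars='lidar', notebook=False, cmap='gist_earth')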
banesullivan commented 5 years ago

This is really awesome, thanks @akaszynski! This approach should work well for this particular dataset and my use case.

I have some lingering questions about repairing the holes in this mesh that might be useful to other users... I'll create a new issue with a more general example. (A hedged starting point is sketched below.)
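For anyone who lands here before that issue, one possible starting point for the hole repair is PolyData.fill_holes, applied to the combined surface from the sketch above (a sketch only; the hole_size of 1000 is an arbitrary guess in world units and would need tuning for this scene):

# Sketch: fill small holes in the combined triangulated surface
# hole_size is in world units; 1000 is an arbitrary guess
repaired = single.fill_holes(1000)
repaired.plot(scalars='lidar', notebook=False, cmap='gist_earth')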