Segment faults - Githubissues

eoaksnes commented 2 years ago

Hi @ddemidov,

I'm getting Segment faults when trying to use mba3.

An example using the Python binding that causes segment fault:

import pytest
import numpy as np
import math
from mba import *

def test_mba3():
    rng = np.random.default_rng()
    x, y, z, d = rng.random((4, 6000))

    x_min = np.min(x)
    x_max = np.max(x)

    y_min = np.min(y)
    y_max = np.max(y)

    z_min = np.min(z)
    z_max = np.max(z)

    coo = np.column_stack((
        x, y, z
    ))

    interp = mba3(
                  lo=[x_min, y_min, z_min],
                  hi=[x_max, y_max, z_max],
                  grid=[3, 3, 3],
                  coo=coo,
                  val=d)

    w = interp([[0.3, 0.7, 1.0]])

Do you know why I'm getting the segment fault errors? E.g. are the mba3 parameters wrong?

ddemidov commented 2 years ago

MBA expects the points to be strictly within the specified boundaries, so you should add some margins:

    m = 0.01
    x_min = np.min(x) - m
    x_max = np.max(x) + m

    y_min = np.min(y) - m
    y_max = np.max(y) + m

    z_min = np.min(z) - m
    z_max = np.max(z) + m

eoaksnes commented 2 years ago

Thanks, it seemed to work with using margins. However, when I increased the number of points to around 9000 I get segment fault again. Is there any max number of points that can be used?

ddemidov commented 2 years ago

I can not reproduce this: https://gist.github.com/ddemidov/8b18bdc18aa057eaeb37abeab71eb371

eoaksnes commented 2 years ago

Thanks again, I got it to work :)

I'm trying to use the mba3 on the dataset of size 500x500x1000 points, and it took a bit too much time (around 20 minutes) for the calculations that I'm trying to do. I'm using the default settings for max_levels, tol and min_fill. I see that the code is using OpenMP to speed up things. Do you have an idea if it's possible to speed it up even more by applying more parallelism to the code? If not I might need to look at other methods :)

ddemidov commented 2 years ago

This probably is reasonable for this amount of data, just make sure you have compiled the library with the full optimization (-O3 -DNDEBUG compiler flags). The algorithm is spatially local, so you could in principle split the dataset into (slightly overlapping) subdomains and process those independently (either in parallel or in the pipeline).

If you have a fast GPU, and if most of the time is spent on the interpolation (as opposed to the setup), there is also the MBA algorithm implementation in the vexcl library:

https://vexcl.readthedocs.io/en/latest/expressions.html#scattered-data-interpolation-with-multilevel-b-splines

You will have to switch to C++ though in order to use it. Also, that implementation does not include sparse layers, so it may potentially require more memory. The setup uses CPU there, so if the setup is your bottleneck, then it won't be of much help to you.

eoaksnes commented 2 years ago

Thanks for the info. The vexcl library looks interesting :) I will run some tests and find out if it is the setup or interpolation that takes most of the time.

I guess the "randomness" of points is the tricky part if you want to split the dataset into subdomains and process those independently. To gain speed I guess have to go through all points quickly and somehow divide those points into several datasets.

eoaksnes commented 2 years ago

Thanks for the help! :) Got it to work.

ddemidov / mba

Segment faults #21