symforce-org / symforce

Fast symbolic computation, code generation, and nonlinear optimization for robotics
https://symforce.org
Apache License 2.0
1.44k stars 147 forks source link

Segfault in METIS #1

Closed aaron-skydio closed 2 years ago

aaron-skydio commented 2 years ago

segfault in metis for exactly five optimized variables :exploding_head:

best example I have currently, just dealing with two-vectors. the commented out jacobians/hessians look fine to me. any other chain length seems to work fine

import functools
import numpy as np

from symforce import geo
from symforce import sympy as sm
from symforce.opt.factor import Factor
from symforce.opt.optimizer import Optimizer
from symforce.values import Values

import sym

def between(x: geo.V2, y: geo.V2) -> geo.V2:
    return x - y

# if this is 4 or 6 or any other number I tried so far, it works
num_samples = 5
x_keys = [f"x{i}" for i in range(num_samples)]

values = Values()
for i in range(num_samples):
    values[x_keys[i]] = np.array([1, 2])
print(values)

factors = []

for i in range(num_samples - 1):
    factor = Factor(keys=[x_keys[i], x_keys[i + 1]], residual=between)
    factors.append(factor)

    # factor._generate_python_function(optimized_keys=factor.keys)
    # res, jac, hes, rhs = factor.generated_residual(values[x_keys[i]], values[x_keys[i + 1]])
    # print(jac)
    # import ipdb; ipdb.set_trace()

x_prior = geo.V2(1, 0)
prior_factor = Factor(
    keys=[x_keys[0]], name="prior", residual=functools.partial(between, y=x_prior)
)
factors.append(prior_factor)

# prior_factor._generate_python_function(prior_factor.keys)
# res, jac, hes, rhs = prior_factor.generated_residual(values[xs[0]])
# print(jac)

optimizer = Optimizer(factors=factors, optimized_keys=values.keys(), params=Optimizer.Params())

# segfault here in metis computing ordering
result = optimizer.optimize(values)

print(f"x_prior: {x_prior.T}")
print(result.initial_values)
print(result.optimized_values)
print(f"num iters: {len(result.iteration_stats)}")
print(f"final error: {result.error()}")

I added a print in SparseCholeskySolver<MatrixType, UpLo>::ComputePermutationMatrix, just before the failing line to ordering_(A_selfadjoint, invpermutation) to convert to dense and print:

Dense A_selfadjoint:
 1  0 -1  0  0  0  0  0  0  0
 0  1  0 -1  0  0  0  0  0  0
-1  0  1  0 -1  0  0  0  0  0
 0 -1  0  1  0 -1  0  0  0  0
 0  0 -1  0  1  0 -1  0  0  0
 0  0  0 -1  0  1  0 -1  0  0
 0  0  0  0 -1  0  1  0 -1  0
 0  0  0  0  0 -1  0  1  0 -1
 0  0  0  0  0  0 -1  0  1  0
 0  0  0  0  0  0  0 -1  0  1

with num_samples = 6, structure is the same

Dense A_selfadjoint:
 1  0 -1  0  0  0  0  0  0  0  0  0
 0  1  0 -1  0  0  0  0  0  0  0  0
-1  0  1  0 -1  0  0  0  0  0  0  0
 0 -1  0  1  0 -1  0  0  0  0  0  0
 0  0 -1  0  1  0 -1  0  0  0  0  0
 0  0  0 -1  0  1  0 -1  0  0  0  0
 0  0  0  0 -1  0  1  0 -1  0  0  0
 0  0  0  0  0 -1  0  1  0 -1  0  0
 0  0  0  0  0  0 -1  0  1  0 -1  0
 0  0  0  0  0  0  0 -1  0  1  0 -1
 0  0  0  0  0  0  0  0 -1  0  1  0
 0  0  0  0  0  0  0  0  0 -1  0  1

GDB backtrace shows the segfault in libmetis__genmmd

hayk-skydio commented 2 years ago

@aaron-skydio I notice a bunch of these old tickets are Done in the project but still Open here? Is it a manual process to do both?

aaron-skydio commented 2 years ago

Aah, I'll go through and fix them. It's automated in the reverse direction (closing the issue marks as done in the project), but there's seemingly no way to have marking as done automatically close the issue yet