Open gchenfc opened 3 years ago
I hate C++ RTTI, we should adopt a LLVM-styled RTTI some day...
*No capacity to work on this for now, need to get ICRA ddl done
Temporarily using this workaround to use name-based type matching:
JacobianFactor::JacobianFactor(const GaussianFactor& gf) {
// Copy the matrix data depending on what type of factor we're copying from
if (!strcmp(typeid(gf).name(), typeid(JacobianFactor).name()))
*this = JacobianFactor(*static_cast<const JacobianFactor*>(&gf));
else if (!strcmp(typeid(gf).name(), typeid(HessianFactor).name()))
*this = JacobianFactor(*static_cast<const HessianFactor*>(&gf));
else
throw std::invalid_argument(
"In JacobianFactor(const GaussianFactor& rhs), rhs is neither a JacobianFactor nor a HessianFactor");
}
And do a similar thing in _convertOrCastToJacobians
. See https://github.com/borglab/gtsam/commit/bb393460d936eb69a1bf6d4eb3c2531416df4717
Note: for constrained optimization, you still need to manually set the linearSolverType param to QR - for some reason GTSAM is also not properly recognizing that the constrained noise models and auto-switching to QR. e.g.
params = gtsam.LevenbergMarquardtParams()
params.setLinearSolverType("MULTIFRONTAL_QR")
result = gtsam.LevenbergMarquardtOptimizer(graph, init, params).optimize()
I'm leaving this open because it needs to eventually be fixed, but this workaround works for now.
Yeah we'll figure it out after ICRA :)
Just checking, does anyone else get this error? For me just compiling everything standard and running make python-test
in gtdynamics produces this issue in the Simulator. Alternatively, minimum failing example (copy-pasted from above):
import gtsam
import gtdynamics as gtd
def main():
f = gtd.MinTorqueFactor(0, gtsam.noiseModel.Unit.Create(1))
init = gtsam.Values()
init.insert(0, 0.0)
flin = f.linearize(init)
flin.print("Linearized Factor: ")
print("\nLinearized factor has type: ", type(flin), '\n')
# output: Linearized factor has type: <class 'gtsam.gtsam.JacobianFactor'>
gtsam.JacobianFactor(flin) # this throws an error
if __name__ == '__main__':
main()
@varunagrawal @yetongumich @ProfFan
I think I already fixed this in GTSAM, maybe the GTD wrapper does not have such changes applied.
Search for RTTI in the GTSAM Issues
Are you referring to borglab/gtsam#218 ?
If so, it looks like -fvisibility-ms-compat
flag is already set inside function pybind_wrap
But this definitely sounds like the right issue (Mac dynamic linking related)
AFAIK I am suspecting that we have some violation of ODR such that multiple copies of the same class is generated.
Have no proof, the only way to be sure is adding prints in the JacobianFactor constructor and print out the typeinfo
struct.
If that were the case then wouldn't this be affecting all systems and not just mac?
No, this linking fiasco only happens with macOS... I don't exactly know why, but the visibility of the symbols of dynamic libs on macOS seems to be the culprit.
I also think it would be really helpful if you can check if the GTD lib and GTSAM is both generating typeinfo
for JacobianFactor
.
Sorry for forgetting about this for a long time.
In JacobianFactor constructor, I added:
std::cout << "Constructing a Jacobian factor from a Gaussian Factor!!\n" //
<< "typeid of JacobianFactor is " << typeid(JacobianFactor).name()
<< '\t' << typeid(JacobianFactor).hash_code() << "\n" //
<< "typeid of input argument is " << typeid(gf).name() //
<< '\t' << typeid(gf).hash_code() << std::endl;
Then in GTDynamics MinTorqueFactor construtor, I added
std::cout << "Constructing a MinTorqueFactor..." << std::endl;
std::cout << "typeid of gtdynamics' JacobianFactor is "
<< typeid(gtsam::JacobianFactor).name() << '\t'
<< typeid(gtsam::JacobianFactor).hash_code() << std::endl;
The test code in c++:
auto values = gtsam::Values();
values.insert(0, 0.0);
{
auto f = gtsam::PriorFactor(0, 0.0, gtsam::noiseModel::Unit::Create(1));
auto gaussian_factor = f.linearize(values);
gtsam::JacobianFactor jacobian_factor(*gaussian_factor);
}
{
auto f = gtdynamics::MinTorqueFactor(0, gtsam::noiseModel::Unit::Create(1));
auto gaussian_factor = f.linearize(values);
gtsam::JacobianFactor jacobian_factor(*gaussian_factor);
}
and in python:
values = gtsam.Values()
values.insert(0, 0.0)
f = gtsam.PriorFactorDouble(0, 0.0, gtsam.noiseModel.Unit.Create(1))
gaussian_factor = f.linearize(values)
jacobian_factor = gtsam.JacobianFactor(gaussian_factor)
f = gtd.MinTorqueFactor(0, gtsam.noiseModel.Unit.Create(1))
gaussian_factor = f.linearize(values)
jacobian_factor = gtsam.JacobianFactor(gaussian_factor) # this throws an error
c++ results:
Constructing a Jacobian factor from a Gaussian Factor!!
typeid of JacobianFactor is N5gtsam14JacobianFactorE 4351243596
typeid of input argument is N5gtsam14JacobianFactorE 4351243596
Constructing a MinTorqueFactor...
typeid of gtdynamics' JacobianFactor is N5gtsam14JacobianFactorE 4351243596
Constructing a Jacobian factor from a Gaussian Factor!!
typeid of JacobianFactor is N5gtsam14JacobianFactorE 4351243596
typeid of input argument is N5gtsam14JacobianFactorE 4351243596
python results:
Constructing a Jacobian factor from a Gaussian Factor!!
typeid of JacobianFactor is N5gtsam14JacobianFactorE 4398406401
typeid of input argument is N5gtsam14JacobianFactorE 4398406401
Constructing a MinTorqueFactor...
typeid of gtdynamics' JacobianFactor is N5gtsam14JacobianFactorE 4796888396
Constructing a Jacobian factor from a Gaussian Factor!!
typeid of JacobianFactor is N5gtsam14JacobianFactorE 4398406401
typeid of input argument is N5gtsam14JacobianFactorE 4796888396
Traceback (most recent call last):
File "/Users/gerry/GIT_REPOS/GTDynamics/gtdynamics/cablerobot/src/test.py", line 22, in <module>
main()
File "/Users/gerry/GIT_REPOS/GTDynamics/gtdynamics/cablerobot/src/test.py", line 14, in main
jacobian_factor = gtsam.JacobianFactor(gaussian_factor) # this throws an error
ValueError: In JacobianFactor(const GaussianFactor& rhs), rhs is neither a JacobianFactor nor a HessianFactor
So I think this is an indication that the python wrapper is using a different version of gtdynamics, so maybe we just need GTSAM_EXPORT in gtdynamics?
Going to continue working on this.
I figured out the problem: my gtdynamics.cpython-39-darwin.so
and gtsam.cpython-39-darwin.so
files are linking to different places. Specifically:
gtdynamics.cpython-39-darwin.so
is linking to /Users/gerry/lib/libgtsamDebug.4.dylib
while gtsam.cpython-39-darwin.so
is linking to /Users/gerry/GIT_REPOS/gtsam/build/gtsam/libgtsamDebug.4.dylib
(found using otool -l
)
Note that also gtdynamics.cpython-39-darwin.so
is using @rpath/libgtdynamicsDebug.dylib
, but rpath is set to:
Load command 26
cmd LC_RPATH
cmdsize 64
path /Users/gerry/GIT_REPOS/GTDynamics/build/gtdynamics (offset 12)
Load command 27
cmd LC_RPATH
cmdsize 32
path /Users/gerry/lib (offset 12)
so actually I think it's also going to be linking to the "wrong" file.
(the -fvisibility-ms-compat was not the issue - linking to the correct executable fixes the problem)
I don't know if this is specific to my machine or not. Would someone else mind pasting their output of:
otool -l /Users/gerry/miniforge3/lib/python3.9/site-packages/gtdynamics-1.0.0-py3.9.egg/gtdynamics/gtdynamics.cpython-39-darwin.so
otool -l /Users/gerry/miniforge3/lib/python3.9/site-packages/gtsam-4.1.0-py3.9.egg/gtsam/gtsam.cpython-39-darwin.so
(replacing /Users/gerry/miniforge3 with wherever your python packages are; the folder can be found using python -c 'import gtsam; print(gtsam.__path__)'
)
??? @varunagrawal @ProfFan
Description
I believe something is wrong with constructing
JacobianFactor
s from the result of alinearize()
operation when using the python wrapper. Tagging @varunagrawal and @ProfFan due to the fact this is python related.Specifically, the RTTI dynamic casts used in
JacobianFactor
constructors do not seem to be working properly. Here are two example usages:https://github.com/borglab/gtsam/blob/ff7ddf48bd8bb2f993437f01cd32a4727d6f023b/gtsam/linear/JacobianFactor.cpp#L58-L67
https://github.com/borglab/gtsam/blob/ff7ddf48bd8bb2f993437f01cd32a4727d6f023b/gtsam/linear/JacobianFactor.cpp#L196-L211
For whatever reason, nonlinear factors constructed in python, which are then called with
.linearize()
to create aJacobianFactor
, cannot be passed into theJacobianFactor(factor)
constructors becauseboost::dynamic_pointer_cast<JacobianFactor>(factor)
returns a null pointer.Steps to reproduce
This example python script fails:
Which throws the error:
Expected behavior
The python result should match that of this example C++ script succeeds:
Environment
MacOS 11.3.1, MacBook Air (M1, 2020), AppleClang 12.0.5.12050022 gtsam hash b99bf4e92912f4ad0020f79d6b222a3d6593514f
I also experienced similar errors when I ran on an ubuntu intel lab machine, but I didn't rigorously test these particular test files.
Additional information
This only occurs for nonlinear factors defined outside of the GTSAM repo. So, for example, PriorFactor, BetweenFactor, etc all still work as expected.
This causes noticeable failures on optimizations with Constrained noise models, may be causing silent correctness errors on other optimizations, and is likely causing all non-GTSAM factors to be calling the HessianFactor constructor instead of the JacobianFactor constructor (performance penalty).