Using MotionGen.plan_batch() crashes

cremebrule commented 11 months ago

cuRobo installation mode (choose from [python, isaac sim, docker python, docker isaac sim]): isaac sim
python version: 3.10.13
Isaac Sim version (if using): 2023.1.0

Issue Details

Running MotionGen.plan_single(cu_js, ik_goal, plan_cfg) works fine; however, MotionGen.plan_batch(cu_js_batch, ik_goal_batch, plan_cfg) crashes, where cu_js_batch and ik_goal_batch are simply the stacked versions of cu_js and ik_goal (e.g.: Pose consists of (5,3)-shaped positions and (5,4)-shaped quaternions instead of (3,) and (4,), respectively). The parameters used for instantiating MotionGen and plan_cfg are identical in both cases.

Is there an example use case in Isaac Sim that I can compare / debug against? There's batch_motion_gen_reacher.py, but that uses multiple environments. I have a single environment and would just like to evaluate multiple goal poses in parallel for that single world.

Huge thanks!

balakumar-s commented 11 months ago

Here is an example: https://github.com/NVlabs/curobo/blob/c09d94908d8ad52f557ed59195c1ceb2d0434d65/examples/motion_gen_example.py#L222

If you want to generate one trajectory that reaches one of the goals (goalset planning), you can also try motion_gen.plan_goalset: https://github.com/NVlabs/curobo/blob/c09d94908d8ad52f557ed59195c1ceb2d0434d65/tests/motion_gen_module_test.py#L109

What does the crash report?

cremebrule commented 11 months ago

Huge thanks for the rapid response! I'll try out these examples, compare, and report back.

Here is the current error I've been getting:

terminate called after throwing an instance of 'c10::Error'
  what():  uc >= 0 INTERNAL ASSERT FAILED at "../c10/cuda/CUDACachingAllocator.cpp":1367, please report a bug to PyTorch. 
Exception raised from notifyCaptureDestroy at ../c10/cuda/CUDACachingAllocator.cpp:1367 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7fabab1484d7 in /scr/jdwong/isaac_sim-2023.1.0/extscache/omni.pip.torch-2_0_1-2.0.2+105.1.lx64/torch-2-0-1/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, char const*) + 0x68 (0x7fabab112434 in /scr/jdwong/isaac_sim-2023.1.0/extscache/omni.pip.torch-2_0_1-2.0.2+105.1.lx64/torch-2-0-1/torch/lib/libc10.so)
frame #2: <unknown function> + 0x27f6f (0x7fabab1c0f6f in /scr/jdwong/isaac_sim-2023.1.0/extscache/omni.pip.torch-2_0_1-2.0.2+105.1.lx64/torch-2-0-1/torch/lib/libc10_cuda.so)
frame #3: at::cuda::CUDAGraph::reset() + 0x3f (0x7fabac4e73df in /scr/jdwong/isaac_sim-2023.1.0/extscache/omni.pip.torch-2_0_1-2.0.2+105.1.lx64/torch-2-0-1/torch/lib/libtorch_cuda.so)
frame #4: at::cuda::CUDAGraph::~CUDAGraph() + 0xe (0x7fabac4e7c7e in /scr/jdwong/isaac_sim-2023.1.0/extscache/omni.pip.torch-2_0_1-2.0.2+105.1.lx64/torch-2-0-1/torch/lib/libtorch_cuda.so)
frame #5: std::_Sp_counted_ptr<at::cuda::CUDAGraph*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() + 0x12 (0x7fac12186a62 in /scr/jdwong/isaac_sim-2023.1.0/extscache/omni.pip.torch-2_0_1-2.0.2+105.1.lx64/torch-2-0-1/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0xb17b40 (0x7fac12186b40 in /scr/jdwong/isaac_sim-2023.1.0/extscache/omni.pip.torch-2_0_1-2.0.2+105.1.lx64/torch-2-0-1/torch/lib/libtorch_python.so)
frame #7: <unknown function> + 0x57c2a9 (0x7facceabe2a9 in /cvgl2/u/jdwong/miniforge3/envs/s4r/lib/python3.10/site-packages/open3d/cpu/pybind.cpython-310-x86_64-linux-gnu.so)
<omitting python frames>
frame #52: __libc_start_main + 0xf3 (0x7facdae34083 in /lib/x86_64-linux-gnu/libc.so.6)

Fatal Python error: Aborted

Thread 0x00007f9b8ffff700 (most recent call first):
  <no Python frame>

Thread 0x00007fa95d3cd700 (most recent call first):
  File "/cvgl2/u/jdwong/miniforge3/envs/s4r/lib/python3.10/concurrent/futures/thread.py", line 81 in _worker
  File "/cvgl2/u/jdwong/miniforge3/envs/s4r/lib/python3.10/threading.py", line 953 in run
  File "/cvgl2/u/jdwong/miniforge3/envs/s4r/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
  File "/cvgl2/u/jdwong/miniforge3/envs/s4r/lib/python3.10/threading.py", line 973 in _bootstrap

Current thread 0x00007facdae03740 (most recent call first):
  File "/cvgl2/u/jdwong/PAIR/curobo/src/curobo/opt/newton/newton_base.py", line 537 in _create_opt_iters_graph
  File "/cvgl2/u/jdwong/PAIR/curobo/src/curobo/opt/newton/newton_base.py", line 491 in _initialize_opt_iters_graph
  File "/cvgl2/u/jdwong/PAIR/curobo/src/curobo/opt/newton/newton_base.py", line 142 in _optimize
  File "/cvgl2/u/jdwong/PAIR/curobo/src/curobo/opt/opt_base.py", line 96 in optimize
  File "/cvgl2/u/jdwong/PAIR/curobo/src/curobo/wrap/wrap_base.py", line 71 in optimize
  File "/cvgl2/u/jdwong/PAIR/curobo/src/curobo/wrap/wrap_base.py", line 139 in solve
  File "/cvgl2/u/jdwong/PAIR/curobo/src/curobo/wrap/reacher/ik_solver.py", line 675 in solve_from_solve_state
  File "/cvgl2/u/jdwong/PAIR/curobo/src/curobo/wrap/reacher/ik_solver.py", line 537 in solve_batch
  File "/cvgl2/u/jdwong/PAIR/curobo/src/curobo/wrap/reacher/ik_solver.py", line 406 in solve_any
  File "/cvgl2/u/jdwong/PAIR/curobo/src/curobo/wrap/reacher/motion_gen.py", line 882 in _solve_ik_from_solve_state
  File "/cvgl2/u/jdwong/miniforge3/envs/s4r/lib/python3.10/contextlib.py", line 79 in inner
  File "/cvgl2/u/jdwong/PAIR/curobo/src/curobo/wrap/reacher/motion_gen.py", line 1812 in _plan_from_solve_state_batch
  File "/cvgl2/u/jdwong/PAIR/curobo/src/curobo/wrap/reacher/motion_gen.py", line 1142 in _plan_batch_attempts
  File "/cvgl2/u/jdwong/PAIR/curobo/src/curobo/wrap/reacher/motion_gen.py", line 1261 in plan_batch
  File "/cvgl2/u/jdwong/PAIR/doppelmaker/doppelmaker/utils/curobo_utils.py", line 267 in compute_trajectory
  File "/cvgl2/u/jdwong/PAIR/doppelmaker/tests/test_curobo_with_grasp.py", line 267 in <module>

Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, markupsafe._speedups, sklearn.__check_build._check_build, scipy._lib._ccallback_c, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.sparse.linalg._isolve._iterative, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg._cythonized_array_utils, scipy.linalg._flinalg, scipy.linalg._solve_toeplitz, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_lapack, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, psutil._psutil_linux, psutil._psutil_posix, scipy.special._ufuncs_cxx, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, numpy.linalg.lapack_lite, scipy.spatial._ckdtree, scipy._lib.messagestream, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.spatial.transform._rotation, scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, scipy.optimize._minpack2, scipy.optimize._group_columns, scipy.optimize._trlib._trlib, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, scipy.optimize.__nnls, scipy.optimize._highs.cython.src._highs_wrapper, scipy.optimize._highs._highs_wrapper, 
scipy.optimize._highs.cython.src._highs_constants, scipy.optimize._highs._highs_constants, scipy.linalg._interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap, scipy.optimize._direct, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.special.cython_special, scipy.stats._stats, scipy.stats.beta_ufunc, scipy.stats._boost.beta_ufunc, scipy.stats.binom_ufunc, scipy.stats._boost.binom_ufunc, scipy.stats.nbinom_ufunc, scipy.stats._boost.nbinom_ufunc, scipy.stats.hypergeom_ufunc, scipy.stats._boost.hypergeom_ufunc, scipy.stats.ncf_ufunc, scipy.stats._boost.ncf_ufunc, scipy.stats.ncx2_ufunc, scipy.stats._boost.ncx2_ufunc, scipy.stats.nct_ufunc, scipy.stats._boost.nct_ufunc, scipy.stats.skewnorm_ufunc, scipy.stats._boost.skewnorm_ufunc, scipy.stats.invgauss_ufunc, scipy.stats._boost.invgauss_ufunc, scipy.interpolate._fitpack, scipy.interpolate.dfitpack, scipy.interpolate._bspl, scipy.interpolate._ppoly, scipy.interpolate.interpnd, scipy.interpolate._rbfinterp_pythran, scipy.interpolate._rgi_cython, scipy.stats._biasedurn, scipy.stats._levy_stable.levyst, scipy.stats._stats_pythran, scipy._lib._uarray._uarray, scipy.stats._statlib, scipy.stats._mvn, scipy.stats._sobol, scipy.stats._qmc_cy, scipy.stats._rcont.rcont, sklearn.utils._isfinite, sklearn.utils.murmurhash, sklearn.utils._openmp_helpers, sklearn.metrics.cluster._expected_mutual_info_fast, sklearn.utils._logistic_sigmoid, sklearn.utils.sparsefuncs_fast, sklearn.preprocessing._csr_polynomial_expansion, sklearn.preprocessing._target_encoder_fast, sklearn.metrics._dist_metrics, sklearn.metrics._pairwise_distances_reduction._datasets_pair, sklearn.utils._cython_blas, sklearn.metrics._pairwise_distances_reduction._base, sklearn.metrics._pairwise_distances_reduction._middle_term_computer, sklearn.utils._heap, sklearn.utils._sorting, sklearn.metrics._pairwise_distances_reduction._argkmin, 
sklearn.metrics._pairwise_distances_reduction._argkmin_classmode, sklearn.utils._vector_sentinel, sklearn.metrics._pairwise_distances_reduction._radius_neighbors, sklearn.metrics._pairwise_fast, sklearn.neighbors._partition_nodes, sklearn.neighbors._ball_tree, sklearn.neighbors._kd_tree, sklearn.utils._random, sklearn.utils._seq_dataset, sklearn.linear_model._cd_fast, sklearn._loss._loss, sklearn.utils.arrayfuncs, sklearn.svm._liblinear, sklearn.svm._libsvm, sklearn.svm._libsvm_sparse, sklearn.utils._weight_vector, sklearn.linear_model._sgd_fast, sklearn.linear_model._sag_fast, sklearn.decomposition._online_lda_fast, sklearn.decomposition._cdnmf_fast, yaml._yaml, PIL._imaging, pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.dtypes, pandas._libs.tslibs.base, pandas._libs.tslibs.nattype, pandas._libs.tslibs.timezones, pandas._libs.tslibs.tzconversion, pandas._libs.tslibs.ccalendar, pandas._libs.tslibs.fields, pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.timestamps, pandas._libs.properties, pandas._libs.tslibs.offsets, pandas._libs.tslibs.parsing, pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, pandas._libs.interval, pandas._libs.tslib, pandas._libs.lib, pandas._libs.hashing, pyarrow.lib, pyarrow._hdfsio, pandas._libs.ops, pyarrow._compute, pandas._libs.arrays, pandas._libs.index, pandas._libs.join, pandas._libs.sparse, pandas._libs.reduction, pandas._libs.indexing, pandas._libs.internals, pandas._libs.writers, pandas._libs.window.aggregations, pandas._libs.window.indexers, pandas._libs.reshape, pandas._libs.tslibs.strptime, pandas._libs.groupby, pandas._libs.testing, pandas._libs.parsers, pandas._libs.json, torch._C, torch._C._fft, torch._C._linalg, torch._C._nested, torch._C._nn, torch._C._sparse, torch._C._special, PIL._imagingft, matplotlib._c_internal_utils, matplotlib._path, kiwisolver._cext, matplotlib._image, 
pydantic.typing, pydantic.errors, pydantic.version, pydantic.class_validators, pydantic.color, pydantic.datetime_parse, pydantic.validators, pydantic.types, pydantic.json, pydantic.env_settings, pydantic.tools, omni.mdl.pymdlsdk._pymdlsdk, osqp._osqp, multidict._multidict, yarl._quoting_c, aiohttp._helpers, aiohttp._http_writer, aiohttp._http_parser, aiohttp._websocket, cchardet._cchardet, _cffi_backend, frozenlist._frozenlist, scipy.io.matlab._mio_utils, scipy.io.matlab._streams, scipy.io.matlab._mio5_utils, xxhash._xxhash, embreex.rtcore, embreex.rtcore_scene, embreex.mesh_construction, lxml._elementpath, lxml.etree, shapely.lib, shapely._geos, shapely._geometry_helpers, numba.core.typeconv._typeconv, numba._helperlib, numba._dynfunc, numba._dispatcher, numba.core.runtime._nrt_python, numba.np.ufunc._internal, numba.experimental.jitclass._box (total: 253)
Aborted (core dumped)

balakumar-s commented 11 months ago

Can you comment out motion_gen.warmup() in your code? This crash happens when you change the batch size or planning mode after warmup.

cremebrule commented 11 months ago

Ahhh interesting -- just tried now, and it seems to be working! Huge thanks for the fix.

So when should motion_gen.warmup() be used? I assumed it was necessary all the time, but perhaps it should only be used for single env / single batches?

If I don't call warmup, can I still call different motion_gen methods, e.g.: plan_single and plan_batch and plan_goalset within the same runtime?

Huge thanks for the assistance!!

balakumar-s commented 11 months ago

motion_gen.warmup() can be used for single env or batch. But the batch size needs to be known before calling warmup().

cuRobo uses CUDA graphs to compute motions, so we currently cannot change the problem type or tensor shapes once the graphs are initialized. You can use cuRobo without CUDA graphs by passing use_cuda_graph=False when building the config, e.g. MotionGenConfig.load_from_robot_config(..., use_cuda_graph=False). This allows you to change the problem type and batch size between calls. However, motion generation compute time could be 10x slower.
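
To make the constraint above concrete, here is a minimal toy sketch in plain Python. It is not cuRobo or PyTorch code: the class ShapeLockedSolver and its methods are hypothetical names invented for illustration. It only mimics the behaviour described in this thread: once a graph has been captured for one input shape (via warmup or the first call), later calls with a different batch size are rejected, while a solver built with graphs disabled accepts any shape. The toy raises a Python exception where the real setup aborted the process.

```python
from typing import Optional, Sequence, Tuple


class ShapeLockedSolver:
    """Toy stand-in for a solver that records a fixed-shape compute graph.

    Hypothetical illustration only; this is not cuRobo or PyTorch code.
    """

    def __init__(self, use_graph: bool = True) -> None:
        self.use_graph = use_graph
        self.captured_shape: Optional[Tuple[int, int]] = None

    def warmup(self, batch: int, dof: int) -> None:
        # With graphs enabled, warmup fixes the shape every later call must match.
        if self.use_graph:
            self.captured_shape = (batch, dof)

    def solve(self, goals: Sequence[Sequence[float]]) -> int:
        shape = (len(goals), len(goals[0]))
        if self.use_graph:
            if self.captured_shape is None:
                # No warmup was done: the first call captures the shape lazily.
                self.captured_shape = shape
            elif shape != self.captured_shape:
                raise RuntimeError(
                    f"graph captured for {self.captured_shape}, got {shape}"
                )
        return shape[0]  # number of problems handled in this call


single = [[0.0] * 7]                    # one 7-DoF goal (values are placeholders)
batch = [[0.0] * 7 for _ in range(5)]   # five goals stacked together

graph_solver = ShapeLockedSolver(use_graph=True)
graph_solver.warmup(batch=1, dof=7)
graph_solver.solve(single)              # matches the warmed-up shape
try:
    graph_solver.solve(batch)           # batch size changed after warmup
except RuntimeError as err:
    print("rejected:", err)

eager_solver = ShapeLockedSolver(use_graph=False)
eager_solver.solve(single)
eager_solver.solve(batch)               # accepted: nothing was captured
```

In this toy model, the two ways out match the advice in the thread: warm up with the batch size you will actually plan with, or disable graphs and accept slower calls.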

NVlabs / curobo

Using MotionGen.plan_batch() crashes #94
