triton-lang / triton

Development repository for the Triton language and compiler
https://triton-lang.org/
MIT License

Fatal Python error: Aborted #2936

Open RuiWang1998 opened 10 months ago

RuiWang1998 commented 10 months ago

Hi,

I was trying to test https://github.com/NVIDIA/apex/tree/master/apex/contrib/openfold_triton with Triton, but I hit the error below and cannot find a solution anywhere. I would appreciate any pointers on what I might be doing wrong.

Fatal Python error: Aborted

Current thread 0x00007f7325dfa280 (most recent call first):
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/triton/compiler.py", line 1006 in ttgir_to_llir
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/triton/compiler.py", line 1554 in <lambda>
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/triton/compiler.py", line 1621 in compile
  File "<string>", line 41 in _attention_core
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/triton/runtime/autotuner.py", line 199 in run
  File "/home/rui/code/test-fold-dev/test-fold/ops/attention/triton/mha_fwd.py", line 476 in attn_core_fwd
  File "/home/rui/code/test-fold-dev/tests/unit/ops/attention/flash_attention_test.py", line 18 in test_flash_attention_fwd
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/python.py", line 194 in pytest_pyfunc_call
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_callers.py", line 77 in _multicall
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_manager.py", line 115 in _hookexec
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_hooks.py", line 493 in __call__
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/python.py", line 1792 in runtest
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/runner.py", line 169 in pytest_runtest_call
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_callers.py", line 77 in _multicall
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_manager.py", line 115 in _hookexec
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_hooks.py", line 493 in __call__
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/runner.py", line 262 in <lambda>
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/runner.py", line 341 in from_call
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/runner.py", line 261 in call_runtest_hook
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/runner.py", line 222 in call_and_report
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/runner.py", line 133 in runtestprotocol
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/runner.py", line 114 in pytest_runtest_protocol
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_callers.py", line 77 in _multicall
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_manager.py", line 115 in _hookexec
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_hooks.py", line 493 in __call__
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/main.py", line 350 in pytest_runtestloop
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_callers.py", line 77 in _multicall
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_manager.py", line 115 in _hookexec
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_hooks.py", line 493 in __call__
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/main.py", line 325 in _main
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/main.py", line 271 in wrap_session
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/main.py", line 318 in pytest_cmdline_main
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_callers.py", line 77 in _multicall
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_manager.py", line 115 in _hookexec
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_hooks.py", line 493 in __call__
  File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/config/__init__.py", line 169 in main
  File "/home/rui/.pycharm_helpers/pycharm/_jb_pytest_runner.py", line 60 in <module>

Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, matplotlib._c_internal_utils, PIL._imaging, matplotlib._path, kiwisolver._cext, torch._C, torch._C._fft, torch._C._linalg, torch._C._nested, torch._C._nn, torch._C._sparse, torch._C._special, psutil._psutil_linux, psutil._psutil_posix, lmdb.cpython, Bio.PDB.ccealign, Bio.SeqIO._twoBitIO, yaml._yaml, pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.dtypes, pandas._libs.tslibs.base, pandas._libs.tslibs.nattype, pandas._libs.tslibs.timezones, pandas._libs.tslibs.ccalendar, pandas._libs.tslibs.fields, pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.tzconversion, pandas._libs.tslibs.timestamps, pandas._libs.properties, pandas._libs.tslibs.offsets, pandas._libs.tslibs.strptime, pandas._libs.tslibs.parsing, pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, pandas._libs.interval, pandas._libs.lib, pandas._libs.ops, pandas._libs.arrays, pandas._libs.tslib, pandas._libs.sparse, pandas._libs.indexing, pandas._libs.index, pandas._libs.internals, pandas._libs.join, pandas._libs.writers, pandas._libs.window.aggregations, pandas._libs.window.indexers, pandas._libs.reshape, pandas._libs.groupby, pandas._libs.json, pandas._libs.parsers, pandas._libs.testing, PIL._imagingft, charset_normalizer.md, matplotlib._image, google._upb._message, scipy._lib._ccallback_c, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.sparse.linalg._isolve._iterative, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg.cython_lapack, scipy.linalg._cythonized_array_utils, 
scipy.linalg._solve_toeplitz, scipy.linalg._decomp_lu_cython, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.linalg._flinalg, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, scipy.spatial._ckdtree, scipy._lib.messagestream, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.special._ufuncs_cxx, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, scipy.spatial.transform._rotation, scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, scipy.optimize._minpack2, scipy.optimize._group_columns, scipy.optimize._trlib._trlib, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, scipy.optimize.__nnls, scipy.optimize._highs.cython.src._highs_wrapper, scipy.optimize._highs._highs_wrapper, scipy.optimize._highs.cython.src._highs_constants, scipy.optimize._highs._highs_constants, scipy.linalg._interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap, scipy.optimize._direct, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.special.cython_special, scipy.stats._stats, scipy.stats.beta_ufunc, scipy.stats._boost.beta_ufunc, scipy.stats.binom_ufunc, scipy.stats._boost.binom_ufunc, scipy.stats.nbinom_ufunc, scipy.stats._boost.nbinom_ufunc, scipy.stats.hypergeom_ufunc, scipy.stats._boost.hypergeom_ufunc, scipy.stats.ncf_ufunc, scipy.stats._boost.ncf_ufunc, scipy.stats.ncx2_ufunc, scipy.stats._boost.ncx2_ufunc, 
scipy.stats.nct_ufunc, scipy.stats._boost.nct_ufunc, scipy.stats.skewnorm_ufunc, scipy.stats._boost.skewnorm_ufunc, scipy.stats.invgauss_ufunc, scipy.stats._boost.invgauss_ufunc, scipy.interpolate._fitpack, scipy.interpolate.dfitpack, scipy.interpolate._bspl, scipy.interpolate._ppoly, scipy.interpolate.interpnd, scipy.interpolate._rbfinterp_pythran, scipy.interpolate._rgi_cython, scipy.stats._biasedurn, scipy.stats._levy_stable.levyst, scipy.stats._stats_pythran, scipy._lib._uarray._uarray, scipy.stats._statlib, scipy.stats._sobol, scipy.stats._qmc_cy, scipy.stats._mvn, scipy.stats._rcont.rcont (total: 174)

Process finished with exit code 134 (interrupted by signal 6:SIGABRT)
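For readers unfamiliar with the exit code above: shells and most runners report a process killed by signal N as exit code 128 + N, so 134 corresponds to signal 6 (SIGABRT). A minimal sketch of that arithmetic, where `decode_exit_code` is a hypothetical helper name and not part of Triton or pytest:

```python
import signal
from typing import Optional


def decode_exit_code(code: int) -> Optional[signal.Signals]:
    """Map a shell-style exit code (128 + N) back to the signal N, if any."""
    if code <= 128:
        return None  # ordinary exit status, not a signal
    try:
        return signal.Signals(code - 128)
    except ValueError:
        return None  # no signal with that number on this platform


if __name__ == "__main__":
    sig = decode_exit_code(134)
    print(sig.name if sig is not None else "not a signal")
```

This only explains how the process ended. The abort itself is raised inside the compiler call shown at the top of the traceback (ttgir_to_llir).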

My environment is just pip install torch==2.0.1 with CUDA 11.7. The test input is the basic one, with query shape [256, 4, 256, 16].

I have run into this error before and could work around it by changing how the grid is defined: with a 1D grid it works fine, but with a 2D grid it does not. That workaround is getting complicated for the attention kernels, though, so I was wondering what the correct fix would be.
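The 1D-grid workaround described above amounts to launching one flat axis of programs and recovering the two logical indices by division and remainder. A minimal sketch of that index arithmetic in plain Python, so it runs without a GPU; `flatten_grid` and `unflatten_pid` are hypothetical names, and the row-major ordering is an assumption rather than anything taken from the openfold_triton kernels:

```python
from __future__ import annotations


def flatten_grid(num_m: int, num_n: int) -> tuple[int]:
    """Return a 1D launch grid that covers a num_m x num_n 2D grid."""
    return (num_m * num_n,)


def unflatten_pid(pid: int, num_n: int) -> tuple[int, int]:
    """Recover (pid_m, pid_n) from a flat program id, row-major order."""
    return divmod(pid, num_n)


# Inside a Triton kernel the same arithmetic would be applied to the
# value of tl.program_id(0): pid_m = pid // num_n, pid_n = pid % num_n.
if __name__ == "__main__":
    num_m, num_n = 4, 3
    (total,) = flatten_grid(num_m, num_n)
    pairs = [unflatten_pid(pid, num_n) for pid in range(total)]
    # Every (m, n) pair of the original 2D grid is visited exactly once.
    assert pairs == [(m, n) for m in range(num_m) for n in range(num_n)]
```

This only illustrates the workaround. It does not explain or fix the abort in the compiler.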

Best.

jlebar commented 10 months ago

Hi, sorry to hear you're encountering an issue.

Please try with Triton built from head.

If that still does not work, please attach steps to reproduce, and someone might be able to have a look.
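Reproduction steps are easier to act on when they include the exact package versions in use. A small sketch that collects them with the standard library only; `environment_report` is a hypothetical helper, and the default package list is an assumption based on the packages mentioned in this thread:

```python
import platform
from importlib import metadata


def package_version(name: str) -> str:
    """Return the installed version of a distribution, or a placeholder."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def environment_report(packages=("torch", "triton")) -> dict:
    """Collect interpreter, platform, and package versions for a bug report."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in packages:
        report[name] = package_version(name)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Pasting this output next to the smallest script that still triggers the abort gives maintainers what they need to try it themselves.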