invoke-ai / InvokeAI

Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry-leading WebUI and serves as the foundation for multiple commercial products.
https://invoke-ai.github.io/InvokeAI/
Apache License 2.0

[bug]: segmentation fault on startup [Python 3.11] #4041

Open keturn opened 1 year ago

keturn commented 1 year ago

Is there an existing issue for this?

OS

Linux

GPU

cuda

VRAM

12

What version did you experience this issue on?

bb9460d2781276d688d8da6287957826c2c05023

What happened?

I'm trying to get a development environment going with Python 3.11. Dependencies all installed successfully, but invokeai-web segfaults immediately.

faulthandler log

```
$ PYTHONFAULTHANDLER=True invokeai-web
Fatal Python error: Segmentation fault

Current thread 0x00007f13ff078000 (most recent call first):
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/torch/jit/_script.py", line 1345 in script
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/models/layers/activations_me.py", line 60 in
  File "", line 241 in _call_with_frames_removed
  File "", line 940 in exec_module
  File "", line 690 in _load_unlocked
  File "", line 1149 in _find_and_load_unlocked
  File "", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/models/layers/create_act.py", line 8 in
  File "", line 241 in _call_with_frames_removed
  File "", line 940 in exec_module
  File "", line 690 in _load_unlocked
  File "", line 1149 in _find_and_load_unlocked
  File "", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/models/layers/evo_norm.py", line 32 in
  File "", line 241 in _call_with_frames_removed
  File "", line 940 in exec_module
  File "", line 690 in _load_unlocked
  File "", line 1149 in _find_and_load_unlocked
  File "", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/models/layers/create_norm_act.py", line 12 in
  File "", line 241 in _call_with_frames_removed
  File "", line 940 in exec_module
  File "", line 690 in _load_unlocked
  File "", line 1149 in _find_and_load_unlocked
  File "", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/models/layers/conv_bn_act.py", line 9 in
  File "", line 241 in _call_with_frames_removed
  File "", line 940 in exec_module
  File "", line 690 in _load_unlocked
  File "", line 1149 in _find_and_load_unlocked
  File "", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/models/layers/__init__.py", line 10 in
  File "", line 241 in _call_with_frames_removed
  File "", line 940 in exec_module
  File "", line 690 in _load_unlocked
  File "", line 1149 in _find_and_load_unlocked
  File "", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/models/fx_features.py", line 18 in
  File "", line 241 in _call_with_frames_removed
  File "", line 940 in exec_module
  File "", line 690 in _load_unlocked
  File "", line 1149 in _find_and_load_unlocked
  File "", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/models/helpers.py", line 21 in
  File "", line 241 in _call_with_frames_removed
  File "", line 940 in exec_module
  File "", line 690 in _load_unlocked
  File "", line 1149 in _find_and_load_unlocked
  File "", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/models/beit.py", line 50 in
  File "", line 241 in _call_with_frames_removed
  File "", line 940 in exec_module
  File "", line 690 in _load_unlocked
  File "", line 1149 in _find_and_load_unlocked
  File "", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/models/__init__.py", line 1 in
  File "", line 241 in _call_with_frames_removed
  File "", line 940 in exec_module
  File "", line 690 in _load_unlocked
  File "", line 1149 in _find_and_load_unlocked
  File "", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/timm/__init__.py", line 2 in
  File "", line 241 in _call_with_frames_removed
  File "", line 940 in exec_module
  File "", line 690 in _load_unlocked
  File "", line 1149 in _find_and_load_unlocked
  File "", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/controlnet_aux/midas/midas/vit.py", line 3 in
  File "", line 241 in _call_with_frames_removed
  File "", line 940 in exec_module
  File "", line 690 in _load_unlocked
  File "", line 1149 in _find_and_load_unlocked
  File "", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/controlnet_aux/midas/midas/blocks.py", line 4 in
  File "", line 241 in _call_with_frames_removed
  File "", line 940 in exec_module
  File "", line 690 in _load_unlocked
  File "", line 1149 in _find_and_load_unlocked
  File "", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/controlnet_aux/midas/midas/dpt_depth.py", line 6 in
  File "", line 241 in _call_with_frames_removed
  File "", line 940 in exec_module
  File "", line 690 in _load_unlocked
  File "", line 1149 in _find_and_load_unlocked
  File "", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/controlnet_aux/midas/api.py", line 9 in
  File "", line 241 in _call_with_frames_removed
  File "", line 940 in exec_module
  File "", line 690 in _load_unlocked
  File "", line 1149 in _find_and_load_unlocked
  File "", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/controlnet_aux/midas/__init__.py", line 11 in
  File "", line 241 in _call_with_frames_removed
  File "", line 940 in exec_module
  File "", line 690 in _load_unlocked
  File "", line 1149 in _find_and_load_unlocked
  File "", line 1178 in _find_and_load
  File "src/InvokeAI/venv311/lib/python3.11/site-packages/controlnet_aux/__init__.py", line 7 in
  File "", line 241 in _call_with_frames_removed
  File "", line 940 in exec_module
  ...

Extension modules: pydantic.typing, pydantic.errors, pydantic.version, pydantic.utils, pydantic.class_validators, pydantic.config, pydantic.color, pydantic.datetime_parse, pydantic.validators, pydantic.networks, pydantic.types, pydantic.json, pydantic.error_wrappers, pydantic.fields, pydantic.parse, pydantic.schema, pydantic.main, pydantic.dataclasses, pydantic.annotated_types, pydantic.decorator, pydantic.env_settings, pydantic.tools, pydantic, yaml._yaml, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, charset_normalizer.md, torch._C, torch._C._fft, torch._C._linalg, torch._C._nested, torch._C._nn, torch._C._sparse, torch._C._special, PIL._imaging, regex._regex, scipy._lib._ccallback_c, scipy.special._ufuncs_cxx, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg.cython_lapack, scipy.linalg._cythonized_array_utils, scipy.linalg._solve_toeplitz, scipy.linalg._decomp_lu_cython, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.sparse.linalg._isolve._iterative, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, scipy.linalg._flinalg, scipy.special._ellip_harm_2, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.optimize._minpack2, scipy.optimize._group_columns, scipy._lib.messagestream, scipy.optimize._trlib._trlib, numpy.linalg.lapack_lite, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, scipy.optimize.__nnls, scipy.optimize._highs.cython.src._highs_wrapper, scipy.optimize._highs._highs_wrapper, scipy.optimize._highs.cython.src._highs_constants, scipy.optimize._highs._highs_constants, scipy.linalg._interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap, scipy.spatial._ckdtree, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.spatial.transform._rotation, scipy.optimize._direct, pywt._extensions._dwt, pywt._extensions._cwt, pywt._extensions._pywt, pywt._extensions._swt, scipy._lib._uarray._uarray, PIL._imagingft, psutil._psutil_linux, psutil._psutil_posix, skimage._shared.geometry, skimage.measure._find_contours_cy, skimage.measure._marching_cubes_lewiner_cy, scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, skimage.measure._moments_cy, scipy.signal._sigtools, scipy.signal._max_len_seq_inner, scipy.signal._upfirdn_apply, scipy.signal._spline, scipy.interpolate._fitpack, scipy.interpolate.dfitpack, scipy.interpolate._bspl, scipy.interpolate._ppoly, scipy.interpolate.interpnd, scipy.interpolate._rbfinterp_pythran, scipy.interpolate._rgi_cython, scipy.signal._sosfilt, scipy.signal._spectral, scipy.special.cython_special, scipy.stats._stats, scipy.stats.beta_ufunc, scipy.stats._boost.beta_ufunc, scipy.stats.binom_ufunc, scipy.stats._boost.binom_ufunc, scipy.stats.nbinom_ufunc, scipy.stats._boost.nbinom_ufunc, scipy.stats.hypergeom_ufunc, scipy.stats._boost.hypergeom_ufunc, scipy.stats.ncf_ufunc, scipy.stats._boost.ncf_ufunc, scipy.stats.ncx2_ufunc, scipy.stats._boost.ncx2_ufunc, scipy.stats.nct_ufunc, scipy.stats._boost.nct_ufunc, scipy.stats.skewnorm_ufunc, scipy.stats._boost.skewnorm_ufunc, scipy.stats.invgauss_ufunc, scipy.stats._boost.invgauss_ufunc, scipy.stats._biasedurn, scipy.stats._levy_stable.levyst, scipy.stats._stats_pythran, scipy.stats._statlib, scipy.stats._sobol, scipy.stats._qmc_cy, scipy.stats._mvn, scipy.stats._rcont.rcont, scipy.signal._peak_finding_utils, skimage.measure._pnpoly, skimage.measure._ccomp (total: 169)
Segmentation fault (core dumped)
```
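For anyone who wants to capture a similar dump, here is a minimal sketch. The helper name `run_with_faulthandler` and the `FAULT_LOG` variable are made up for illustration; the only real mechanism used is Python's `PYTHONFAULTHANDLER` environment variable, which enables faulthandler at startup when set to a non-empty value.

```shell
# Hypothetical helper: run a command with faulthandler enabled and keep a copy
# of its output. The default log name is an arbitrary choice for this sketch.
run_with_faulthandler() {
    log="${FAULT_LOG:-segfault.txt}"
    # faulthandler writes its dump to stderr, so merge stderr into the pipe.
    PYTHONFAULTHANDLER=1 "$@" 2>&1 | tee "$log"
}

# Usage (assumes the InvokeAI virtual environment is active):
#   run_with_faulthandler invokeai-web
```

The final "Segmentation fault (core dumped)" line is printed by the calling shell rather than by Python, so it may not end up in the log file.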

segfault.txt

Additional context

Using python3.11 package on Ubuntu 22.04.2 LTS.

keturn commented 1 year ago

I see timm in there, and our dependencies pin an old version of it, but upgrading it to the latest release didn't help.

The top line is in torch.jit, so I guess something is messed up with my torch installation?

keturn commented 1 year ago

If that line number can be trusted, it's falling over when torch.jit.script tries to copy over the docstring reference?

https://github.com/pytorch/pytorch/blob/e9ebda29d87ce0916ab08c06ab26fd3766a870e5/torch/jit/_script.py#L1345

That is, uh, not something I expected.

keturn commented 1 year ago

Per https://pytorch.org/docs/stable/jit.html#disable-jit-for-debugging, setting PYTORCH_JIT=0 to disable the JIT allows the process to start, but it's obviously not a fix.
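As a rough sketch of that diagnostic, the variable can be set for a single launch so the rest of the shell session is unaffected. The wrapper name `run_without_jit` is hypothetical; `PYTORCH_JIT=0` is the switch described in the PyTorch docs linked above.

```shell
# Hypothetical wrapper: launch one command with TorchScript compilation
# disabled. This only helps confirm whether the crash is JIT-related.
run_without_jit() {
    PYTORCH_JIT=0 "$@"
}

# Usage (assumes invokeai-web is on PATH in the active venv):
#   run_without_jit invokeai-web
```

If the process starts this way but crashes without the variable, the fault is at least reached through the JIT scripting path, which narrows things down without fixing anything.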

keturn commented 1 year ago

    Thread 1 "python" received signal SIGSEGV, Segmentation fault.
    0x00000000005266a0 in _PyDictKeys_StringLookup (dk=0x0, key='__doc__') at ../Objects/dictobject.c:1011
    1011    ../Objects/dictobject.c: No such file or directory.
    (gdb) bt
    #0  0x00000000005266a0 in _PyDictKeys_StringLookup (dk=0x0, key='__doc__') at ../Objects/dictobject.c:1011
    #1  0x0000000000504e03 in specialize_dict_access (kind=<optimized out>, base_op=95, hint_op=159, values_op=154, name=<optimized out>, type=0x448b4e0, instr=0x4cd1ea6, owner=<torch._C.ScriptFunction at remote 0x7fff435acad0>)
        at ../Python/specialize.c:625
    #2  _Py_Specialize_StoreAttr (name=<optimized out>, instr=0x4cd1ea6, owner=<torch._C.ScriptFunction at remote 0x7fff435acad0>) at ../Python/specialize.c:813

That dk=0x0: a null got passed in as the DictKeys object? How does this even happen?
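A backtrace like this one can be collected non-interactively. The sketch below is an assumption about how one might do it, not the exact commands used in this comment: the helper name `gdb_backtrace` and the `DRY_RUN` switch are invented here, and `python` is assumed to be the interpreter of the active virtual environment.

```shell
# Hypothetical helper: run a Python console script under gdb until it crashes,
# then print a backtrace. With DRY_RUN set, only print the gdb command line.
gdb_backtrace() {
    script_path="$(command -v "$1")" || { echo "$1: not found" >&2; return 1; }
    shift
    # Console scripts are text files with a shebang, so gdb is pointed at the
    # interpreter and the script is passed as its first argument.
    set -- gdb -batch -ex run -ex bt --args python "$script_path" "$@"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage:
#   gdb_backtrace invokeai-web
```

Without debug symbols for the interpreter, the frames will likely show only addresses rather than function arguments like `dk=0x0`.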

keturn commented 1 year ago

Building Python 3.11.4 from source (using pyenv) instead of using the python3.11 package in Ubuntu LTS seems to have fixed things.

So I guess this is not-a-bug?

But maybe we have to explain to people that Python 3.11 works unless you're using the Ubuntu LTS package? Ugh.
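A minimal sketch of that workaround, assuming pyenv is already installed and its build dependencies are present. The function name is hypothetical, the version default comes from the comment above, and the `venv311` directory name is borrowed from the paths in the traceback.

```shell
# Hypothetical helper: build a Python with pyenv and create a venv from it,
# bypassing the distro-packaged interpreter.
rebuild_venv_with_pyenv() {
    version="${1:-3.11.4}"
    venv_dir="${2:-venv311}"
    command -v pyenv >/dev/null 2>&1 || { echo "pyenv: not found" >&2; return 1; }
    # --skip-existing avoids rebuilding if this version is already installed.
    pyenv install --skip-existing "$version" || return 1
    # Call the freshly built interpreter by path instead of relying on shims.
    "$(pyenv prefix "$version")/bin/python" -m venv "$venv_dir" || return 1
    "$venv_dir/bin/python" -c 'import sys; print(sys.version)'
}

# Usage:
#   rebuild_venv_with_pyenv 3.11.4 venv311
```

The InvokeAI dependencies would then need to be reinstalled into the new environment.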

psychedelicious commented 1 year ago

Yikes. @Millu, let's add a warning in the docs about potential Python 3.11 issues on Ubuntu LTS (22.04).

Here's a recipe from @gogurtenjoyer to build python on linux: https://discord.com/channels/1020123559063990373/1049495067846524939/1134255238963011644

Is that the process you followed, @keturn?

keturn commented 1 year ago

No, I used https://github.com/pyenv/pyenv

Millu commented 1 year ago

Seems like this is happening on Python 3.10 too 😬

See #3967

keturn commented 1 year ago

Both are segfaults, but in very different places. This one was at the very start of process launch, long before any image generation could be attempted.

JohnDevlopment commented 11 months ago

If I can add to this:

./invoke.sh: line 54: 39533 Segmentation fault      (core dumped) invokeai-web $PARAMS

SpecificProtagonist commented 11 months ago

> ./invoke.sh: line 54: 39533 Segmentation fault      (core dumped) invokeai-web $PARAMS

This also happens on my system (Manjaro), but it might be a different issue, because setting PYTORCH_JIT=0 does not fix it for me.

arigbs commented 6 months ago

Segmentation fault with fresh install of invoke 4 on Manjaro Linux: invoke.sh: line 37: 29423 Segmentation fault (core dumped) invokeai-web $PARAMS

No ideas how to debug this!

psychedelicious commented 6 months ago

This seems to depend on the Python version installed. You can try installing the latest Python using pyenv or building a fresh Python yourself.

arigbs commented 6 months ago

> Segmentation fault with fresh install of invoke 4 on Manjaro Linux: invoke.sh: line 37: 29423 Segmentation fault (core dumped) invokeai-web $PARAMS
>
> No ideas how to debug this!

It turns out that in my case it was a patchmatch issue. I recalled trying to fix a recurrent patchmatch warning by following the steps in the repo on how to stop that warning. So I disabled patchmatch in the invokeai.yaml file, and I'm no longer getting the segmentation fault; the web UI loads.

psychedelicious commented 6 months ago

@arigbs, that's a good catch. Some of the users who hit this error had successfully compiled patchmatch. You'll see whether it loaded in the startup logs.

daleglass commented 4 months ago

Same problem here. Fedora 40, nvidia, Python 3.11, v4.2.2post1

Crashes on startup. Disabling patchmatch in the config fixes it.
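The exact config key is not quoted anywhere in this thread, so the sketch below only inspects the file rather than editing it. It assumes the setting shows up as a line containing `patchmatch` in invokeai.yaml; the helper name is made up, and the key name should be checked against the configuration docs for your InvokeAI version.

```shell
# Hypothetical helper: show any patchmatch-related lines in an InvokeAI config.
# ASSUMPTION: the setting is stored under a key containing "patchmatch".
show_patchmatch_setting() {
    config="${1:-invokeai.yaml}"
    [ -f "$config" ] || { echo "$config: not found" >&2; return 1; }
    grep -n 'patchmatch' "$config" || echo "no patchmatch entry in $config"
}

# Usage:
#   show_patchmatch_setting /path/to/invokeai.yaml
```

If no entry is found, the setting is presumably at its default, and disabling it would mean adding the key by hand.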

heloess commented 3 months ago

Delete python3.11 completely:

sudo apt-get remove python3.11-venv 
sudo apt list --installed | grep python3.11
sudo apt-get purge python3.11
sudo apt-get autoremove
sudo rm -rf /usr/local/lib/python3.11
sudo rm -rf /usr/local/bin/python3.11
sudo apt-get clean
sudo apt-get autoclean

Then install python3.10-venv:

sudo apt install git python3.10-venv -y

It worked for me.