Open shankar-anantak opened 6 months ago
I haven't seen this happen before. What happens when you enable fault handler?
import faulthandler; faulthandler.enable()
Also, what environment are you running on?
(venv) dev@dev-9000:~/dev/Pipelines/flowmap$ python3
Python 3.11.0rc1 (main, Aug 12 2022, 10:02:14) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import faulthandler;
>>> faulthandler.enable()
>>>
>>>
Importing and enabling faulthandler in the interpreter works without any issue.
Regarding my environment, I set up the venv exactly as described in the README:
(venv) dev@dev-9000:~/dev/Pipelines/flowmap$ pip list
Package Version
------------------------ -----------
aiohttp 3.9.5
aiosignal 1.3.1
antlr4-python3-runtime 4.9.3
appdirs 1.4.4
attrs 23.2.0
beartype 0.18.5
black 24.4.2
certifi 2024.2.2
charset-normalizer 3.3.2
click 8.1.7
contourpy 1.2.1
cycler 0.12.1
dacite 1.8.1
docker-pycreds 0.4.0
einops 0.8.0
filelock 3.13.4
flow-vis-torch 0.1
fonttools 4.51.0
frozenlist 1.4.1
fsspec 2024.3.1
gitdb 4.0.11
GitPython 3.1.43
huggingface-hub 0.22.2
hydra-core 1.3.2
idna 3.7
jaxtyping 0.2.28
Jinja2 3.1.3
kiwisolver 1.4.5
lightning 2.2.3
lightning-utilities 0.11.2
MarkupSafe 2.1.5
matplotlib 3.8.4
mpmath 1.3.0
multidict 6.0.5
mypy-extensions 1.0.0
networkx 3.3
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.4.127
nvidia-nvtx-cu12 12.1.105
omegaconf 2.3.0
packaging 24.0
pathspec 0.12.1
pillow 10.3.0
pip 22.0.2
platformdirs 4.2.1
plyfile 1.0.3
protobuf 4.25.3
psutil 5.9.8
pyparsing 3.1.2
python-dateutil 2.9.0.post0
pytorch-lightning 2.2.3
PyYAML 6.0.1
requests 2.31.0
ruff 0.4.2
safetensors 0.4.3
scipy 1.13.0
sentry-sdk 2.0.1
setproctitle 1.3.3
setuptools 59.6.0
six 1.16.0
smmap 5.0.1
sympy 1.12
timm 0.9.16
torch 2.3.0
torchaudio 2.3.0
torchmetrics 1.3.2
torchvision 0.18.0
tqdm 4.66.2
triton 2.3.0
typeguard 2.13.3
typing_extensions 4.11.0
urllib3 2.2.1
wandb 0.16.6
wheel 0.43.0
yarl 1.9.4
Your help is greatly appreciated
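For anyone else reporting this, a shorter environment summary than the full pip list may be easier to compare. The following is a minimal sketch using only the standard library; the function name environment_report and the choice of packages to query are illustrative assumptions, not part of flowmap.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages):
    """Return the interpreter version, platform, and installed versions of `packages`."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in the active environment.
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    # The package names below are the ones that appear in the crash trace.
    for key, value in environment_report(["torch", "timm", "torchvision"]).items():
        print(f"{key}: {value}")
```

Running this inside the activated venv prints one line per entry, which can be pasted into the issue.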
You'll have to add import faulthandler; faulthandler.enable() at the very top of flowmap/overfit.py so that it gets enabled when running the code. If you do this, more information about the segfault will be printed. Running it inside the Python interpreter in the terminal won't do anything.
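As a minimal sketch of that suggestion, the snippet below enables faulthandler before anything else is imported. The helper name enable_crash_traces and the optional log file are assumptions added for illustration; the plain faulthandler.enable() call is all the advice above requires. If editing the file is inconvenient, running with python -X faulthandler -m flowmap.overfit ... or setting PYTHONFAULTHANDLER=1 should have the same effect.

```python
import faulthandler


def enable_crash_traces(log_path=None):
    """Enable faulthandler; optionally write dumps to a file instead of stderr.

    Returns the open log file (or None) so the caller can keep it alive;
    faulthandler needs the file to stay open for as long as it is enabled.
    """
    if log_path is None:
        faulthandler.enable()  # tracebacks go to sys.stderr
        return None
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


if __name__ == "__main__":
    # Call this at the very top of the entry script, before the heavy imports.
    enable_crash_traces()
    print("faulthandler enabled:", faulthandler.is_enabled())
```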
I also got the same error. This is the output with import faulthandler; faulthandler.enable() added in the code:
(venv_requirments) :~/thesis-nets/flowmap$ python3.11 -m flowmap.overfit dataset=images dataset.images.root=/home/thesis-nets/flowmap/birds_compact/birds_compact
rm: cannot remove 'outputs/local': No such file or directory
Precomputing optical flow.
Computing RAFT flow: 100%|████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00, 1.74s/it]
Computing RAFT flow: 100%|████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00, 1.43s/it]
Using cache found in /home/.cache/torch/hub/facebookresearch_co-tracker_v1.0
Fatal Python error: Segmentation fault
Thread 0x00007f4978dfd640 (most recent call first):
File "/usr/lib/python3.11/threading.py", line 324 in wait
File "/usr/lib/python3.11/threading.py", line 622 in wait
File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/tqdm/_monitor.py", line 60 in run
File "/usr/lib/python3.11/threading.py", line 1038 in _bootstrap_inner
File "/usr/lib/python3.11/threading.py", line 995 in _bootstrap
Current thread 0x00007f4ace531000 (most recent call first):
File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/torch/jit/_script.py", line 1399 in script
File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/timm/layers/activations_me.py", line 60 in <module>
File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
File "<frozen importlib._bootstrap_external>", line 940 in exec_module
File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/timm/layers/create_act.py", line 8 in <module>
File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
File "<frozen importlib._bootstrap_external>", line 940 in exec_module
File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/timm/layers/classifier.py", line 14 in <module>
File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
File "<frozen importlib._bootstrap_external>", line 940 in exec_module
File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/timm/layers/__init__.py", line 7 in <module>
File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
File "<frozen importlib._bootstrap_external>", line 940 in exec_module
File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/timm/__init__.py", line 2 in <module>
File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
File "<frozen importlib._bootstrap_external>", line 940 in exec_module
File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
File "<frozen importlib._bootstrap>", line 1128 in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
File "<frozen importlib._bootstrap>", line 1128 in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
File "/home/e/torch/hub/facebookresearch_co-tracker_v1.0/cotracker/models/core/cotracker/blocks.py", line 12 in <module>
File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
File "<frozen importlib._bootstrap_external>", line 940 in exec_module
File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
File "/home/he/torch/hub/facebookresearch_co-tracker_v1.0/cotracker/models/core/cotracker/cotracker.py", line 11 in <module>
File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
File "<frozen importlib._bootstrap_external>", line 940 in exec_module
File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
File "/home/.cache/torch/hub/facebookresearch_co-tracker_v1.0/cotracker/predictor.py", line 11 in <module>
File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
File "<frozen importlib._bootstrap_external>", line 940 in exec_module
File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
File "/home/.cache/torch/hub/facebookresearch_co-tracker_v1.0/hubconf.py", line 17 in _make_cotracker_predictor
File "/home/.cache/torch/hub/facebookresearch_co-tracker_v1.0/hubconf.py", line 32 in cotracker_w8
File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/torch/hub.py", line 597 in _load_local
File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/torch/hub.py", line 568 in load
File "/home/thesis-nets/flowmap/flowmap/tracking/track_predictor_cotracker.py", line 23 in __init__
File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/jaxtyping/_decorator.py", line 450 in wrapped_fn
File "/home/thesis-nets/flowmap/flowmap/tracking/__init__.py", line 26 in get_track_predictor
File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/jaxtyping/_decorator.py", line 450 in wrapped_fn
File "/home/thesis-nets/flowmap/flowmap/tracking/__init__.py", line 87 in compute_tracks
File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/jaxtyping/_decorator.py", line 450 in wrapped_fn
File "/home/thesis-nets/flowmap/flowmap/overfit.py", line 68 in overfit
File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/hydra/core/utils.py", line 186 in run_job
File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/hydra/_internal/hydra.py", line 119 in run
File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/hydra/_internal/utils.py", line 458 in <lambda>
File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/hydra/_internal/utils.py", line 220 in run_and_report
File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/hydra/_internal/utils.py", line 457 in _run_app
File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/hydra/_internal/utils.py", line 394 in _run_hydra
File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/hydra/main.py", line 94 in decorated_main
File "/home/thesis-nets/flowmap/flowmap/overfit.py", line 157 in <module>
File "<frozen runpy>", line 88 in _run_code
File "<frozen runpy>", line 198 in _run_module_as_main
Extension modules: yaml._yaml, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, torch._C, torch._C._fft, torch._C._linalg, torch._C._nested, torch._C._nn, torch._C._sparse, torch._C._special, charset_normalizer.md, google._upb._message, psutil._psutil_linux, psutil._psutil_posix, matplotlib._c_internal_utils, PIL._imaging, matplotlib._path, kiwisolver._cext, matplotlib._image, PIL._imagingft, scipy._lib._ccallback_c, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg.cython_lapack, scipy.linalg._cythonized_array_utils, scipy.linalg._solve_toeplitz, scipy.linalg._decomp_lu_cython, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.linalg._propack._spropack, scipy.sparse.linalg._propack._dpropack, scipy.sparse.linalg._propack._cpropack, scipy.sparse.linalg._propack._zpropack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, scipy.spatial._ckdtree, scipy._lib.messagestream, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.special._ufuncs_cxx, scipy.special._cdflib, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, scipy.spatial.transform._rotation, PIL._imagingmath (total: 72)
Segmentation fault
+1 here, very similar stack trace w/ faulthandler
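The trace above shows the crash on the current thread inside torch.jit.script, reached while timm is being imported (timm/layers/activations_me.py). One way to check whether importing timm alone reproduces the segfault, without running the whole pipeline, is to try the import in a child process. This is only a diagnostic sketch for POSIX systems; the helper names are hypothetical.

```python
import signal
import subprocess
import sys


def import_exit_status(module_name):
    """Import `module_name` in a fresh interpreter and return its exit status."""
    result = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # faulthandler writes its dump to the child's stderr.
        print(result.stderr)
    return result.returncode


def died_from_segfault(returncode):
    """On POSIX, a negative return code is the number of the signal that killed the child."""
    return returncode == -signal.SIGSEGV


if __name__ == "__main__":
    status = import_exit_status("timm")
    print("segfault on import:", died_from_segfault(status))
```

If the import crashes on its own, the problem is likely in the installed torch/timm combination rather than in flowmap itself.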
This seems to be a problem with the CoTracker code from Torch Hub, but I'm not sure why it's segfaulting. Here are a few things to try:
1. Use requirements_exact.txt to exactly match an environment that's known to work.
2. Directly clone CoTracker or integrate it as a submodule to avoid using the Torch Hub version.
I tried both requirements files and got the same problem. I will try the second point: directly cloning CoTracker or integrating it as a submodule to avoid using the Torch Hub version. Thanks.
A very quick fix for me is to change this line of code to the new version of CoTracker.
self.tracker = torch.hub.load("facebookresearch/co-tracker", "cotracker2")
@phongnhhn92 Be careful—we found that CoTracker v2 produces less accurate tracks than v1. This might impact performance more than you would like.
@dcharatan Hmm, so I would need to manually download the V1 CoTracker weights here and load them to reproduce your results?
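To make it easier to switch between the v1 entry point seen in the trace (cotracker_w8) and cotracker2, the Torch Hub call could be wrapped as sketched below. The assumptions here are loud ones: the v1.0 tag is inferred from the cache directory name facebookresearch_co-tracker_v1.0, the helper names are hypothetical, and nothing in this thread confirms that either option avoids the segfault.

```python
from typing import Optional


def hub_spec(repo: str, ref: Optional[str] = None) -> str:
    """Build a Torch Hub 'owner/name[:ref]' string; `ref` is a branch or tag."""
    return f"{repo}:{ref}" if ref else repo


def load_cotracker(entrypoint: str = "cotracker_w8", ref: Optional[str] = "v1.0"):
    """Load a CoTracker predictor from Torch Hub.

    Assumption: `cotracker_w8` exists at the `v1.0` tag (as the trace suggests)
    and `cotracker2` exists on the default branch (as in the comment above).
    """
    import torch  # imported lazily so the helper above works without torch

    return torch.hub.load(hub_spec("facebookresearch/co-tracker", ref), entrypoint)


if __name__ == "__main__":
    print(hub_spec("facebookresearch/co-tracker", "v1.0"))
    # v1 tracker: load_cotracker("cotracker_w8", "v1.0")
    # v2 tracker: load_cotracker("cotracker2", None)
```

For a local clone, torch.hub.load also accepts a directory path together with source="local", which matches the suggestion to avoid the Torch Hub download.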
Hello,
I successfully ran the subsampling preprocessing script. However, when I run the overfit script:
python3 -m flowmap.overfit dataset=images dataset.images.root=/.../.../.../.../flowmap/frames
I get a segmentation fault and am unsure how to proceed. Any help would be greatly appreciated.