dcharatan / flowmap

Code for "FlowMap: High-Quality Camera Poses, Intrinsics, and Depth via Gradient Descent" by Cameron Smith*, David Charatan*, Ayush Tewari, and Vincent Sitzmann
https://cameronosmith.github.io/flowmap/
MIT License

Seg Fault in "Compute optical flow and tracks" #14

Open shankar-anantak opened 5 months ago

shankar-anantak commented 5 months ago

Hello,

I successfully ran the subsampling preprocessing script. However, when I run the overfit script:

python3 -m flowmap.overfit dataset=images dataset.images.root=/.../.../.../.../flowmap/frames

Computing RAFT flow: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18/18 [00:43<00:00,  2.41s/it]
Computing RAFT flow: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18/18 [00:43<00:00,  2.42s/it]
Using cache found in /home/dev/.cache/torch/hub/facebookresearch_co-tracker_v1.0
Segmentation fault (core dumped)

I get this segmentation fault and am unsure how to proceed. Any help would be greatly appreciated.

dcharatan commented 5 months ago

I haven't seen this happen before. What happens when you enable fault handler?

import faulthandler; faulthandler.enable()

Also, what environment are you running on?

shankar-anantak commented 5 months ago

(venv) dev@dev-9000:~/dev/Pipelines/flowmap$ python3
Python 3.11.0rc1 (main, Aug 12 2022, 10:02:14) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import faulthandler;
>>> faulthandler.enable()
>>>
>>>

There seems to be no issue with importing and enabling faulthandler.

As for my environment, I set up the venv exactly as described in the README:

(venv) dev@dev-9000:~/dev/Pipelines/flowmap$ pip list
Package                  Version
------------------------ -----------
aiohttp                  3.9.5
aiosignal                1.3.1
antlr4-python3-runtime   4.9.3
appdirs                  1.4.4
attrs                    23.2.0
beartype                 0.18.5
black                    24.4.2
certifi                  2024.2.2
charset-normalizer       3.3.2
click                    8.1.7
contourpy                1.2.1
cycler                   0.12.1
dacite                   1.8.1
docker-pycreds           0.4.0
einops                   0.8.0
filelock                 3.13.4
flow-vis-torch           0.1
fonttools                4.51.0
frozenlist               1.4.1
fsspec                   2024.3.1
gitdb                    4.0.11
GitPython                3.1.43
huggingface-hub          0.22.2
hydra-core               1.3.2
idna                     3.7
jaxtyping                0.2.28
Jinja2                   3.1.3
kiwisolver               1.4.5
lightning                2.2.3
lightning-utilities      0.11.2
MarkupSafe               2.1.5
matplotlib               3.8.4
mpmath                   1.3.0
multidict                6.0.5
mypy-extensions          1.0.0
networkx                 3.3
numpy                    1.26.4
nvidia-cublas-cu12       12.1.3.1
nvidia-cuda-cupti-cu12   12.1.105
nvidia-cuda-nvrtc-cu12   12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12        8.9.2.26
nvidia-cufft-cu12        11.0.2.54
nvidia-curand-cu12       10.3.2.106
nvidia-cusolver-cu12     11.4.5.107
nvidia-cusparse-cu12     12.1.0.106
nvidia-nccl-cu12         2.20.5
nvidia-nvjitlink-cu12    12.4.127
nvidia-nvtx-cu12         12.1.105
omegaconf                2.3.0
packaging                24.0
pathspec                 0.12.1
pillow                   10.3.0
pip                      22.0.2
platformdirs             4.2.1
plyfile                  1.0.3
protobuf                 4.25.3
psutil                   5.9.8
pyparsing                3.1.2
python-dateutil          2.9.0.post0
pytorch-lightning        2.2.3
PyYAML                   6.0.1
requests                 2.31.0
ruff                     0.4.2
safetensors              0.4.3
scipy                    1.13.0
sentry-sdk               2.0.1
setproctitle             1.3.3
setuptools               59.6.0
six                      1.16.0
smmap                    5.0.1
sympy                    1.12
timm                     0.9.16
torch                    2.3.0
torchaudio               2.3.0
torchmetrics             1.3.2
torchvision              0.18.0
tqdm                     4.66.2
triton                   2.3.0
typeguard                2.13.3
typing_extensions        4.11.0
urllib3                  2.2.1
wandb                    0.16.6
wheel                    0.43.0
yarl                     1.9.4

Your help is greatly appreciated

dcharatan commented 5 months ago

You'll have to add import faulthandler; faulthandler.enable() at the very top of flowmap/overfit.py so that it is enabled in the same process that runs the code. Once it is enabled there, more information about the segfault will be printed. Enabling it in a separate interactive Python session in the terminal has no effect on the overfit run.
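For reference, here is a minimal sketch of what that looks like, assuming the lines go at the very top of the entry-point module (flowmap/overfit.py in this thread) before any other import. The comments also mention two standard CPython alternatives that enable the same handler without editing the file.

```python
# Place these lines at the very top of the entry-point module, before any
# heavy imports, so the handler is installed in the process that crashes.
import faulthandler

# Installs handlers for SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL that
# dump the Python traceback of every thread to stderr.
faulthandler.enable()

# Alternatives that need no code change (standard CPython options):
#   python3 -X faulthandler -m flowmap.overfit ...
#   PYTHONFAULTHANDLER=1 python3 -m flowmap.overfit ...

print("faulthandler enabled:", faulthandler.is_enabled())
```

The handler only affects the interpreter it is enabled in, which is why enabling it in a separate interactive session prints nothing useful.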

BiaBibii commented 5 months ago

I also got the same error. This is the output with import faulthandler; faulthandler.enable() added to the code.

(venv_requirments) :~/thesis-nets/flowmap$ python3.11 -m flowmap.overfit dataset=images dataset.images.root=/home/thesis-nets/flowmap/birds_compact/birds_compact
rm: cannot remove 'outputs/local': No such file or directory
Precomputing optical flow.
Computing RAFT flow: 100%|████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00,  1.74s/it]
Computing RAFT flow: 100%|████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00,  1.43s/it]
Using cache found in /home/.cache/torch/hub/facebookresearch_co-tracker_v1.0
Fatal Python error: Segmentation fault

Thread 0x00007f4978dfd640 (most recent call first):
  File "/usr/lib/python3.11/threading.py", line 324 in wait
  File "/usr/lib/python3.11/threading.py", line 622 in wait
  File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/tqdm/_monitor.py", line 60 in run
  File "/usr/lib/python3.11/threading.py", line 1038 in _bootstrap_inner
  File "/usr/lib/python3.11/threading.py", line 995 in _bootstrap

Current thread 0x00007f4ace531000 (most recent call first):
  File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/torch/jit/_script.py", line 1399 in script
  File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/timm/layers/activations_me.py", line 60 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/timm/layers/create_act.py", line 8 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/timm/layers/classifier.py", line 14 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/timm/layers/__init__.py", line 7 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/timm/__init__.py", line 2 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1128 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1128 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "/home/e/torch/hub/facebookresearch_co-tracker_v1.0/cotracker/models/core/cotracker/blocks.py", line 12 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "/home/he/torch/hub/facebookresearch_co-tracker_v1.0/cotracker/models/core/cotracker/cotracker.py", line 11 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "/home/.cache/torch/hub/facebookresearch_co-tracker_v1.0/cotracker/predictor.py", line 11 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "/home/.cache/torch/hub/facebookresearch_co-tracker_v1.0/hubconf.py", line 17 in _make_cotracker_predictor
  File "/home/.cache/torch/hub/facebookresearch_co-tracker_v1.0/hubconf.py", line 32 in cotracker_w8
  File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/torch/hub.py", line 597 in _load_local
  File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/torch/hub.py", line 568 in load
  File "/home/thesis-nets/flowmap/flowmap/tracking/track_predictor_cotracker.py", line 23 in __init__
  File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/jaxtyping/_decorator.py", line 450 in wrapped_fn
  File "/home/thesis-nets/flowmap/flowmap/tracking/__init__.py", line 26 in get_track_predictor
  File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/jaxtyping/_decorator.py", line 450 in wrapped_fn
  File "/home/thesis-nets/flowmap/flowmap/tracking/__init__.py", line 87 in compute_tracks
  File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/jaxtyping/_decorator.py", line 450 in wrapped_fn
  File "/home/thesis-nets/flowmap/flowmap/overfit.py", line 68 in overfit
  File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/hydra/core/utils.py", line 186 in run_job
  File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/hydra/_internal/hydra.py", line 119 in run
  File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/hydra/_internal/utils.py", line 458 in <lambda>
  File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/hydra/_internal/utils.py", line 220 in run_and_report
  File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/hydra/_internal/utils.py", line 457 in _run_app
  File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/hydra/_internal/utils.py", line 394 in _run_hydra
  File "/home/thesis-nets/flowmap/venv_requirments/lib/python3.11/site-packages/hydra/main.py", line 94 in decorated_main
  File "/home/thesis-nets/flowmap/flowmap/overfit.py", line 157 in <module>
  File "<frozen runpy>", line 88 in _run_code
  File "<frozen runpy>", line 198 in _run_module_as_main

Extension modules: yaml._yaml, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, torch._C, torch._C._fft, torch._C._linalg, torch._C._nested, torch._C._nn, torch._C._sparse, torch._C._special, charset_normalizer.md, google._upb._message, psutil._psutil_linux, psutil._psutil_posix, matplotlib._c_internal_utils, PIL._imaging, matplotlib._path, kiwisolver._cext, matplotlib._image, PIL._imagingft, scipy._lib._ccallback_c, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg.cython_lapack, scipy.linalg._cythonized_array_utils, scipy.linalg._solve_toeplitz, scipy.linalg._decomp_lu_cython, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.linalg._propack._spropack, scipy.sparse.linalg._propack._dpropack, scipy.sparse.linalg._propack._cpropack, scipy.sparse.linalg._propack._zpropack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, scipy.spatial._ckdtree, scipy._lib.messagestream, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.special._ufuncs_cxx, scipy.special._cdflib, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, scipy.spatial.transform._rotation, PIL._imagingmath (total: 72)
Segmentation fault

shankar-anantak commented 5 months ago

BiaBibii wrote: "I also got the same error, this is the output with import faulthandler; faulthandler.enable() added in the code."

+1 here: I get a very similar stack trace with faulthandler enabled.
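The trace above ends in torch.jit.script, called while timm/layers/activations_me.py is being imported, so one way to narrow this down might be to check whether a bare import already crashes outside FlowMap. Below is a rough sketch of that check; the helper names run_isolated and describe are made up for illustration, and the guess that "import timm" alone reproduces the crash is an assumption based on the trace, not something confirmed in this thread.

```python
import signal
import subprocess
import sys


def run_isolated(statement: str, timeout: float = 300.0) -> subprocess.CompletedProcess:
    """Run one Python statement in a fresh interpreter with faulthandler on."""
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", statement],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def describe(returncode: int) -> str:
    """Turn a child return code into a short human-readable summary."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed by
        # the signal with that number (for example SIGSEGV).
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    # Assumption: these are the imports worth testing, based on the trace.
    for statement in ("import torch", "import timm"):
        result = run_isolated(statement)
        print(f"{statement!r}: {describe(result.returncode)}")
        if result.returncode != 0:
            print(result.stderr)
```

If "import timm" is reported as killed by SIGSEGV, the problem is likely in the installed torch/timm combination rather than in FlowMap or CoTracker themselves.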

dcharatan commented 5 months ago

This seems to be a problem with the CoTracker code from Torch Hub, but I'm not sure why it's segfaulting. Here are a few things to try:

BiaBibii commented 5 months ago

I tried with both requirements files and got the same problem. I will try the second point: directly cloning CoTracker or integrating it as a submodule to avoid using the Torch Hub version. Thanks.
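In case it helps anyone trying the local-clone route, here is a sketch of pointing torch.hub.load at a checkout on disk through its source="local" option. The entry-point name cotracker_w8 is taken from the hubconf.py frames in the trace above; the helper names and the clone path are hypothetical. Note that a local load would presumably still import timm through CoTracker's blocks.py, so this may not avoid the segfault by itself.

```python
from pathlib import Path


def hub_load_kwargs(local_clone: str, entry_point: str = "cotracker_w8") -> dict:
    """Build keyword arguments for torch.hub.load from a local checkout."""
    clone = Path(local_clone)
    # torch.hub needs a hubconf.py at the root of the repository directory.
    if not (clone / "hubconf.py").is_file():
        raise FileNotFoundError(f"no hubconf.py found in {clone}")
    return {"repo_or_dir": str(clone), "model": entry_point, "source": "local"}


def load_tracker(local_clone: str, entry_point: str = "cotracker_w8"):
    """Load the tracker from the local checkout instead of the hub cache."""
    import torch  # imported lazily so the helper above works without torch

    return torch.hub.load(**hub_load_kwargs(local_clone, entry_point))


# Hypothetical usage, assuming CoTracker was cloned next to the FlowMap code:
#   tracker = load_tracker("third_party/co-tracker")
```

Whether the entry point also downloads weights when loaded this way depends on CoTracker's hubconf.py, which is not shown in this thread.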

phongnhhn92 commented 3 months ago

A very quick fix for me is to change this line of code to the new version of CoTracker.

self.tracker = torch.hub.load("facebookresearch/co-tracker", "cotracker2")

dcharatan commented 3 months ago

@phongnhhn92 Be careful—we found that CoTracker v2 produces less accurate tracks than v1. This might impact performance more than you would like.

phongnhhn92 commented 3 months ago

@dcharatan Hmm, so I would need to manually download the CoTracker V1 weights here and load them to reproduce your results?

dcharatan commented 3 months ago

Yes, to reproduce the results, you'll need CoTracker V1's weights. It might be worth trying out the steps here to avoid having to manually download the weights.