med-air / Endo-FM

[MICCAI'23] Foundation Model for Endoscopy Video Analysis via Large-scale Self-supervised Pre-train
Apache License 2.0

Downstream KUMC test compile issue #21

Open zkysss11235 opened 6 days ago

zkysss11235 commented 6 days ago

I followed the guide in the README and compiled STFT on an RTX 4090. It compiled successfully, but when I run the fine-tuning, it outputs the following error:

Traceback (most recent call last):
  File "tools/train_net.py", line 16, in <module>
    from stft_core.engine.inference import inference
  File "/home/kyz/repos/Endo-FM/STFT/stft_core/engine/inference.py", line 9, in <module>
    from stft_core.data.datasets.evaluation import evaluate
  File "/home/kyz/repos/Endo-FM/STFT/stft_core/data/datasets/evaluation/__init__.py", line 3, in <module>
    from .vid import vid_evaluation
  File "/home/kyz/repos/Endo-FM/STFT/stft_core/data/datasets/evaluation/vid/__init__.py", line 7, in <module>
    from .vid_eval import do_vid_evaluation
  File "/home/kyz/repos/Endo-FM/STFT/stft_core/data/datasets/evaluation/vid/vid_eval.py", line 11, in <module>
    from stft_core.structures.boxlist_ops import boxlist_iou
  File "/home/kyz/repos/Endo-FM/STFT/stft_core/structures/boxlist_ops.py", line 6, in <module>
    from stft_core.layers import nms as _box_nms
  File "/home/kyz/repos/Endo-FM/STFT/stft_core/layers/__init__.py", line 10, in <module>
    from .nms import nms
  File "/home/kyz/repos/Endo-FM/STFT/stft_core/layers/nms.py", line 3, in <module>
    from stft_core import _C
ImportError: /home/kyz/repos/Endo-FM/STFT/stft_core/_C.cpython-37m-x86_64-linux-gnu.so: undefined symbol: PyCMethod_New
Killing subprocess 870160
Traceback (most recent call last):
  File "/data/kyz/anaconda3/envs/endofm/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/data/kyz/anaconda3/envs/endofm/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/data/kyz/anaconda3/envs/endofm/lib/python3.7/site-packages/torch/distributed/launch.py", line 340, in <module>
    main()
  File "/data/kyz/anaconda3/envs/endofm/lib/python3.7/site-packages/torch/distributed/launch.py", line 326, in main
    sigkill_handler(signal.SIGTERM, None)  # not coming back
  File "/data/kyz/anaconda3/envs/endofm/lib/python3.7/site-packages/torch/distributed/launch.py", line 301, in sigkill_handler
    raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
subprocess.CalledProcessError: Command '['/data/kyz/anaconda3/envs/endofm/bin/python', '-u', 'tools/train_net.py', '--local_rank=0', '--master_port=15153', '--config-file', 'configs/STFT/kumc_R_50_STFT.yaml', 'OUTPUT_DIR', '/home/kyz/annotations+scripts+outputs/outputs/Endo-FM/downstream']' returned non-zero exit status 1.

Do I need to change anything in the compile settings?

Kyfafyd commented 6 days ago

Hi @zkysss11235, thanks for your interest! I have not encountered this problem before. It looks like the compilation was not fully successful. Could you refer to https://github.com/pybind/pybind11/issues/3115 and try building the environment with a different Python version?
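
Before rebuilding, it may help to confirm which interpreter is actually in play. `PyCMethod_New` was added to the CPython C API in Python 3.9, so one possible explanation (an assumption, not confirmed in this thread) is that the extension was built against headers from a newer Python than the 3.7 interpreter that later imports it. The sketch below is a minimal diagnostic to run inside the `endofm` environment; the helper names `interpreter_exports` and `describe_interpreter` are hypothetical and not part of Endo-FM or STFT.

```python
import ctypes
import sys
import sysconfig


def interpreter_exports(symbol: str) -> bool:
    """Return True if the running CPython exposes `symbol` in its C API."""
    # ctypes.pythonapi looks symbols up in the interpreter that is running
    # this script; a missing symbol raises AttributeError, so hasattr is False.
    return hasattr(ctypes.pythonapi, symbol)


def describe_interpreter() -> dict:
    """Collect the interpreter details that matter for a C-extension build."""
    return {
        # Major.minor version of the interpreter running this script.
        "version": "%d.%d" % sys.version_info[:2],
        # Suffix this interpreter expects on compiled extension modules.
        "ext_suffix": sysconfig.get_config_var("EXT_SUFFIX"),
        # Directory holding the Python.h that this interpreter ships with.
        "include_dir": sysconfig.get_paths()["include"],
        # Expected to be False on 3.7 and True on 3.9 or newer.
        "has_PyCMethod_New": interpreter_exports("PyCMethod_New"),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If `has_PyCMethod_New` is False while the compiled `_C` module still references that symbol, comparing `include_dir` with the include paths printed during the STFT build could show whether headers from another Python installation were picked up. In that case, a clean rebuild inside a single, consistent environment (or an environment with a different Python version, as suggested above) would be the next thing to try.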