zju3dv / object_nerf

Code for "Learning Object-Compositional Neural Radiance Field for Editable Scene Rendering", ICCV 2021
https://zju3dv.github.io/object_nerf/
MIT License

segmentation fault when training #11

Closed by TxT1212 2 years ago

TxT1212 commented 2 years ago

Hi, thank you for your great work. I keep running into a segmentation fault when trying to train on the Toydesk dataset, and I can't tell what is causing it. Please help.

 python train.py dataset_config=config/toy_desk_2.yml "img_wh=[640,480]" exp_name=my_expr_toydesk_2
  ----------------------------------------
  exp_name: my_expr_toydesk_2
  ckpt_path: null
  dataset_name: toydesk
  img_wh:
  - 640
  - 480
  model:
    use_voxel_embedding: true
    N_freq_xyz: 10
    N_freq_dir: 4
    N_freq_voxel: 6
    D: 8
    W: 256
    skips:
    - 4
    N_scn_voxel_size: 16
    inst_D: 4
    inst_W: 128
    inst_skips:
    - 2
    N_obj_voxel_size: 8
    N_samples: 64
    N_importance: 64
    frustum_bound: -1
    use_disp: false
    perturb: 1
    noise_std: 1
    use_mask: true
    N_vocab: 1000
    N_max_objs: 64
    N_obj_code_length: 64
    N_max_voxels: 800000
  train:
    progressive_train: false
    batch_size: 2048
    chunk: 32768
    num_epochs: 30
    num_gpus: 1
    optimizer: adam
    lr: 0.001
    momentum: 0.9
    weight_decay: 0
    lr_scheduler: poly
    warmup_multiplier: 1
    warmup_epochs: 0
    decay_step:
    - 20
    decay_gamma: 0.1
    poly_exp: 2
    limit_train_batches: 0.05
  prefixes_to_ignore:
  - loss
  loss:
    color_loss_weight: 1.0
    depth_loss_weight: 0.1
    opacity_loss_weight: 10.0
    instance_color_loss_weight: 1.0
    instance_depth_loss_weight: 0.1
  dataset_extra:
    enable_observation_check: false
    max_obs_angle: 40
    max_obs_distance: 3.0
    mask_rebalance_strategy: fg_bg_reweight
    fg_weight: 1.0
    bg_weight: 0.05
    use_bbox: false
    use_bbox_only_for_test: true
    near: 0.8
    far: 24.0
    scale_factor: 16.0
    scene_center:
    - 0.2
    - 1.4
    - 7.1
    train_start_idx: 0
    train_skip_step: 1
    train_max_size: 9999
    validate_idx: 131
    split: datasets/split/our_desk_2_train_0.8
    use_instance_mask: true
    root_dir: data/toydesk/our_desk_2
    bbox_dir: datasets/desk_bbox/desk2/bbox.json
    inst_seg_tag: instance
    val_instance_id: 1
    instance_id:
    - 5
    - 4
    - 2
    - 1
    - 3
    bg_instance_id:
    - 0
    pcd_path: data/toydesk/our_desk_2/pcd_from_mesh.ply
    voxel_size: 0.3
    neighbor_marks: 3
  dataset_config: config/toy_desk_2.yml

  ----------------------------------------
  Start with exp_name: 220720_142351_my_expr_toydesk_2.
  Filling the voxel_occupancy...
  Voxel generated: torch.Size([214, 113, 171]) Voxel occupancy ratio: tensor(0.0426, device='cuda:0')
  Voxel used: tensor(176312, device='cuda:0')
  INFO - 2022-07-20 14:23:55,338 - trainer - GPU available: True, used: True
  INFO - 2022-07-20 14:23:55,338 - trainer - TPU available: False, using: 0 TPU cores
  INFO - 2022-07-20 14:23:55,338 - trainer - IPU available: False, using: 0 IPUs
  Training split count 121
  Train idx: 77 -> 115, skip: 1
  Read meta 00119 : 00119 instance 3
  ----------------------------------------
  Valid idx: 131
  INFO - 2022-07-20 14:24:12,671 - gpu - LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [2]
  /home/tangxiaotian/miniforge3/envs/object_nerf/lib/python3.8/site-packages/pytorch_lightning/callbacks/model_checkpoint.py:623: UserWarning: Checkpoint directory /home/tangxiaotian/object_nerf/logs/220720_142351_my_expr_toydesk_2 exists and is not empty.
    rank_zero_warn(f"Checkpoint directory {dirpath} exists and is not empty.")
  Epoch 0:   0%|                           | 4/904 [00:06<25:35,  1.71s/it, loss=1.04, v_num=0, train/psnr=10.10]段错误 (核心已转储) [Segmentation fault (core dumped)]
ybbbbt commented 2 years ago

Hi, TxT1212,

The log you provided seems incomplete: it does not contain any information about the segmentation fault. Can you provide a log that includes the error messages?

TxT1212 commented 2 years ago

I ran python train.py dataset_config=config/toy_desk_2.yml "img_wh=[640,480]" exp_name=my_expr_toydesk_2 and got this error message.

I noticed there is a "logs" folder, but there seems to be no log file in it. Where can I find the log with the error message?

ybbbbt commented 2 years ago

Okay, I see the error message '段错误 (核心已转储)' (Segmentation fault (core dumped)) on the last line of your printed log. However, this is quite rare in PyTorch. You could try Python's faulthandler module to see where the program crashes. My guess is that the cause is an environment or hardware issue.

TxT1212 commented 2 years ago

Could you show me your “conda list” output? I think there is something wrong with my environment. Here is mine:

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                  2_kmp_llvm    conda-forge
absl-py                   1.1.0                    pypi_0    pypi
addict                    2.4.0                    pypi_0    pypi
aiohttp                   3.8.1                    pypi_0    pypi
aiosignal                 1.2.0                    pypi_0    pypi
antlr4-python3-runtime    4.8                      pypi_0    pypi
anyio                     3.6.1                    pypi_0    pypi
argon2-cffi               21.3.0                   pypi_0    pypi
argon2-cffi-bindings      21.2.0                   pypi_0    pypi
asttokens                 2.0.5                    pypi_0    pypi
async-timeout             4.0.2                    pypi_0    pypi
attrs                     21.4.0                   pypi_0    pypi
babel                     2.10.3                   pypi_0    pypi
backcall                  0.2.0                    pypi_0    pypi
beautifulsoup4            4.11.1                   pypi_0    pypi
black                     22.6.0                   pypi_0    pypi
blas                      2.115                       mkl    conda-forge
blas-devel                3.9.0            15_linux64_mkl    conda-forge
bleach                    5.0.1                    pypi_0    pypi
bzip2                     1.0.8                h7f98852_4    conda-forge
ca-certificates           2022.6.15            ha878542_0    conda-forge
cachetools                5.2.0                    pypi_0    pypi
certifi                   2022.6.15                pypi_0    pypi
cffi                      1.15.1                   pypi_0    pypi
charset-normalizer        2.1.0                    pypi_0    pypi
click                     8.1.3                    pypi_0    pypi
cudatoolkit               11.1.1              ha002fc5_10    conda-forge
cycler                    0.11.0                   pypi_0    pypi
debugpy                   1.6.2                    pypi_0    pypi
decorator                 5.1.1                    pypi_0    pypi
defusedxml                0.7.1                    pypi_0    pypi
deprecation               2.1.0                    pypi_0    pypi
einops                    0.3.2                    pypi_0    pypi
entrypoints               0.4                      pypi_0    pypi
executing                 0.8.3                    pypi_0    pypi
fastjsonschema            2.15.3                   pypi_0    pypi
fonttools                 4.34.4                   pypi_0    pypi
freetype                  2.10.4               h0708190_1    conda-forge
freetype-py               2.3.0                    pypi_0    pypi
frozenlist                1.3.0                    pypi_0    pypi
fsspec                    2022.5.0                 pypi_0    pypi
future                    0.18.2                   pypi_0    pypi
giflib                    5.2.1                h36c2ea0_2    conda-forge
glob2                     0.7                      pypi_0    pypi
google-auth               2.9.0                    pypi_0    pypi
google-auth-oauthlib      0.4.6                    pypi_0    pypi
grpcio                    1.47.0                   pypi_0    pypi
idna                      3.3                      pypi_0    pypi
imageio                   2.9.0                    pypi_0    pypi
imageio-ffmpeg            0.4.2                    pypi_0    pypi
importlib-metadata        4.12.0                   pypi_0    pypi
importlib-resources       5.8.0                    pypi_0    pypi
ipdb                      0.13.9                   pypi_0    pypi
ipykernel                 6.15.1                   pypi_0    pypi
ipython                   8.4.0                    pypi_0    pypi
ipython-genutils          0.2.0                    pypi_0    pypi
ipywidgets                7.7.1                    pypi_0    pypi
jedi                      0.18.1                   pypi_0    pypi
jinja2                    3.1.2                    pypi_0    pypi
joblib                    1.1.0                    pypi_0    pypi
jpeg                      9e                   h166bdaf_2    conda-forge
json5                     0.9.8                    pypi_0    pypi
jsonschema                4.7.2                    pypi_0    pypi
jupyter                   1.0.0                    pypi_0    pypi
jupyter-client            7.3.4                    pypi_0    pypi
jupyter-console           6.4.4                    pypi_0    pypi
jupyter-core              4.11.1                   pypi_0    pypi
jupyter-packaging         0.12.2                   pypi_0    pypi
jupyter-server            1.18.1                   pypi_0    pypi
jupyterlab                3.4.3                    pypi_0    pypi
jupyterlab-pygments       0.2.2                    pypi_0    pypi
jupyterlab-server         2.15.0                   pypi_0    pypi
jupyterlab-widgets        1.1.1                    pypi_0    pypi
kiwisolver                1.4.3                    pypi_0    pypi
kornia                    0.4.1                    pypi_0    pypi
lcms2                     2.12                 hddcbb42_0    conda-forge
ld_impl_linux-64          2.36.1               hea4e1c9_2    conda-forge
lerc                      3.0                  h9c3ff4c_0    conda-forge
libblas                   3.9.0            15_linux64_mkl    conda-forge
libcblas                  3.9.0            15_linux64_mkl    conda-forge
libdeflate                1.12                 h166bdaf_0    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 12.1.0              h8d9b700_16    conda-forge
libgfortran-ng            12.1.0              h69a702a_16    conda-forge
libgfortran5              12.1.0              hdcd56e2_16    conda-forge
libgomp                   12.1.0              h8d9b700_16    conda-forge
liblapack                 3.9.0            15_linux64_mkl    conda-forge
liblapacke                3.9.0            15_linux64_mkl    conda-forge
libnsl                    2.0.0                h7f98852_0    conda-forge
libpng                    1.6.37               h753d276_3    conda-forge
libstdcxx-ng              12.1.0              ha89aaad_16    conda-forge
libtiff                   4.4.0                hc85c160_1    conda-forge
libuuid                   2.32.1            h7f98852_1000    conda-forge
libuv                     1.43.0               h7f98852_0    conda-forge
libwebp                   1.2.2                h3452ae3_0    conda-forge
libwebp-base              1.2.2                h7f98852_1    conda-forge
libxcb                    1.13              h7f98852_1004    conda-forge
libzlib                   1.2.12               h166bdaf_1    conda-forge
llvm-openmp               14.0.4               he0ac6c6_0    conda-forge
llvmlite                  0.35.0                   pypi_0    pypi
lz4-c                     1.9.3                h9c3ff4c_1    conda-forge
markdown                  3.3.7                    pypi_0    pypi
markupsafe                2.1.1                    pypi_0    pypi
matplotlib                3.5.2                    pypi_0    pypi
matplotlib-inline         0.1.3                    pypi_0    pypi
mistune                   0.8.4                    pypi_0    pypi
mkl                       2022.1.0           h84fe81f_915    conda-forge
mkl-devel                 2022.1.0           ha770c72_916    conda-forge
mkl-include               2022.1.0           h84fe81f_915    conda-forge
multidict                 6.0.2                    pypi_0    pypi
mypy-extensions           0.4.3                    pypi_0    pypi
nbclassic                 0.4.3                    pypi_0    pypi
nbclient                  0.6.6                    pypi_0    pypi
nbconvert                 6.5.0                    pypi_0    pypi
nbformat                  5.4.0                    pypi_0    pypi
ncurses                   6.3                  h27087fc_1    conda-forge
nest-asyncio              1.5.5                    pypi_0    pypi
networkx                  2.8.4                    pypi_0    pypi
ninja                     1.11.0               h924138e_0    conda-forge
notebook                  6.4.12                   pypi_0    pypi
notebook-shim             0.1.0                    pypi_0    pypi
numba                     0.52.0                   pypi_0    pypi
numpy                     1.23.1           py38h3a7f9d9_0    conda-forge
oauthlib                  3.2.0                    pypi_0    pypi
omegaconf                 2.1.1                    pypi_0    pypi
open3d                    0.13.0                   pypi_0    pypi
opencv-python             4.5.3.56                 pypi_0    pypi
openjpeg                  2.4.0                hb52868f_1    conda-forge
openssl                   3.0.5                h166bdaf_0    conda-forge
packaging                 21.3                     pypi_0    pypi
pandas                    1.4.3                    pypi_0    pypi
pandocfilters             1.5.0                    pypi_0    pypi
parso                     0.8.3                    pypi_0    pypi
pathspec                  0.9.0                    pypi_0    pypi
pexpect                   4.8.0                    pypi_0    pypi
pickleshare               0.7.5                    pypi_0    pypi
pillow                    9.2.0            py38h0ee0e06_0    conda-forge
pip                       22.1.2             pyhd8ed1ab_0    conda-forge
platformdirs              2.5.2                    pypi_0    pypi
plyfile                   0.7.2                    pypi_0    pypi
prometheus-client         0.14.1                   pypi_0    pypi
prompt-toolkit            3.0.30                   pypi_0    pypi
protobuf                  3.19.4                   pypi_0    pypi
psutil                    5.9.1                    pypi_0    pypi
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
ptyprocess                0.7.0                    pypi_0    pypi
pure-eval                 0.2.2                    pypi_0    pypi
pyasn1                    0.4.8                    pypi_0    pypi
pyasn1-modules            0.2.8                    pypi_0    pypi
pycparser                 2.21                     pypi_0    pypi
pydeprecate               0.3.1                    pypi_0    pypi
pyglet                    1.5.26                   pypi_0    pypi
pygments                  2.12.0                   pypi_0    pypi
pymcubes                  0.1.2                    pypi_0    pypi
pyopengl                  3.1.0                    pypi_0    pypi
pyparsing                 3.0.9                    pypi_0    pypi
pyquaternion              0.9.9                    pypi_0    pypi
pyrender                  0.1.45                   pypi_0    pypi
pyrsistent                0.18.1                   pypi_0    pypi
python                    3.8.13          ha86cf86_0_cpython    conda-forge
python-dateutil           2.8.2                    pypi_0    pypi
python_abi                3.8                      2_cp38    conda-forge
pytorch-lightning         1.5.4                    pypi_0    pypi
pytorch-ranger            0.1.1                    pypi_0    pypi
pytz                      2022.1                   pypi_0    pypi
pyyaml                    6.0                      pypi_0    pypi
pyzmq                     23.2.0                   pypi_0    pypi
qtconsole                 5.3.1                    pypi_0    pypi
qtpy                      2.1.0                    pypi_0    pypi
readline                  8.1.2                h0f457ee_0    conda-forge
requests                  2.28.1                   pypi_0    pypi
requests-oauthlib         1.3.1                    pypi_0    pypi
rsa                       4.8                      pypi_0    pypi
scikit-learn              1.1.1                    pypi_0    pypi
scipy                     1.8.1                    pypi_0    pypi
send2trash                1.8.0                    pypi_0    pypi
setuptools                63.1.0           py38h578d9bd_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
sniffio                   1.2.0                    pypi_0    pypi
soupsieve                 2.3.2.post1              pypi_0    pypi
sqlite                    3.39.0               h4ff8645_0    conda-forge
stack-data                0.3.0                    pypi_0    pypi
tbb                       2021.5.0             h924138e_1    conda-forge
tensorboard               2.9.1                    pypi_0    pypi
tensorboard-data-server   0.6.1                    pypi_0    pypi
tensorboard-plugin-wit    1.8.1                    pypi_0    pypi
terminado                 0.15.0                   pypi_0    pypi
test-tube                 0.7.5                    pypi_0    pypi
threadpoolctl             3.1.0                    pypi_0    pypi
tinycss2                  1.1.1                    pypi_0    pypi
tk                        8.6.12               h27826a3_0    conda-forge
toml                      0.10.2                   pypi_0    pypi
tomli                     2.0.1                    pypi_0    pypi
tomlkit                   0.11.1                   pypi_0    pypi
torch                     1.8.1+cu111              pypi_0    pypi
torch-optimizer           0.3.0                    pypi_0    pypi
torchaudio                0.8.1                    pypi_0    pypi
torchmetrics              0.9.2                    pypi_0    pypi
torchvision               0.9.1+cu111              pypi_0    pypi
tornado                   6.2                      pypi_0    pypi
tqdm                      4.64.0                   pypi_0    pypi
traitlets                 5.3.0                    pypi_0    pypi
trimesh                   3.12.7                   pypi_0    pypi
typing_extensions         4.3.0              pyha770c72_0    conda-forge
urllib3                   1.26.10                  pypi_0    pypi
wcwidth                   0.2.5                    pypi_0    pypi
webencodings              0.5.1                    pypi_0    pypi
websocket-client          1.3.3                    pypi_0    pypi
werkzeug                  2.1.2                    pypi_0    pypi
wheel                     0.37.1             pyhd8ed1ab_0    conda-forge
widgetsnbextension        3.6.1                    pypi_0    pypi
xorg-libxau               1.0.9                h7f98852_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
yarl                      1.7.2                    pypi_0    pypi
zipp                      3.8.0                    pypi_0    pypi
zlib                      1.2.12               h166bdaf_1    conda-forge
zstd                      1.5.2                h8a70e8d_2    conda-forge
ybbbbt commented 2 years ago

Here is my conda environment: environment.yml.txt

TxT1212 commented 2 years ago

Hi, your guess was right. There seems to be a Pillow version problem, possibly due to my conda source. By default, my Pillow version was 9.2.0. When I downgraded it to 6.2.2, the problem was solved. Thanks a lot, and congrats on your recent impressive work. Looking forward to the code of Neural Rendering in a Room.
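
A quick way to check for the same mismatch before training is sketched below. Only the two version numbers (9.2.0 crashing, 6.2.2 working) come from this thread. The helper names and the "same major version" rule are assumptions made for illustration, not a documented requirement of object_nerf.

```python
from importlib.metadata import PackageNotFoundError, version

# Version reported as working in this thread; other versions were not tested here.
KNOWN_GOOD_PILLOW = "6.2.2"


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def major(version_string):
    """Extract the leading major version number, e.g. '9.2.0' -> 9."""
    return int(version_string.split(".")[0])


def pillow_matches_known_good(found):
    """Assumed rule: accept only the same major version as the known-good one."""
    return found is not None and major(found) == major(KNOWN_GOOD_PILLOW)


found = installed_version("Pillow")
if pillow_matches_known_good(found):
    print(f"Pillow {found} matches the known-good major version.")
else:
    print(f"Pillow {found} differs from known-good {KNOWN_GOOD_PILLOW}.")
```

If the check reports a mismatch, pinning the package with `pip install pillow==6.2.2` inside the conda environment is one way to reproduce the downgrade described above. The thread does not say which command was actually used.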