NVlabs / neuralangelo

Official implementation of "Neuralangelo: High-Fidelity Neural Surface Reconstruction" (CVPR 2023)
https://research.nvidia.com/labs/dir/neuralangelo/

Error when extracting mesh #140

Open cthulhu-rises opened 10 months ago

cthulhu-rises commented 10 months ago

I can't tell what the issue is. Based on the error, it is acting like a directory or filename is missing, but I have entered the information repeatedly.

Error below:

(neuralangelo) cthulhu@AZATHOTH:/mnt/c/Users/AZATHOTH/documents/wsl/neuralangelo$ torchrun --nproc_per_node=${GPUS} projects/neuralangelo/scripts/extract_mesh.py \
>     --config=${CONFIG} \
>     --checkpoint=${CHECKPOINT} \
>     --output_file=${OUTPUT_MESH} \
>     --resolution=${RESOLUTION} \
>     --block_res=${BLOCK_RES}
(Setting affinity with NVML failed, skipping...)
Running mesh extraction with 1 GPUs.
Setup trainer.
Using random seed 0
model parameter count: 53,568,556
Initialize model weights using type: none, gain: None
Using random seed 0
Allow TensorFloat32 operations on supported devices
Loading checkpoint (local): logs/example_group/example_name/Test.pt
- Loading the model...
Done with loading the checkpoint.
Extracting surface at resolution 2048 2048 2048
vertices: 1994732
faces: 3928112
Traceback (most recent call last):
  File "projects/neuralangelo/scripts/extract_mesh.py", line 105, in <module>
    main()
  File "projects/neuralangelo/scripts/extract_mesh.py", line 100, in main
    os.makedirs(os.path.dirname(args.output_file), exist_ok=True)
  File "/home/cthulhu/miniconda3/envs/neuralangelo/lib/python3.8/os.py", line 223, in makedirs
    mkdir(name, mode)
FileNotFoundError: [Errno 2] No such file or directory: ''
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 4845) of binary: /home/cthulhu/miniconda3/envs/neuralangelo/bin/python
Traceback (most recent call last):
  File "/home/cthulhu/miniconda3/envs/neuralangelo/bin/torchrun", line 10, in <module>
    sys.exit(main())
  File "/home/cthulhu/miniconda3/envs/neuralangelo/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/home/cthulhu/miniconda3/envs/neuralangelo/lib/python3.8/site-packages/torch/distributed/run.py", line 794, in main
    run(args)
  File "/home/cthulhu/miniconda3/envs/neuralangelo/lib/python3.8/site-packages/torch/distributed/run.py", line 785, in run
    elastic_launch(
  File "/home/cthulhu/miniconda3/envs/neuralangelo/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/cthulhu/miniconda3/envs/neuralangelo/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
projects/neuralangelo/scripts/extract_mesh.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2023-10-11_18:37:07
  host      : AZATHOTH.
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 4845)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================

Genshin-Impact-king commented 10 months ago

Have you solved this? I ran into the same issue as you.

cthulhu-rises commented 10 months ago

Yours is different, read your error message. I think I've seen your error on another post though.

Genshin-Impact-king commented 10 months ago

Thank you for the information.

cthulhu-rises commented 10 months ago

The issue in my case was that I needed to give the output file a path that includes a directory. It didn't like the output file sitting directly in the top-level neuralangelo directory. It worked fine when I set it to be in the directory with the checkpoint files.
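
For anyone hitting the same traceback: the failing line is `os.makedirs(os.path.dirname(args.output_file), exist_ok=True)`. When the output file is a bare filename with no directory part, `os.path.dirname` returns an empty string, and `os.makedirs('')` raises the `FileNotFoundError` shown above. The sketch below reproduces that behavior with the standard library only and shows one possible guard. `ensure_parent_dir` is a hypothetical helper for illustration, not something in the neuralangelo repository, and `mesh.ply` is just an example filename.

```python
import os
import tempfile


def ensure_parent_dir(output_file):
    """Create the parent directory of output_file, if the path has one.

    Hypothetical helper (not part of neuralangelo). Returns the parent
    directory string, which is '' for a bare filename.
    """
    parent = os.path.dirname(output_file)
    if parent:  # skip makedirs when there is no directory component
        os.makedirs(parent, exist_ok=True)
    return parent


# Reproduce the failure from the traceback: a bare filename has an
# empty dirname, and os.makedirs('') raises FileNotFoundError.
bare_name = "mesh.ply"  # example filename, assumed for illustration
try:
    os.makedirs(os.path.dirname(bare_name), exist_ok=True)
    reproduced = False
except FileNotFoundError:
    reproduced = True

# With the guard, a bare filename is simply left alone ...
bare_parent = ensure_parent_dir(bare_name)

# ... and a path with a directory part gets its directory created.
with tempfile.TemporaryDirectory() as tmp:
    nested = os.path.join(tmp, "meshes", "mesh.ply")
    nested_parent = ensure_parent_dir(nested)
    nested_created = os.path.isdir(nested_parent)
```

Without touching the script, the same effect comes from passing an output path that has a directory component, which matches what worked here (pointing the output at the directory with the checkpoint files). A relative path with an explicit `./` prefix should also give a non-empty dirname, though I have not tried that with `extract_mesh.py` itself.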