NVlabs / neuralangelo

Official implementation of "Neuralangelo: High-Fidelity Neural Surface Reconstruction" (CVPR 2023)
https://research.nvidia.com/labs/dir/neuralangelo/

Mesh extract error #150

Open Iliceth opened 10 months ago

Iliceth commented 10 months ago

Noob question time: I successfully train for many iterations, but sometimes mesh extraction starts well and prints the number of faces, colors, etc., and then, instead of writing the file to disk and returning to the prompt, it gives this:

/home/user/miniconda3/envs/neuralangelo/lib/python3.8/site-packages/torch/utils/data/dataloader.py:561: UserWarning: This DataLoader will create 8 worker processes in total. Our suggested max number of worker in current system is 6, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(_create_warning_msg(
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -9) local_rank: 0 (pid: 478) of binary: /home/user/miniconda3/envs/neuralangelo/bin/python
Traceback (most recent call last):
  File "/home/user/miniconda3/envs/neuralangelo/bin/torchrun", line 10, in <module>
    sys.exit(main())
  File "/home/user/miniconda3/envs/neuralangelo/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/home/user/miniconda3/envs/neuralangelo/lib/python3.8/site-packages/torch/distributed/run.py", line 794, in main
    run(args)
  File "/home/user/miniconda3/envs/neuralangelo/lib/python3.8/site-packages/torch/distributed/run.py", line 785, in run
    elastic_launch(
  File "/home/user/miniconda3/envs/neuralangelo/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/user/miniconda3/envs/neuralangelo/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
====================================================
projects/neuralangelo/scripts/extract_mesh.py FAILED
----------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
----------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2023-11-01_20:30:50
  host      : MathZillaSSv3.
  rank      : 0 (local_rank: 0)
  exitcode  : -9 (pid: 478)
  error_file: <N/A>
  traceback : Signal 9 (SIGKILL) received by PID 478
====================================================

Three questions about this:

Thanks!

Georg1986 commented 10 months ago

Hello, I have the same problem. After some searching, I learned that it is probably caused by my RAM. But I have not yet found where I can adjust that. I would be very grateful for a solution.
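A `Signal 9 (SIGKILL)` with no Python traceback from the script itself is typically what the Linux out-of-memory killer produces, which fits the RAM explanation. As a minimal sketch for checking how much memory and swap the machine (or container) actually has before running the extraction, the following reads `/proc/meminfo`. The helper name `parse_meminfo` is hypothetical and not part of neuralangelo; the check assumes a Linux system.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


# Assumption: Linux only; on other systems the file does not exist and nothing is printed.
meminfo = Path("/proc/meminfo")
if meminfo.exists():
    info = parse_meminfo(meminfo.read_text())
    for key in ("MemTotal", "MemAvailable", "SwapTotal", "SwapFree"):
        if key in info:
            # Values are in kB (KiB), so dividing by 2**20 gives GiB.
            print(f"{key}: {info[key] / 2**20:.1f} GiB")
```

If `MemAvailable` is small compared with what the extraction needs, adding swap or raising the memory limit of the container or VM are the usual things to try; whether that is enough for a given scene is not something this check can tell.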

Georg1986 commented 9 months ago

If I change RESOLUTION=2048 to RESOLUTION=1024, I do get a mesh. But the quality is correspondingly poor.
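To illustrate why halving the resolution makes such a difference, here is a rough back-of-the-envelope sketch. The number of sample points in a cubic grid grows with the cube of the resolution, so going from 1024 to 2048 means 8 times as many points. The byte figures assume one dense float32 value per point, which is only an illustrative assumption: the extraction script may process the volume in blocks, and the final mesh (vertices, faces, colors) has its own memory cost that is not modeled here. The function name `dense_grid_bytes` is hypothetical.

```python
def dense_grid_bytes(resolution: int, bytes_per_value: int = 4) -> int:
    """Bytes needed for one value per point of a resolution^3 grid (float32 by default)."""
    return resolution ** 3 * bytes_per_value


for res in (1024, 2048):
    samples = res ** 3
    gib = dense_grid_bytes(res) / 2**30
    print(f"RESOLUTION={res}: {samples:,} sample points, {gib:.0f} GiB as dense float32")
```

Intermediate values between 1024 and 2048 may be worth trying as a trade-off between mesh quality and memory use.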