Closed Islaster closed 1 year ago
Hi @Islaster
It looks like no meshes are extracted.
main branch?
I just did a pull. What did you mean by config and checkpoint?
After the pull I re-ran the extraction and got the same error.
CHECKPOINT=logs/example_group/example_name/epoch_00000_iteration_000000010_checkpoint.pt
CONFIG=logs/example_group/example_name/config.yaml
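The checkpoint path above appears to encode the epoch and iteration in its file name. As a minimal sketch, assuming every checkpoint follows the same `epoch_<N>_iteration_<M>_checkpoint.pt` pattern seen in that path, the hypothetical helper `parse_checkpoint_name` below recovers both numbers so they can be compared against the save interval in the config:

```python
import re
from pathlib import Path

# Assumed naming pattern, inferred only from the path quoted in this thread.
CHECKPOINT_PATTERN = re.compile(r"epoch_(\d+)_iteration_(\d+)_checkpoint\.pt$")


def parse_checkpoint_name(path):
    """Return (epoch, iteration) parsed from a checkpoint file name, or None."""
    match = CHECKPOINT_PATTERN.search(Path(path).name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    path = "logs/example_group/example_name/epoch_00000_iteration_000000010_checkpoint.pt"
    print(parse_checkpoint_name(path))  # (0, 10)
```

A checkpoint whose iteration is far below the configured save interval, as here, suggests it was written by some other path (for example, at the end of a very short run).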
The issue you linked is about training. I'm having problems with the mesh extraction.
Hi @Islaster
It's not letting me copy it into the comment, so I'm going to send Google Drive links.
Thanks @Islaster
I can confirm that this is a bug, potentially related to #83, where the saved checkpoint is corrupted and all the predicted SDF values are NaN. Therefore, no meshes are found.
So how do I fix it?
Can you provide more information on the following?
save_iter is 20000, but your checkpoint is saved at iteration 10. How did you save this checkpoint?
EXPERIMENT=template
GROUP=example_group
NAME=example_name
CONFIG=projects/neuralangelo/configs/custom/${EXPERIMENT}.yaml
GPUS=1
torchrun --nproc_per_node=${GPUS} train.py \
--logdir=logs/${GROUP}/${NAME} \
--config=${CONFIG} \
--show_pbar
PyTorch version 2.0.0, and I changed max_iter in base.yaml.
I can send you my whole git clone if that would help clear things up.
Hi @Islaster
I was able to run your entire pipeline with torch version 1.12.1. Here is the mesh extracted at iteration 10 using the Lego dataset.
Can you try again with my torch version?
Will do.
What version of Anaconda do you use?
I am using conda 4.10.1. We have also pushed an update to the code that could potentially fix the issue. Can you pull the latest version and try it out on your end?
I'm getting conflict errors when installing PyTorch 1.12.1. Could it have something to do with the Anaconda environment files for neuralangelo?
I had the same problem. Is it because there were too few training iterations? I changed the configuration items save_epoch, max_epoch and max_iter in config_base.yaml to 3. Are there any minimum requirements for these values?
Hi @lie12huo
@mli0603 I used the latest code for training and model extraction, but the data preprocessing did not use the latest version of the code. Does this affect the result? How do I check if there is a problem with the saved checkpoint? By the way, I am currently using the test data toy_example.mov.
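One way to check a saved checkpoint for the kind of corruption suspected earlier in this thread is to scan it for NaN or infinite values. The following is only a sketch: the helper names `find_nonfinite` and `check_checkpoint` are hypothetical, and it assumes the checkpoint is a nested dict/list of tensors and plain numbers that `torch.load` can read. NaN weights would explain NaN SDF predictions, but a clean scan does not prove the checkpoint is otherwise healthy.

```python
import math


def find_nonfinite(obj, name="checkpoint"):
    """Recursively collect the names of entries that contain NaN or Inf."""
    bad = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            bad.extend(find_nonfinite(value, f"{name}.{key}"))
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            bad.extend(find_nonfinite(value, f"{name}[{index}]"))
    elif isinstance(obj, float):
        if not math.isfinite(obj):
            bad.append(name)
    elif hasattr(obj, "is_floating_point") and obj.is_floating_point():
        # Duck-typed torch.Tensor: flag it if any element is NaN or Inf.
        if not bool(obj.isfinite().all()):
            bad.append(name)
    return bad


def check_checkpoint(path):
    """Load a checkpoint on the CPU and list its non-finite entries."""
    import torch  # imported lazily so find_nonfinite works without torch

    # Recent torch versions may need weights_only=False for checkpoints
    # that store non-tensor Python objects.
    state = torch.load(path, map_location="cpu")
    return find_nonfinite(state)
```

Calling `check_checkpoint` with the CHECKPOINT path would return an empty list if every floating-point entry is finite, or the names of the offending entries otherwise.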
@mli0603 I reprocessed the data for training, generated new checkpoints, and exported the mesh, but the result was poor and there was no texture. Is this because 50 epochs are too few? Or is it related to my adjustment of parameters dict_size: 21, dim: 4?