Open throb081 opened 2 weeks ago
Thanks for the question. When everything appears black, it means the training has collapsed, generating empty 3D space. It looks like you're using only one GPU, which might result in an insufficient batch size for stable training. Here are some solutions:
You can also check the loss log to ensure the 3D space isn't empty. The metrics.csv file shows the loss for each iteration, including a term called train/loss_sparsity. If this term stays around 0.1000, it indicates empty 3D results. Initially the parameters haven't converged, so sparsity will be 0.1000 for a while, but it should rise to between 0.1 and 0.6 within 1000 iterations (using 8 GPUs). Hope this helps!
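The check described above can be automated. The following is a minimal sketch, not part of the repository: it assumes metrics.csv is a plain CSV with a header row containing a train/loss_sparsity column, and the function names, the tolerance, and the window size are hypothetical choices made for illustration.

```python
import csv
import io


def sparsity_trend(rows, column="train/loss_sparsity", floor=0.1, tol=1e-3, window=3):
    """Collect logged sparsity values and flag a likely collapse.

    `collapsed` is True when the last `window` logged values all sit
    within `tol` of `floor` (0.1000 in the thread above).
    """
    values = []
    for row in rows:
        cell = (row.get(column) or "").strip()
        if cell:  # skip rows where this metric was not logged
            values.append(float(cell))
    recent = values[-window:]
    collapsed = bool(recent) and all(abs(v - floor) <= tol for v in recent)
    return values, collapsed


def check_metrics(path="metrics.csv"):
    """Run the same check on a CSV file on disk."""
    with open(path, newline="") as f:
        return sparsity_trend(csv.DictReader(f))


# Small in-memory demo with made-up numbers (not real training logs).
sample = "step,train/loss_sparsity\n0,0.1000\n500,0.1000\n1000,0.1000\n"
values, collapsed = sparsity_trend(csv.DictReader(io.StringIO(sample)))
print("collapsed" if collapsed else "sparsity is moving")
```

Early in training the value is expected to sit at 0.1000 for a while, so a positive result only matters once training has run well past the point where the loss should have started to rise.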
metrics.csv
This is my metrics.csv. It seems to match what you said: loss_sparsity is always 0.1. If that indicates empty 3D results, is there anything I can do to solve it?
You can reduce the regularization loss weights. The sparsity and eikonal losses are regularization terms. However, excessive regularization might cause training to collapse, resulting in empty space output. Try setting the sparsity loss to 1 or the eikonal loss to 0.001 to see the difference. Let me know if you have any questions.
Thank you for your reply. Do you mean changing lambda_sparsity to 1 and lambda_eikonal to 0.001 in the config file (asd_mv_triplane_transformer_10k.yaml)?
Yes, please have a try.
Since we haven't tested on a single GPU, it's worth finding optimal hyperparameters for that setup. I can allocate one GPU for this experiment and will provide coefficients once it's done.
OK, thank you very much, I will give it a try. My device is an A6000. I hope what I find can help you too!
Hi, try setting lambda_eikonal to 0.01 while keeping lambda_sparsity at 20. I've attached the log for the first few thousand iterations. If you see loss_sparsity increasing from 0.1000 upwards, you can expect good results when training is completed.
metrics.csv
Did you mean setting lambda_eikonal to 0.001? The original value of lambda_eikonal is 0.01.
Hi, I've tested a new configuration for single GPU training. The only changes are:
- lambda_sparsity: 40
- lambda_eikonal: 0.001
The results look promising (see attached image). Let me know if you have any questions.
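For anyone applying these two values by script rather than by hand, here is a minimal sketch. It is a plain text substitution, not the project's own tooling: the helper name is hypothetical, the nesting shown in the sample is an assumed layout, and it assumes each key appears on its own `key: value` line in the YAML file.

```python
import re


def set_yaml_scalar(text, key, value):
    """Replace the value on every `key: ...` line, keeping the indentation.

    Anything after the colon on that line (including a trailing comment)
    is overwritten, so review the result before training.
    """
    pattern = re.compile(rf"^([ \t]*{re.escape(key)}[ \t]*:[ \t]*).*$", re.MULTILINE)
    new_text, count = pattern.subn(lambda m: f"{m.group(1)}{value}", text)
    if count == 0:
        raise KeyError(f"{key} not found in config text")
    return new_text


# Hypothetical excerpt; the real file may nest these keys differently.
sample = "system:\n  loss:\n    lambda_sparsity: 20\n    lambda_eikonal: 0.01\n"
updated = set_yaml_scalar(sample, "lambda_sparsity", 40)
updated = set_yaml_scalar(updated, "lambda_eikonal", 0.001)
print(updated)
```

To use it on the real config, read asd_mv_triplane_transformer_10k.yaml into a string, apply the two calls, and write the result to a copy so the original settings stay available for comparison.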
Hello, I set lambda_eikonal: 0.001 and kept lambda_sparsity: 20, then trained for about 100,000 steps. The output images all look nearly the same, like this:
I found nothing in my metrics.csv, maybe because I interrupted the program. I will set lambda_sparsity: 40 and lambda_eikonal: 0.001 and try again.
Hello, I set lambda_sparsity: 40 and lambda_eikonal: 0.001. However, I checked the output pictures and they look black again, like these:
Here is my CSV. It seems train/loss_sparsity stays at 0.1 after 3000 steps.
metrics(复件).csv
May I have your contact information? Here is my QQ: 1923388926, for easier communication.
Hello, I recently ran the training code:
CUDA_VISIBLE_DEVICES=0 python launch.py \
--config configs/multi-prompt_benchmark/asd_mv_triplane_transformer_10k.yaml \
--train \
system.prompt_processor.prompt_library="instant3d_17000_prompt_library"
However, when I check the output folder, ScaleDreamer/ScaleDreamer-main/outputs/asd_mv_triplane_100k/instant3d_17000_prompt_library@20240819-101117/save/it100000-val/A_distinguished_man_in_a_suit_is_giving_a_gripping_speech_at_a_corporate_conference/, I find that the image grids are almost black, like this:
![Uploading 屏幕截图 2024-09-02 153048.jpg…]()
Do you know why?