InternLandMark / LandMark


How to improve the training quality? #6

Open chufall opened 1 year ago

chufall commented 1 year ago

Hi, I have trained my dataset using LandMark with the following config (copied from city.txt, with the iteration count set to 500000):

dataroot =  /home/u1616/dev/data/dingxiu
datadir = 0803
dataset_name = dingxiu
expname = dingxiu_0804
subfolder = []
ndim = 1

#lb = [-2.4,-2.4,-0.05]
#ub = [2.4,2.4,0.55]

lb = [-400,-300,100]
ub = [400,300,150]

#shrink = 1
add_nerf = 10000

basedir = ./log

train_near_far = [1e-1, 4]
render_near_far = [2e-1, 4]
downsample_train = 5

n_iters = 500000
batch_size = 81924
render_batch_size = 8192

N_voxel_init = 2097156 # 128**3
N_voxel_final = 1073741824 # 1024**3

upsamp_list = [2000,3000,4000,5500,7000]
update_AlphaMask_list = [2000,4000]

N_vis = 5 # vis all testing images
vis_every = 5000

n_lamb_sigma = [16,16,16]
n_lamb_sh = [48,48,48]

fea2denseAct = softplus

view_pe = 2
fea_pe = 2

L1_weight_inital = 8e-5
L1_weight_rest = 4e-5
rm_weight_mask_thre = 1e-4

TV_weight_density = 0.1
TV_weight_app = 0.01

compute_extra_metrics = 1
run_nerf = 0
bias_enable = 1
white_bkgd = 1
sampling_opt = 0

The final training result is: psnr=17.35, test=17.15, mse=0.019, nerf_psnr=16.95, psnr1=17.00

The output image is attached (499999_100_0001_0301).

The result is not good. How should I adjust the parameters to improve it? Any advice?

Thanks a lot!

Qc

jay757425789 commented 1 year ago

In practice, I have encountered the same problem. I have run many experiments with various parameters, but the rendered images are very blurry, especially at the edges. Could you tell us how to improve the quality of the generated images?

BaiBF commented 1 year ago

Thank you very much for your outstanding work. I have the same question: how do you set proper training parameters, especially the bounds (lb, ub)? I have tried several experiments, but still get no good results. Can you provide some suggestions for reference?

eveneveno commented 1 year ago

[quotes chufall's original config and results]

Hi, as seen from your attached image, the network is not learning any meaningful geometry.

As you set the total iterations to 500000, I'd suggest setting add_nerf to at least the middle of the total training schedule, and making sure the grid pre-training stage is already good enough first (instead of setting it to 10000).

Also, given the bounding-box parameters of your scene (lb = [-400,-300,100], ub = [400,300,150]), near_far = [1e-1, 4] is likely to be problematic. Try modifying these parameters :)

eveneveno commented 1 year ago

In practice, I have encountered the same problem. I have run many experiments with various parameters, but the rendered images are very blurry, especially at the edges. Could you tell us how to improve the quality of the generated images?

It is likely to require case-by-case tuning. Could you provide some visual results for reference?

eveneveno commented 1 year ago

Thank you very much for your outstanding work. I have the same question: how do you set proper training parameters, especially the bounds (lb, ub)? I have tried several experiments, but still get no good results. Can you provide some suggestions for reference?

You may inspect the obtained camera pose file, estimate the approximate bounds of the scene by taking the minimum/maximum of the x, y, z coordinates, and add a padding region to bound the scene.
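This bounds-from-poses heuristic can be sketched as follows. `estimate_scene_bounds` is a hypothetical helper written for illustration, not part of the LandMark codebase:

```python
import numpy as np

def estimate_scene_bounds(cam_positions, padding=0.1):
    """Bound the scene by the min/max of the camera positions plus padding.

    cam_positions: (N, 3) camera centers in world coordinates.
    padding: fraction of each axis extent added on both sides.
    """
    pos = np.asarray(cam_positions, dtype=np.float64)
    lo, hi = pos.min(axis=0), pos.max(axis=0)
    pad = (hi - lo) * padding
    return (lo - pad).tolist(), (hi + pad).tolist()

# With the pose statistics chufall posts later in this thread:
lb, ub = estimate_scene_bounds([[-416.37, -264.20, 116.09],
                                [ 356.13,  270.22, 116.46]])
# Note: drone cameras often fly at one altitude, so the z extent of the
# poses is near zero; extend lb[2] down to the ground level by hand.
```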

blackkkk1 commented 1 year ago

Thanks, your reply enlightened me too. A further question: given the bounding-box parameters of the scene (lb = [-400,-300,100], ub = [400,300,150]), what value should near_far be set to? Do you have any suggestions? @eveneveno

eveneveno commented 1 year ago

Thanks, your reply enlightened me too. A further question: given the bounding-box parameters of the scene (lb = [-400,-300,100], ub = [400,300,150]), what value should near_far be set to? Do you have any suggestions? @eveneveno

It helps to know where your cameras are located in the scene. You can then estimate the "far" value from a rough estimate of the distance between the camera and the ground it looks at. Given the specific values you provided, something around 300-500 could be possible, but it all depends on your camera viewing directions. The "near" value can be set arbitrarily small at first, though that may cause unwanted floaters in the air; you can tune it by estimating the distance between the camera and the closest object a ray might encounter, squeezing the scene bounds to eliminate floaters as much as possible. You can always refine the values over several rounds of trials :)
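As a rough sketch of this estimate (a hypothetical helper, assuming an aerial camera looking down at the ground; the numbers come from chufall's setup later in the thread):

```python
def estimate_near_far(cam_altitude, ground_altitude, closest_obj_dist, margin=1.5):
    """Heuristic near/far planes for an aerial camera looking at the ground.

    far  : camera-to-ground drop scaled by a safety margin, since oblique
           rays travel farther than the vertical distance.
    near : distance to the closest object a ray might hit first.
    """
    far = (cam_altitude - ground_altitude) * margin
    near = max(closest_obj_dist, 1e-2)
    return near, far

# Cameras at 125, ground at 45, tallest rooftop at 90:
near, far = estimate_near_far(125, 45, 125 - 90)
# near = 35, far = 120; strongly oblique views may need far in the
# 300-500 range mentioned above.
```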

kam1107 commented 1 year ago

[quotes chufall's original config and results]

Please also verify that your scene is truly enclosed by the lb/ub you provided, e.g. check whether your scene's altitude really lies between 100~150, or whether that is actually the altitude range of your cameras.

chufall commented 1 year ago

@kam1107 @eveneveno Hi, thank you very much for your advice!

  1. I took all the pictures with a DJI drone; the camera altitude is 125, the ground is 45, and the highest building is 90. So should I set near/far to [30, 90]? Is that reasonable?

  2. I set lb and ub following the discussion in https://github.com/InternLandMark/LandMark/issues/3, since the statistics of my images (from the xml exported by ContextCapture) are the following:

    train poses bds tensor([-416.3679, -264.1993,  116.0867]) tensor([356.1255, 270.2213, 116.4618])
    test poses bds tensor([-339.3041, -138.9054,  116.3005]) tensor([124.7532, 172.5519, 116.3907])

    So I set lb = [-400,-300,100] and ub = [400,300,150]. Is that reasonable?

By the way, I have run the training twice:

  1. The first run used 50000 iterations, the same as city.txt, and the max psnr was 16.53.
  2. The second run used 500000 iterations, but the psnr did not increase much.

So I think 50000 iterations is enough, and the problem with my training is the parameter settings.

Thanks a lot ! I'll try !

Qc

kam1107 commented 1 year ago

In that case, you can consider changing your lb to [-450, -350, 30] and ub to [400, 320, 130] as a starting point. The scene bounding box should enclose the camera motion range; e.g. in your case the cameras' range along the x-axis is (-416, 357), so we can set the scene range to roughly [-450, 400].

The near/far planes do not matter much in oblique-photography cases, since we intersect each ray with the bounding box and sample within the scene. So you can set near to a relatively small value, say 1.0, and far to a larger value, like 1e3.

chufall commented 1 year ago

@kam1107 Thank you very much !

Qc

chufall commented 1 year ago

@kam1107 @eveneveno

Hi, I have finished training with the following changed parameters: lb=[-450,-350,30], ub=[400,320,130], train_near_far=[1,1e3], render_near_far=[1,1e3], iterations = 10000/50000. The final psnr reached 17.64 and the test psnr reached 17.45, which exceeds the first run.

I'll try the following to further improve the result:

  1. change the iterations to 10000/110000 to check whether the psnr increases.
  2. change the grid branch number.

Do you have any further advice?

Thank you very much!

Qc

dhgras commented 1 year ago

[quotes chufall's original post and eveneveno's reply]

Hello, I am experiencing a similar problem, so I'm wondering what the exact reason is for not learning meaningful geometry. Is it just that near_far is not set correctly?

dhgras commented 1 year ago

[quotes chufall's follow-up results and plans]

Like you, I modified the parameters according to these suggestions and tried them on several different datasets, but still couldn't get results comparable to the paper and demo videos. I'm looking at the code and other potentially important training parameters, but I don't have any ideas for improving the training results.

blackkkk1 commented 1 year ago

[quotes chufall's original post and eveneveno's reply]

Hello, I followed your suggestion during training, but found that with n_iters set to 110000, add_nerf = 10000 works better than add_nerf = 20000, rather than setting add_nerf to half the number of iterations as mentioned before. Do you have any suggestions on this?

kam1107 commented 1 year ago

[quotes chufall's original post, eveneveno's reply, and dhgras's question]

In our experience, parameters and configurations need to be adjusted per scene. Apart from the scene bounds and near/far planes, you might also want to consider the upsampling strategy. For example, if the target scene is relatively large (say >1000 images at 1024x768 resolution), the example setting batch_size = 81924, upsamp_list = [2000,3000,4000,5500,7000], update_AlphaMask_list = [2000,4000] might not be suitable: at most 81924x2000 pixels (rays) are trained before the first upsampling, whereas the whole dataset has at least 1024x768x1000 pixels, so upsampling at such an early stage might cause problems. You might want to postpone the entire upsampling schedule; or, as a starting point, turn off upsampling to debug (see whether the model can learn meaningful geometry at a low-resolution setting).
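The ray-budget argument above is simple arithmetic, sketched here with the numbers from this example:

```python
batch_size = 81924            # rays per iteration
first_upsample_iter = 2000    # first entry of upsamp_list
n_images, width, height = 1000, 1024, 768

rays_trained = batch_size * first_upsample_iter   # rays seen before the first upsampling
dataset_pixels = n_images * width * height        # total pixels in the dataset

coverage = rays_trained / dataset_pixels
# coverage is about 0.21 here: the grid is upsampled before even a
# quarter of the pixels have (on average) been sampled once, so the
# schedule should likely be postponed for a dataset this large.
```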

jay757425789 commented 1 year ago

[quotes chufall's follow-up results and questions]

Hello, may I ask how to load the xml file?

eveneveno commented 1 year ago

Hello, may I ask how to load the xml file?

You can refer to https://drive.google.com/file/d/1Hilc93M2G0rLatz093REilrT6Rub8a5c/view to see how to parse the .xml file and convert it into json format.

jay757425789 commented 1 year ago

Hello, may I ask how to load the xml file?

You can refer to https://drive.google.com/file/d/1Hilc93M2G0rLatz093REilrT6Rub8a5c/view see how to parse .xml file and turn them into json format.

Thanks, but the json file I generated cannot be directly used for training; many keys are missing. How did you solve this?

eveneveno commented 1 year ago

Hello, may I ask how to load the xml file?

You can refer to https://drive.google.com/file/d/1Hilc93M2G0rLatz093REilrT6Rub8a5c/view see how to parse .xml file and turn them into json format.

Thanks, but the json file I generated cannot be directly used for training, and many keys are missing. How do you solve it?

Which keys are missing? The XML files exported by CC can differ a bit depending on your selected settings. I'd suggest modifying the provided code snippet and debugging a bit to support your file format.

dhgras commented 1 year ago

[quotes dhgras's question and kam1107's reply]

Thank you for your response! What confuses me is why, under this setting, the number of trained pixels is 81924x2000. In my understanding, isn't "2000" the iteration at which upsampling happens? Or is there another parameter that represents the upsampling factor?

eveneveno commented 1 year ago

[quotes dhgras's question above]

Yes, 81924x2000 counts the number of pixels (rays) trained before the first upsampling happens (at iteration 2000).

jay757425789 commented 1 year ago

In my experiments, I'm having severe image blur issues; the generated images are shown in the attachments. Meanwhile, the image quality and PSNR hardly change with the training iterations. The experimental setup is as follows: bbox: (tensor([-3.4247, -2.1477, -0.7065], device='cuda:0'), tensor([3.9726, 4.5638, 5.0000], device='cuda:0')), alpha rest %0.330998

dataroot = /opt/data/private/data/nerfdata/
datadir = moni
dataset_name = city
expname = moni
subfolder = [west]
ndim = 1

lb = [-0.2, -0.8, -0.4]
ub = [0.2, 0.8, 0.1]

add_nerf = 20000

basedir = ./log

train_near_far = [1e-1, 4]
render_near_far = [2e-1, 4]
downsample_train = 1

n_iters = 50000
batch_size = 8192
render_batch_size = 8192

N_voxel_init = 2097156 # 128**3
N_voxel_final = 1073741824 # 1024**3

upsamp_list = [20000,30000,40000,55000,70000]
update_AlphaMask_list = [2000,4000]

N_vis = 5 # vis all testing images
vis_every = 2500

n_lamb_sigma = [16,16,16]
n_lamb_sh = [48,48,48]

fea2denseAct = softplus

view_pe = 2
fea_pe = 2

L1_weight_inital = 8e-5
rm_weight_mask_thre = 1e-4

TV_weight_density = 0.1
TV_weight_app = 0.01

compute_extra_metrics = 1
run_nerf = 0
bias_enable = 1
white_bkgd = 1
wandb = False
add_timestamp = 1

The size of my images is 640x512. Do you have any suggestions on how to improve the generation quality?

eveneveno commented 1 year ago

Regarding ", 5.0000" (the z upper bound of your bbox):

Given that the bbox extends beyond 5 along the z-axis, I'd guess that a far plane of 4 is not enough to cover the ground region when viewed from the highest position. As a result, the learned depth maps cannot reveal any geometry on the ground. Could you tune that parameter a bit and see if it helps?
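The arithmetic behind this guess, using the bbox values from the posted log (a sketch, assuming the highest viewpoint sits near the bbox z maximum):

```python
import math

cam_z_max, ground_z = 5.0, -0.7065   # bbox z bounds from the posted log
nadir_far = cam_z_max - ground_z     # straight-down ray to the ground
# an oblique ray spanning the bbox y extent travels even farther:
oblique_far = math.hypot(4.5638 - (-2.1477), nadir_far)

# nadir_far ≈ 5.71 already exceeds far = 4, so rays are cut off before
# reaching the ground; oblique rays need even more (≈ 8.8 here).
```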

jay757425789 commented 1 year ago

[quotes eveneveno's reply above]

Thanks for your suggestion! However, the generated images are still blurred after changing the far plane to 10.

Do you have any other suggestions?

Demonss3 commented 1 year ago

@jay757425789 have you solved it? I'm running into the same problem!


Crush1111 commented 10 months ago

@jay757425789 have you solved it? I'm running into the same problem!

IndexError: index 0 is out of bounds for dimension 0 with size 0

Traceback (most recent call last):
  File "/home/songjiali/LandMark-1/app/trainer.py", line 664, in <module>
    train(init_args)
  File "/home/songjiali/LandMark-1/app/trainer.py", line 438, in train
    psnrs_test = evaluation(
  File "/home/songjiali/anaconda3/envs/landmark1/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/songjiali/LandMark-1/app/tools/render_utils.py", line 140, in evaluation
    allret, _ = renderer(
  File "/home/songjiali/LandMark-1/app/tools/render_utils.py", line 52, in renderer_fn
    ret, extra_loss = gridnerf(
  File "/home/songjiali/anaconda3/envs/landmark1/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/songjiali/LandMark-1/app/models/gridnerf_parallel.py", line 1119, in forward
    (mask[self.args.part].sum(), sum(self.app_n_comp)),
IndexError: index 1 is out of bounds for dimension 0 with size 0

I get this error at the final output when training reaches the add-nerf stage. Do you know what causes it and how to fix it? Thanks.


