Can you give me a contact method, WeChat or email? Thank you very much.
Hi! Glad to hear from you.
Regarding the `focal` problem: we provide a script that converts a COLMAP database to JSON-format poses and supports extracting the common intrinsic parameters of different camera models. This means that in the JSON file, `focal` is split into two parameters, `fl_x` and `fl_y`. The `load_json_drone_data()` method is suitable for simple pinhole camera models that use a single focal parameter for the input data.
The easiest way to solve this problem is to change `focal = meta["focal"]` in `load_json_drone_data()` to `focal = meta["fl_x"]`, because `fl_x == fl_y` holds for most camera models.
Do not forget to adjust the factor `10 / image_scale` according to the downsampling ratio of your current images.
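A minimal sketch of the change (the surrounding context is inferred from the traceback at the end of this thread, not copied verbatim from the repo):

```python
# app/tools/dataloader/ray_utils.py, inside load_json_drone_data() -- sketch
# Before (raises KeyError for JSON files produced by the COLMAP script):
#   focal = meta["focal"] * (10 / image_scale)
# After (per-axis focal; fl_x == fl_y for most camera models):
focal = meta["fl_x"] * (10 / image_scale)
# The 10 assumes the images (and their focal) are pre-downsampled by 10;
# adjust it to the actual downsampling ratio of your images.
```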
Regarding the image rendering problem: such image mutilation is usually due to a bounding box that is set too small. The bounding box used during training can be controlled by adjusting the `--ub` and `--lb` parameters.
To help us resolve the issue together, you can attach the `.txt` configuration file you are using.
Thanks, here is my `.txt`:

```
dataroot = /opt/data/private/data/nerfdata/rgb
datadir = images
dataset_name = city
expname = west_x5_tiny
subfolder = [west]
ndim = 1
lb = [-2.4,-2.4,-0.05]
ub = [2.4,2.4,0.55]
add_nerf = 500
basedir = ./log
train_near_far = [1e-1, 4]
render_near_far = [2e-1, 4]
downsample_train = 1
n_iters = 50000
batch_size = 8192
render_batch_size = 16384
N_voxel_init = 2097156 # 128**3
N_voxel_final = 1073741824 # 1024**3
upsamp_list = [2000,3000,4000,5500,7000]
update_AlphaMask_list = [2000,4000]
N_vis = 5 # vis all testing images
vis_every = 500
n_lamb_sigma = [16,16,16]
n_lamb_sh = [48,48,48]
fea2denseAct = softplus
view_pe = 2
fea_pe = 2
L1_weight_inital = 8e-5
rm_weight_mask_thre = 1e-4
TV_weight_density = 0.1
TV_weight_app = 0.01
compute_extra_metrics = 1
run_nerf = 0
bias_enable = 1
white_bkgd = 1
```
Hi!
The default `--ub` and `--lb` values provided here are usually too small; you can consider increasing them by 5~10 times in further experiments.
The specific values can be adjusted according to the range of the camera poses calculated after loading the data.
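As a rough illustration, a sketch (not code from the repo) of deriving `--lb`/`--ub` from loaded poses, assuming the dataloader yields an `(N, 4, 4)` array of camera-to-world matrices:

```python
import numpy as np

def pose_bounds(poses: np.ndarray, scale: float = 5.0):
    """Suggest --lb/--ub from camera poses (sketch, assumed pose layout).

    poses: (N, 4, 4) camera-to-world matrices; scale enlarges the raw
    pose range 5~10x, per the advice above.
    """
    cam_centers = poses[:, :3, 3]                  # camera positions in world space
    lo, hi = cam_centers.min(0), cam_centers.max(0)
    center, half = (lo + hi) / 2, (hi - lo) / 2
    return (center - half * scale).tolist(), (center + half * scale).tolist()
```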
Thanks, should all values in `lb` and `ub` be increased by 5-10 times, e.g. from `lb = [-2.4,-2.4,-0.05] ub = [2.4,2.4,0.55]` to `lb = [-24,-24,-0.5] ub = [24,24,5.5]`?
Yes, that's what I mean.
The specific values can be adjusted according to the pose bounds. Here is an example.
My pose information:

```
train poses bds: tensor([-2.9262, -3.6487, -2.8328]) tensor([3.1265, 2.8131, 2.8410])
test poses bds:  tensor([-0.4393, -3.3211, -1.2280]) tensor([-0.2740, -2.8605, -1.2055])
```

So, the `--lb` and `--ub` should be set as `lb = [-2.9262, -3.6487, -2.8328]` and `ub = [3.1265, 2.8131, 2.8410]`. Is that correct?
In fact, the recommended size of the bounding box needs to be several times larger than the pose bounds. The exact same size does not work well in experiments. 🤔
I found that no .mp4 is generated. How do I generate it?
Hi guys, I still haven't figured out how to set `--lb` and `--ub` specifically according to the pose bounds. If the pose bounds are as follows, is it reasonable for me to set `lb = [22.02,7.50,-5140.63]` and `ub = [571.05,209.79,-203.64]`?
Hi! Please open a new issue thread for new problems.
I have a question! I think the 10 in `focal = meta["focal"] * (10 / image_scale)` should be 1.0. Shouldn't the focal length and the image size change in the same proportion?
That's correct! The focal length should always be re-scaled with the image size. Here we assume that the dataset (including the focal) is already downsampled by 10 by default; the `downsample_train` in our default configs is also `10` to ensure consistency. We will release a better dataloader soon to avoid the ambiguity.
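A small worked example of that convention (hypothetical numbers, just to make the scaling explicit):

```python
focal_full = 8000.0                      # full-resolution focal from COLMAP (hypothetical)
focal_meta = focal_full / 10             # the JSON stores the 10x-downsampled focal: 800.0

image_scale = 10                         # downsample_train in the default configs
focal = focal_meta * (10 / image_scale)  # factor is 1.0: no further rescaling needed
assert focal == focal_full / image_scale # the focal always tracks the image size
# With image_scale = 20 the factor would be 0.5, halving the focal
# to match images downsampled twice as much.
```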
Awesome, thank you for your response. Regarding image rendering quality, I've been working with my own aerial dataset, and I've noticed that even after adjusting parameters such as "lb" and "ub," as well as "near" and "far," I'm able to achieve a PSNR of 20. However, the image quality hasn't improved with the increase in PSNR and still remains quite blurry. Do you have any suggestions for improvement? I'm looking forward to your response.
Did this thread help with your question?
https://github.com/InternLandMark/LandMark/issues/6
Also, I recently realized that a lot of aerial data is likely to include multiple cameras, especially ones with very different focal lengths. This is not currently supported by the `city` dataloader.
I've already gone through all the responses in #6, and while the PSNR has improved, the quality of the rendered images is still quite blurry. I believe my dataset should be highly consistent with yours, since I used a low-altitude drone for aerial photography with a single camera throughout. After multiple rounds of parameter adjustments, I've only seen an increase in PSNR. What do you think could be the reasons behind this?
According to the depth map, the grid branch did not learn the correct geometry. Can you share the config file? 🤔
Yeah, of course! transforms_train.json config.txt
Based on this config file, I have several suggestions that might work (a sketch of the changes follows below):
1. Using `relu` rather than `softplus` in `fea2denseAct` is more suitable for this kind of dataset.
2. Make the scene box enclose only the buildings on the ground as much as possible, especially along the z-axis. Its extent can be calculated from the camera poses and the flight altitude.
3. Increase the value of `N_voxel` and add `resMode=[1,2,4]` to use a multi-resolution grid.
The `add_nerf` setting can be commented out until you have verified that the grid branch learns the geometric features correctly. Other settings can also refer to the config for the MatrixCity dataset. Hopefully these methods will help, as different cases always have different optimal settings.
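A sketch of the corresponding config changes (illustrative values, not tuned for this scene; derive the z-range from your own poses and flight altitude):

```
fea2denseAct = relu            # 1. relu instead of softplus
lb = [-2.4,-2.4,-125]          # 2. tighten the box around the buildings,
ub = [2.4,2.4,-70]             #    especially the z-axis (placeholder values)
N_voxel_init = 16777216        # 3. larger grid: 256**3 ...
N_voxel_final = 1073741824     #    ... up to 1024**3
resMode = [1,2,4]              #    multi-resolution grid
# add_nerf = 500               # keep commented until the grid geometry looks right
```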
Okay, I will give it a try. Regarding the second suggestion, I would like to ask how the extent is calculated. If my drone is flying at a height of 120 m above the ground, based on the pose file above, should I set the range of the z-axis to [-120, 1]?
In fact, it depends on the coordinate system and the scaling of the poses. 🤔 For example, if the z-axis range of the camera poses is distributed around 0 and the scale of the space is consistent with reality, that is, the ground is distributed around -120, then [-125, 50] is appropriate, including a downward reservation of 5 meters and a building height of less than 50 meters.
The SH2000 coordinate system is usually used in the cases we showed, in which case the ground is always distributed around 0. Hope this helps you.
Thank you very much for your suggestion. Since my drone overlooks all the buildings in the school, I think [-125, 5] might be enough. Is this correct? I also have a question: if there is a scaling factor, does it have any specific impact on my range selection? If the scaling factor is 5, should I set it to [-25, 1]? Looking forward to your reply.
If there is a scaling factor for the camera poses, then the scene will be scaled equally. This means that the scene box also needs to be scaled proportionally, so [-25, 1] is suitable.
This is the result after I changed it to [-125,15], and there has been a significant improvement compared to before. Then I observed the depth map and felt that it might be inaccurate. What do you think about the accuracy of this depth map?
The image quality looks great! For the depth map, I think it is possible that the depth contrast under this view is relatively low. Views that are not perpendicular to the ground will generally have better depth maps, and you can refer to those images. 🤔
Also, I realized I had introduced a bug earlier, lol. The upper boundary of the scene box should be comparable to the ground plus the building height, so the suitable z-axis range would be [-125, -70] to avoid overfitting into the air.
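For concreteness, a small sketch of that z-axis arithmetic (the 120 m altitude, 5 m reservation, and 50 m building height come from the discussion above; the scale factor is hypothetical):

```python
flight_altitude = 120.0   # camera poses sit around z = 0, ground ~120 m below
reservation = 5.0         # small margin below the ground
building_height = 50.0    # tallest buildings in the scene
scale = 1.0               # pose scaling factor, if any (5 would shrink everything 5x)

ground_z = -flight_altitude
z_min = (ground_z - reservation) / scale       # -125.0
z_max = (ground_z + building_height) / scale   # -70.0: ground + building height,
                                               # so the box stops short of the air
```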
lol. [-125,75] should be more reasonable, thank you very much for your answer. But currently, if I uncomment `add_nerf`, the result is very blurry. What is causing this problem?
I find that the generated JSON has no 'focal' key:

```
File "app/trainer.py", line 43, in train
    train_dataset, test_dataset = prep_dataset(enable_lpips, args)
File "/opt/data/private/code/LandMark/app/tools/train_utils.py", line 153, in prep_dataset
    train_dataset = dataset(
File "/opt/data/private/code/LandMark/app/tools/dataloader/city_dataset.py", line 30, in __init__
    self.read_meta()
File "/opt/data/private/code/LandMark/app/tools/dataloader/city_dataset.py", line 36, in read_meta
    meta = load_json_drone_data(
File "/opt/data/private/code/LandMark/app/tools/dataloader/ray_utils.py", line 114, in load_json_drone_data
    focal = meta["focal"] * (10 / image_scale)
KeyError: 'focal'
```