Open JiuTongBro opened 8 months ago
We've tested our method on Carla. It can work, but the performance is not as satisfying as GRAM's at 128^2. This may be because the artifacts caused by the limited number of manifold surfaces are hard to suppress at higher resolutions and under 360-degree viewpoints.
Thanks for your reply!
I modified a GRAM-64 config and a GRAMHD-128 config myself based on the GRAM code, and tried to train GRAMHD for Carla. However, it fails even in the coarse GRAM-64 stage.
These are the coarse GRAM results after 100,000 iterations of training.
I suppose I may have made some mistakes in my modified code? I didn't change anything in your version's GRAM code. I wonder, is your version's GRAM code the same as the original GRAM code? Is there anything else I need to modify to make the code run successfully on the Carla dataset?
This is the modified GRAM-64 config:
import math  # needed for math.pi in the pose settings below

GRAM64_Carla = {
'global': {
'img_size': 64,
'batch_size': 4,
'z_dist': 'gaussian',
},
'optimizer': {
'gen_lr': 2e-5,
'disc_lr': 2e-4,
'sampling_network_lr': 2e-6,
'betas': (0, 0.9),
'grad_clip': 0.3,
},
'process': {
'class': 'Gan3DProcess',
'kwargs': {
'batch_split': 4,
'real_pos_lambda': 15.,
'r1_lambda': 1.,
'pos_lambda': 15.,
}
},
'generator': {
'class': 'GramGenerator',
'kwargs': {
'z_dim': 256,
'img_size': 64,
'h_stddev': math.pi,
'v_stddev': math.pi*(42.5/180),
'h_mean': math.pi*0.5,
'v_mean': math.pi*(42.5/180),
'sample_dist': 'spherical_uniform',
},
'representation': {
'class': 'gram',
'kwargs': {
'hidden_dim': 256,
'normalize': 2,
'sigma_clamp_mode': 'softplus',
'rgb_clamp_mode': 'widen_sigmoid',
'hidden_dim_sample': 256,
'layer_num_sample': 3,
'center': (0, 0, 0),
'init_radius': 0,
},
},
'renderer': {
'class': 'manifold_renderer',
'kwargs': {
'num_samples': 64,
'num_manifolds': 48,
'levels_start': 35,
'levels_end': 5,
'delta_alpha': 0.02,
'last_back': False,
'white_back': True,
}
}
},
'discriminator': {
'class': 'GramEncoderDiscriminator',
'kwargs': {
'img_size': 64,
}
},
'dataset': {
'class': 'CARLA',
'kwargs': {
'img_size': 64,
'real_pose': True,
}
},
'camera': {
'fov': 30,
'ray_start': 0.7,
'ray_end': 1.3,
}
}
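As a sanity check, here is the pose range I believe this config implies for Carla (my own arithmetic from the values above, assuming the pi-GAN-style 'spherical_uniform' sampling draws poses within mean +/- stddev; not taken from the GRAMHD code):

import math

# Values copied from the config above.
h_mean, h_stddev = math.pi * 0.5, math.pi
v_mean, v_stddev = math.pi * (42.5 / 180), math.pi * (42.5 / 180)

# Azimuth spans roughly -90 to 270 degrees, i.e. the full 360-degree range,
# and elevation spans roughly 0 to 85 degrees.
print(math.degrees(h_mean - h_stddev), math.degrees(h_mean + h_stddev))  # ~ -90, 270
print(math.degrees(v_mean - v_stddev), math.degrees(v_mean + v_stddev))  # ~ 0, 85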
And this is the modified GRAMHD-128 config:
import math
from torch import nn  # needed for math.pi and nn.Identity below

GRAMHD128_Carla = {
'global': {
'img_size': 128,
'batch_size': 4,
'z_dist': 'gaussian',
},
'optimizer': {
'gen_lr': 2e-5,
'disc_lr': 2e-4,
'sampling_network_lr': 2e-6,
'betas': (0, 0.9),
'grad_clip': 0.3,
},
'process': {
'class': 'SRGan3DProcess',
'kwargs': {
'batch_split': 4,
'pos_lambda': 15.,
'real_pos_lambda': 15.,
'r1_lambda': 1.,
'cons_lambda': 3.,
'use_patch_d': True,
'patch_lambda': 0.1,
'r1_patch': True,
}
},
'generator': {
'class': 'GramHDGenerator',
'kwargs': {
'z_dim': 256,
'feature_dim': 32,
'img_size': 128,
'lr_img_size': 64,
'h_stddev': math.pi,
'v_stddev': math.pi*(42.5/180),
'h_mean': math.pi*0.5,
'v_mean': math.pi*(42.5/180),
'sample_dist': 'spherical_uniform',
'gram_model_file': 'out/carla_gram/step100000_generator.pth', # If you want to train your own model, set this to the stage1 GRAM model file
},
'representation': {
'class': 'gram',
'kwargs': {
'hidden_dim': 256,
'normalize': 2,
'sigma_clamp_mode': 'softplus',
'rgb_clamp_mode': 'widen_sigmoid',
'hidden_dim_sample': 256,
'layer_num_sample': 3,
'center': (0, 0, 0),
'init_radius': 0,
},
},
'super_resolution': {
'class': 'styleesrgan',
'kwargs': {
'fg': {
'w_dim': 256,
'nf': 64,
'nb': 8,
'gc': 32,
'up_channels': [64,],
'to_rgb_ks': 1,
},
'bg': {
'nf': 64,
'nb': 4,
'gc': 32,
'up_channels': [64,],
'use_pixel_shuffle': False,
'global_residual': True
},
}
},
'renderer': {
'class': 'manifold_sr_renderer',
'kwargs': {
'num_samples': 64,
'num_manifolds': 48,
'levels_start': 35,
'levels_end': 5,
'delta_alpha': 0.02,
'last_back': False,
'white_back': True,
}
}
},
'discriminator': {
'class': 'GramEncoderPatchDiscriminator',
'kwargs': {
'img_size': 128,
'norm_layer': nn.Identity,
}
},
'dataset': {
'class': 'CARLA',
'kwargs': {
'img_size': 128,
'real_pose': True,
}
},
'camera': {
'fov': 30,
'ray_start': 0.7,
'ray_end': 1.3,
}
}
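For reference, the resolutions in this config should give a single 2x super-resolution stage, if I understand it correctly (assuming the scale factor is simply img_size / lr_img_size and that each entry of 'up_channels' corresponds to one 2x upsampling stage; this is my reading, not something I verified in the code):

# Quick arithmetic on the resolutions above.
img_size, lr_img_size = 128, 64
scale_factor = img_size // lr_img_size
print(scale_factor)  # 2 -> one 2x upsampling stage, matching the single entry in 'up_channels'
# If the low-res GRAM were instead a 128^2 model with the target kept at 128^2,
# the scale factor would be 1, i.e. no resolution gap between the two branches.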
Would you kindly help me figure out the error? Thanks!
I checked my implementation and found that I did miss some configs used for Carla training. The sampling_network_lr argument does not actually change the learning rate for the sampling network.
I think one option is to modify the code according to the GRAM code for this part. Another option is to directly use the 128^2 Carla checkpoint from GRAM (my experiment on Carla directly used GRAM's checkpoint, which is why I didn't notice this issue).
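For the first option, the modification could look roughly like this (a sketch only; the actual module/parameter names for the sampling network in the code may differ):

import torch

def build_generator_optimizer(generator, opt_cfg, sampling_key='sample_network'):
    """Give the sampling network its own learning rate via parameter groups.

    `sampling_key` is a guess at how the sampling-network parameters are named;
    adjust it to match the actual module names in the GRAM generator."""
    sampling_params = [p for n, p in generator.named_parameters() if sampling_key in n]
    other_params = [p for n, p in generator.named_parameters() if sampling_key not in n]
    return torch.optim.Adam(
        [
            {'params': other_params, 'lr': opt_cfg['gen_lr']},                  # e.g. 2e-5
            {'params': sampling_params, 'lr': opt_cfg['sampling_network_lr']},  # e.g. 2e-6
        ],
        betas=opt_cfg['betas'],
    )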
Sincere thanks! I will try it.
Hi. Thanks for your suggestion.
I followed this pipeline to directly train a GRAMHD model based on the official GRAM-Carla-128 checkpoint.
However, the low-resolution generated images, which are produced by the low-resolution GRAM, seem to be worse than the inference results produced by the official GRAM code. I checked the low-res results generated at different epochs; they are all identical, so the low-res GRAM is indeed frozen during the HD training.
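To make sure the gap is not coming from the LR GRAM weights themselves, I am planning a check roughly like the following (hypothetical wrappers around the two renderers, not the actual GRAM/GRAMHD API): render the same latent code and pose with the frozen LR branch and with the official GRAM checkpoint, then compare pixels.

import torch

@torch.no_grad()
def compare_lr_branch(render_lr_from_gramhd, render_official_gram, z, pose):
    """Both arguments are hypothetical callables that take the same latent code `z`
    and camera `pose` and return an image tensor in the same value range."""
    img_a = render_lr_from_gramhd(z, pose)
    img_b = render_official_gram(z, pose)
    diff = (img_a - img_b).abs()
    print(f'max |diff| = {diff.max().item():.4f}, mean |diff| = {diff.mean().item():.4f}')
    # If this is ~0, the LR weights match, and the visual gap must come from how the
    # LR images are saved/visualized during HD training (e.g. value range or resizing).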
The only change I made was to remove the downsampling in the cross-resolution consistency loss, since the HR and LR images both have a resolution of 128:
if generator_ddp.module.scale_factor == 1.:
    # HR and LR outputs are both 128^2 here, so compare them directly.
    cons_penalty = self.cons_lambda * ((gen_imgs - lr_imgs)**2).mean()
    cons_penalty += self.cons_lambda * ((sr_rgba - lr_rgba)**2).mean()
else:
    # Original behavior: downsample the HR outputs to the LR resolution first.
    cons_penalty = self.cons_lambda * ((bicubic_downsample(gen_imgs, generator_ddp.module.scale_factor) - lr_imgs)**2).mean()
    cons_penalty += self.cons_lambda * ((bicubic_downsample(sr_rgba, generator_ddp.module.scale_factor) - lr_rgba)**2).mean()
I wonder, do you know why this happens? And did you notice the same issue in your Carla training?
Thanks!😀
Hi. Thanks for your excellent work.
I wonder, does GRAMHD support 360-degree object data like Carla/ShapeNet? I noticed GRAM achieved satisfying results on Carla, so I am quite interested in what the results would be with GRAMHD. Have you tested on those datasets?😀