pclucas14 / lidar_generation

Code for "Deep Generative Models for LiDAR Data"
79 stars · 20 forks

Unable to load pretrained GAN model #15

Closed aldipiroli closed 3 years ago

aldipiroli commented 3 years ago

Hello Lucas, and thank you for your awesome work! I was trying to load your /trained_models/uncond_gan model; however, I get the following error:

RuntimeError: Error(s) in loading state_dict for netG: Unexpected key(s) in state_dict: "main.0.weight_orig", "main.0.weight_u", "main.3.weight_orig", "main.3.weight_u", "main.6.weight_orig", "main.6.weight_u", "main.9.weight_orig", "main.9.weight_u", "main.12.weight_orig", "main.12.weight_u".

Could it be that the netG model in models.py is different from the one you saved during training? I have also tried with the suggested torch version.

pclucas14 commented 3 years ago

Can you try `gen.load_state_dict(torch.load('path_to_gen_weights.pth'), strict=False)`?
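For context, the `weight_orig` / `weight_u` keys in the error are what `torch.nn.utils.spectral_norm` adds to a module's state dict, so the checkpoint was most likely saved from a spectrally normalized generator. A minimal sketch reproducing the mismatch with a toy model (layer sizes are illustrative, not the repo's netG):

```python
import torch
import torch.nn as nn

# Toy stand-in for netG; the sizes here are assumptions for illustration.
def make_gen():
    return nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Conv2d(8, 3, 3))

trained = make_gen()
# spectral_norm adds "weight_orig" / "weight_u" entries to the state dict,
# matching the unexpected keys reported in the error above.
trained[0] = nn.utils.spectral_norm(trained[0])
state = trained.state_dict()

fresh = make_gen()  # plain model without the spectral-norm wrapper
# strict=False skips keys the fresh model does not know about
result = fresh.load_state_dict(state, strict=False)
print(result.unexpected_keys)  # includes '0.weight_orig', '0.weight_u'

# Alternative: wrap the fresh layers with spectral_norm first so the
# key names line up and a strict load succeeds.
matched = make_gen()
matched[0] = nn.utils.spectral_norm(matched[0])
matched.load_state_dict(state)  # strict load, no error
```

Note that `strict=False` silently skips every mismatched key, so it is worth inspecting the returned `missing_keys` and `unexpected_keys` to confirm the actual weights were loaded.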

aldipiroli commented 3 years ago

That did the trick, thank you!

aldipiroli commented 3 years ago

I had a couple more questions regarding your approach to unconditional generation (not sure if it is OK to ask them in this thread; if not, feel free to delete it).

In your paper you talked about using [x, y, z] (Cartesian) input channels and said that this leads to better results than [range, z] (polar). However, I noticed that in your training parameters for the GAN model you used polar coordinates instead of Cartesian. Did you have any particular reason for this choice, or, as you suggested in the paper, is the difference only appreciable when the input is noisy?

Regarding the discriminator: I noticed that you transform the rectangular input image into a square image in a single step (with a Conv2d of kernel size (1, 16)). The same goes for the generator when upsampling. Did you have any particular reason for doing this instead of, for example, using multiple convolution steps (e.g. (1, 4) -> (1, 4), ...)?

Regarding your remove_zeros function, which uses max pooling to remove zeros from the input image: did you notice any improvement from this preprocessing compared to not using it?

Finally, I also noticed that you used the RMSprop optimizer (again, for the GAN training). Did you find it to perform better than Adam?

Best,

pclucas14 commented 3 years ago

From what I recall, the polar representation leads to "smoother" LiDAR scans (a purely qualitative observation). So for the GAN experiments, since I just wanted the samples to look nice, I went with polar.
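For readers comparing the two parameterizations, here is a minimal sketch converting a [range, z] polar image into [x, y, z] Cartesian channels. The function name and the uniform azimuth grid are assumptions for illustration, not the repository's exact preprocessing:

```python
import numpy as np

def polar_to_cartesian(range_img, z_img, h_fov=(-np.pi, np.pi)):
    """Convert a [range, z] polar image (each H x W) to [x, y, z] channels.

    Assumes columns sweep the azimuth uniformly across h_fov; this mirrors
    the idea in the paper, not the repo's actual code."""
    H, W = range_img.shape
    azimuth = np.linspace(h_fov[0], h_fov[1], W)  # one angle per column
    # horizontal distance from the sensor, given the recorded height z
    d = np.sqrt(np.maximum(range_img ** 2 - z_img ** 2, 0.0))
    x = d * np.cos(azimuth)[None, :]
    y = d * np.sin(azimuth)[None, :]
    return np.stack([x, y, z_img], axis=0)  # shape (3, H, W)
```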

Regarding the architecture, I think it's the first thing I tried (I modified the DCGAN pytorch code to work with my current dataset). If I were to redo the paper today I would probably do something smoother.

The preprocessing with max pooling was to "fill" slots that had no points, whether they were not recorded by the scanner or missing for other reasons. I don't recall exactly which preprocessing step was the most useful, but overall, strong preprocessing had the biggest impact on improving the quality of the generations.
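The fill-by-max-pooling idea can be sketched as follows; this is a hedged illustration of the concept, not the repository's actual remove_zeros implementation, and the function name and kernel size are assumptions:

```python
import torch
import torch.nn.functional as F

def remove_zeros_sketch(img, k=3):
    """Fill empty (zero) slots with the max of their k x k neighborhood.

    A sketch of the idea only: max pooling with stride 1 and same-padding
    gives each pixel its local maximum, and torch.where keeps the original
    value wherever a point was actually recorded."""
    filled = F.max_pool2d(img, kernel_size=k, stride=1, padding=k // 2)
    return torch.where(img == 0, filled, img)
```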

I could not tell you why I used RMSprop, sorry.

aldipiroli commented 3 years ago

Thanks a lot for your answers!