autonomousvision / giraffe

This repository contains the code for the CVPR 2021 paper "GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields"
https://m-niemeyer.github.io/project-pages/giraffe/index.html
MIT License
1.23k stars, 160 forks

Some questions on model training #2

Closed: ralphthehacker closed this issue 3 years ago

ralphthehacker commented 3 years ago

Hey, awesome work!

I had a few questions regarding training:

  1. What hardware did you use to train the models in the paper? And, given that hardware spec, what was the total size of the model during training?

  2. Currently, GIRAFFE operates at a maximum resolution of 256x256. What would you say are the main bottlenecks that make training at higher resolutions challenging?

m-niemeyer commented 3 years ago

Hi @ralphthehacker, thanks a lot for your interest in our project!

  1. We trained on a single GPU, mainly NVIDIA GTX 1080 Ti and V100 cards. In the default setup, GPU memory usage should stay below roughly 11 GB at 64x64 resolution and below roughly 16 GB at 256x256 resolution. The only exception is training with more objects (e.g. on the CLEVR datasets).
  2. We didn't investigate this, so I can't really say. One bottleneck is the data: people mostly use face datasets for higher resolutions, but that was not the main focus of this work. It could also be promising to increase the resolution of the volume-rendered feature image, not only that of the final output image, but the GPU requirements would then increase.
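
As a rough illustration of why a larger volume-rendered feature image (or more objects) costs more GPU memory, here is a back-of-the-envelope sketch. It is not code from the repository. It assumes only that volume rendering queries each feature field at a fixed number of sample points along one ray per feature-image pixel. The function name `field_evaluations` and the specific values (16x16 feature image, 64 samples per ray, 2 fields) are illustrative assumptions, not settings confirmed in this thread.

```python
def field_evaluations(feature_res, samples_per_ray, num_fields):
    """Count feature-field queries needed to render one feature image.

    feature_res:     side length of the square volume-rendered feature image
    samples_per_ray: points sampled along each ray (assumed value)
    num_fields:      feature fields composited per scene, e.g. objects plus
                     background (assumed value)
    """
    rays = feature_res * feature_res  # one ray per feature-image pixel
    return rays * samples_per_ray * num_fields


# Hypothetical baseline: 16x16 feature image, 64 samples per ray, 2 fields.
base = field_evaluations(16, 64, 2)

for res in (16, 32, 64):
    n = field_evaluations(res, 64, 2)
    print(f"{res}x{res} feature image: {n} evaluations ({n / base:.0f}x baseline)")
```

Under these assumptions, each doubling of the feature-image side length quadruples the number of field evaluations, and the count grows linearly with the number of fields. Activation memory during training can be expected to grow along with it, although the actual footprint depends on the network and the batch size.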

I hope this helps a little. Good luck with your research!