nv-tlabs / GET3D

How does this program work? #76

Closed IIHybrid closed 1 year ago

IIHybrid commented 1 year ago

I don't understand how to use the commands or which program I need to use.

Can somebody help me?

SteveJunGao commented 1 year ago

Hi @IIHybrid,

As mentioned in the README, this model needs to run in a Linux environment, and the program to run it is Python (with all the packages installed properly).

Let me know if you have any questions when following the README to run the script ;)

dav-ell commented 1 year ago

If we want to train the model on a new set of images, how do we do that? That seems like a first-step kind of thing that should be in the README.

I've currently started the Docker container and downloaded the Inception model (used for metrics) to the cache:

# cache
cd /home/user/
git clone git@github.com:nv-tlabs/GET3D.git
cd GET3D; mkdir cache; cd cache
wget https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/metrics/inception-2015-12-05.pkl

# Docker
cd docker
chmod +x make_image.sh
./make_image.sh get3d:v1
docker run --gpus device=all -it --rm -v /home/user/GET3D:/home/user/GET3D get3d:v1 bash

# Now what?

I see the command here:

python train_3d.py --outdir=PATH_TO_LOG --data=PATH_TO_RENDER_IMG --camera_path PATH_TO_RENDER_CAMERA --gpus=8 --batch=32 --gamma=40 --data_camera_mode shapenet_car  --dmtet_scale 1.0  --use_shapenet_split 1  --one_3d_generator 1  --fp32 0

but there's no description of what PATH_TO_RENDER_IMG should be. Should it be a folder of images? A single image? A 3D model of some kind?

There's also no description of what PATH_TO_RENDER_CAMERA should be or how to get it. Is this a set of camera positions in a yaml taken from e.g. COLMAP? We should have an example if so.

This is a pretty amazing-looking repo, and some of us have the GPUs to train this, or can use Lambda Cloud to test it out inexpensively, but the docs make this very difficult to navigate.

SteveJunGao commented 1 year ago

Hi @dav-ell ,

Thanks for pointing out the confusion! To prepare the dataset, you can follow this part to download the ShapeNet dataset and render it into images. PATH_TO_RENDER_IMG is the path to the rendered images produced by the rendering script we provide, and PATH_TO_RENDER_CAMERA is the path to the cameras saved by the same script.
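
Concretely, if the rendering output folder looks something like the sketch below (the folder name shapenet_rendered and the log path ./logs are just placeholders; 02958343 is the ShapeNet car synset), then --data points at the image folder for the category and --camera_path points at the camera folder next to it:

# Hypothetical layout of the rendering output (folder names are placeholders)
shapenet_rendered/
├── img/
│   └── 02958343/        # rendered views for the car category
└── camera/              # camera parameters saved by the rendering script

# Corresponding training invocation (other flags as in the README example)
python train_3d.py --outdir=./logs \
  --data=shapenet_rendered/img/02958343 \
  --camera_path shapenet_rendered/camera \
  --gpus=8 --batch=32 --gamma=40 --data_camera_mode shapenet_car \
  --dmtet_scale 1.0 --use_shapenet_split 1 --one_3d_generator 1 --fp32 0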

Tom0072 commented 1 year ago

@dav-ell It took me two days to work through a lot of mistakes. To spare you the same parameter problems, see my issue https://github.com/nv-tlabs/GET3D/issues/79. Here is what I learned:

1. Change the render engine in render_shapenet.py from 'CYCLES' to 'BLENDER_EEVEE', unless you have 8 GPUs to render with.
2. After rendering finishes, move on to training. PATH_TO_RENDER_IMG should point to the 02958343 folder, while PATH_TO_RENDER_CAMERA should point to the folder one level up, e.g. --data=targetdata_1/img/02958343 --camera_path targetdata_1/camera, where targetdata_1 is the folder that stores the render results (img and camera).
3. Run Docker with "--shm-size 32g" added to your docker run command (32G becomes the container's shared memory; otherwise the container only gets the 64M default). Also mount into /workspace, because after the container starts the working directory is /workspace. Once the container is running, you can access it from your local machine. See the sketch below.
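
Putting (3) together with the earlier docker run command, the adjusted invocation would look roughly like this (the image tag get3d:v1 and the host path /home/user/GET3D are taken from the steps above; adjust --shm-size to your machine):

docker run --gpus device=all -it --rm --shm-size 32g \
  -v /home/user/GET3D:/workspace \
  get3d:v1 bash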

dav-ell commented 1 year ago

Thanks @SteveJunGao and @Tom0072! Will try your updates and get back to you. I currently only have a machine with 2 A6000s; hopefully that will be enough. If not, I can get 8 A100s to test on.

SteveJunGao commented 1 year ago

Hi @dav-ell,

Yes, the model should also be able to train with 2 A6000s; you can increase the number of batches per GPU if you only have two GPUs (e.g. setting --gpus=2 --batch=16). The performance might degrade a bit, but not by much.
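
For reference, the full training command with those settings would look something like this (all other flags copied unchanged from the example above; the three PATH_* placeholders are whatever you used before):

python train_3d.py --outdir=PATH_TO_LOG --data=PATH_TO_RENDER_IMG --camera_path PATH_TO_RENDER_CAMERA \
  --gpus=2 --batch=16 --gamma=40 --data_camera_mode shapenet_car \
  --dmtet_scale 1.0 --use_shapenet_split 1 --one_3d_generator 1 --fp32 0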