nv-tlabs / GET3D

Can't train in Docker: does my configuration have mistakes? #79

Closed Tom0072 closed 1 year ago

Tom0072 commented 1 year ago

Here is my command:

`root@10e8a2093f54:/workspace/GET3D# python train_3d.py --outdir=/workspace/LOG --data=/workspace/targetdata_1/img/02958343 --camera_path=/workspace/targetdata_1/camera --gpus=1 --batch=32 --gamma=40 --data_camera_mode shapenet_car --dmtet_scale 1.0 --use_shapenet_split 1 --one_3d_generator 1 --fp32 0
==> start
==> use shapenet dataset
==> use shapenet folder number 78
==> use image path: /workspace/targetdata_1/img/02958343, num images: 1854
==> launch training

Training options: { "G_kwargs": { "class_name": "training.networks_get3d.GeneratorDMTETMesh", "z_dim": 512, "w_dim": 512, "mapping_kwargs": { "num_layers": 8 }, "one_3d_generator": true, "n_implicit_layer": 1, "deformation_multiplier": 1.0, "use_style_mixing": true, "dmtet_scale": 1.0, "feat_channel": 16, "mlp_latent_channel": 32, "tri_plane_resolution": 256, "n_views": 1, "render_type": "neural_render", "use_tri_plane": true, "tet_res": 90, "geometry_type": "conv3d", "data_camera_mode": "shapenet_car", "channel_base": 32768, "channel_max": 512, "fused_modconv_default": "inference_only" }, "D_kwargs": { "class_name": "training.networks_get3d.Discriminator", "block_kwargs": { "freeze_layers": 0 }, "mapping_kwargs": {}, "epilogue_kwargs": { "mbstd_group_size": 4 }, "data_camera_mode": "shapenet_car", "add_camera_cond": true, "channel_base": 32768, "channel_max": 512, "architecture": "skip" }, "G_opt_kwargs": { "class_name": "torch.optim.Adam", "betas": [ 0, 0.99 ], "eps": 1e-08, "lr": 0.002 }, "D_opt_kwargs": { "class_name": "torch.optim.Adam", "betas": [ 0, 0.99 ], "eps": 1e-08, "lr": 0.002 }, "loss_kwargs": { "class_name": "training.loss.StyleGAN2Loss", "gamma_mask": 40.0, "r1_gamma": 40.0, "style_mixing_prob": 0.9, "pl_weight": 0.0 }, "data_loader_kwargs": { "pin_memory": true, "prefetch_factor": 2, "num_workers": 3 }, "inference_vis": false, "training_set_kwargs": { "class_name": "training.dataset.ImageFolderDataset", "path": "/workspace/targetdata_1/img/02958343", "use_labels": false, "max_size": 1854, "xflip": false, "resolution": 1024, "data_camera_mode": "shapenet_car", "add_camera_cond": true, "camera_path": "/workspace/targetdata_1/camera", "split": "train", "random_seed": 0 }, "resume_pretrain": null, "D_reg_interval": 16, "num_gpus": 1, "batch_size": 32, "batch_gpu": 4, "metrics": [ "fid50k" ], "total_kimg": 20000, "kimg_per_tick": 1, "image_snapshot_ticks": 50, "network_snapshot_ticks": 200, "random_seed": 0, "ema_kimg": 10.0, "G_reg_interval": 4, 
"run_dir": "/workspace/LOG/00016-stylegan2-02958343-gpus1-batch32-gamma40" }

Output directory: /workspace/LOG/00016-stylegan2-02958343-gpus1-batch32-gamma40 Number of GPUs: 1 Batch size: 32 images Training duration: 20000 kimg Dataset path: /workspace/targetdata_1/img/02958343 Dataset size: 1854 images Dataset resolution: 1024 Dataset labels: False Dataset x-flips: False

Creating output directory... Launching processes... Setting up PyTorch plugin "upfirdn2d_plugin"... Done. Setting up PyTorch plugin "bias_act_plugin"... Done. Setting up PyTorch plugin "filtered_lrelu_plugin"... Done. Loading training set... ==> use shapenet dataset ==> use shapenet folder number 78 ==> use image path: /workspace/targetdata_1/img/02958343, num images: 1854

Num images: 1854 Image shape: [3, 1024, 1024] Label shape: [0]

Constructing networks... Setting up augmentation... Distributing across 1 GPUs... Setting up training phases... Exporting sample images... Initializing logs... Training for 20000 kimg...

tick 0 kimg 0.0 time 27s sec/tick 14.2 sec/kimg 444.28 maintenance 12.4
==> start visualization
/workspace/GET3D/training/networks_get3d.py:430: UserWarning: torch.range is deprecated and will be removed in a future release because its behavior is inconsistent with Python's range builtin. Instead, use torch.arange, which produces values in [start, end).
  camera_theta = torch.range(0, n_camera - 1, device=self.device).unsqueeze(dim=-1) / n_camera * math.pi * 2.0
==> saved visualization
Evaluating metrics...
====> use validation set
==> use shapenet dataset
==> use shapenet folder number 0
==> use image path: /workspace/targetdata_1/img/02958343, num images: 0
==> preparing the cache for fid scores
{'class_name': 'training.dataset.ImageFolderDataset', 'path': '/workspace/targetdata_1/img/02958343', 'use_labels': False, 'max_size': None, 'xflip': False, 'resolution': 1024, 'data_camera_mode': 'shapenet_car', 'add_camera_cond': True, 'camera_path': '/workspace/targetdata_1/camera', 'split': 'val', 'random_seed': 0}
0it [00:01, ?it/s]
Traceback (most recent call last):
  File "train_3d.py", line 330, in <module>
    main()  # pylint: disable=no-value-for-parameter
  File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "train_3d.py", line 324, in main
    launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
  File "train_3d.py", line 103, in launch_training
    subprocess_fn(rank=0, c=c, temp_dir=temp_dir)
  File "train_3d.py", line 49, in subprocess_fn
    training_loop_3d.training_loop(rank=rank, **c)
  File "/workspace/GET3D/training/training_loop_3d.py", line 407, in training_loop
    result_dict = metric_main.calc_metric(
  File "/workspace/GET3D/metrics/metric_main.py", line 52, in calc_metric
    results = _metric_dict[metric](opts)
  File "/workspace/GET3D/metrics/metric_main.py", line 145, in fid50k
    fid = frechet_inception_distance.compute_fid(opts, max_real=50000, num_gen=50000)
  File "/workspace/GET3D/metrics/frechet_inception_distance.py", line 30, in compute_fid
    mu_real, sigma_real = metric_utils.compute_feature_stats_for_dataset(
  File "/workspace/GET3D/metrics/metric_utils.py", line 165, in get_mean_cov
    mean = self.raw_mean / self.num_items
TypeError: unsupported operand type(s) for /: 'NoneType' and 'int'`

Configuration: RTX 3090, Ubuntu 22.04, installed using the Docker method.

1. I cloned GET3D into the workspace and downloaded the pretrained pkl file.
2. I use a targetdata_1 folder to store the image and camera files (cars), but it holds only part of the dataset.
3. In the GET3D directory, I run: python train_3d.py --outdir=/workspace/LOG --data=/workspace/targetdata_1/img/02958343 --camera_path=/workspace/targetdata_1/camera --gpus=1 --batch=32 --gamma=40 --data_camera_mode shapenet_car --dmtet_scale 1.0 --use_shapenet_split 1 --one_3d_generator 1 --fp32 0

This produces the error shown above. I changed "--camera_path=" to "--camera_path " and got the same error. Deleting cache/gan-metrics/pkl did not help either.

I also tried prefixing every path with "./", and the error changed:

root@10e8a2093f54:/workspace/GET3D# python train_3d.py --outdir=./workspace/LOG --data=./workspace/targetdata_1/img/02958343 --camera_path=./workspace/targetdata_1/camera --gpus=1 --batch=32 --gamma=40 --data_camera_mode shapenet_car --dmtet_scale 1.0 --use_shapenet_split 1 --one_3d_generator 1 --fp32 0
==> start
==> use shapenet dataset
==> ERROR!!!! THIS SHOULD ONLY HAPPEN WHEN USING INFERENCE
==> use image path: ./workspace/targetdata_1/img/02958343, num images: 1234
==> launch training

Training options: { "G_kwargs": { "class_name": "training.networks_get3d.GeneratorDMTETMesh", "z_dim": 512, "w_dim": 512, "mapping_kwargs": { "num_layers": 8 }, "one_3d_generator": true, "n_implicit_layer": 1, "deformation_multiplier": 1.0, "use_style_mixing": true, "dmtet_scale": 1.0, "feat_channel": 16, "mlp_latent_channel": 32, "tri_plane_resolution": 256, "n_views": 1, "render_type": "neural_render", "use_tri_plane": true, "tet_res": 90, "geometry_type": "conv3d", "data_camera_mode": "shapenet_car", "channel_base": 32768, "channel_max": 512, "fused_modconv_default": "inference_only" }, "D_kwargs": { "class_name": "training.networks_get3d.Discriminator", "block_kwargs": { "freeze_layers": 0 }, "mapping_kwargs": {}, "epilogue_kwargs": { "mbstd_group_size": 4 }, "data_camera_mode": "shapenet_car", "add_camera_cond": true, "channel_base": 32768, "channel_max": 512, "architecture": "skip" }, "G_opt_kwargs": { "class_name": "torch.optim.Adam", "betas": [ 0, 0.99 ], "eps": 1e-08, "lr": 0.002 }, "D_opt_kwargs": { "class_name": "torch.optim.Adam", "betas": [ 0, 0.99 ], "eps": 1e-08, "lr": 0.002 }, "loss_kwargs": { "class_name": "training.loss.StyleGAN2Loss", "gamma_mask": 40.0, "r1_gamma": 40.0, "style_mixing_prob": 0.9, "pl_weight": 0.0 }, "data_loader_kwargs": { "pin_memory": true, "prefetch_factor": 2, "num_workers": 3 }, "inference_vis": false, "training_set_kwargs": { "class_name": "training.dataset.ImageFolderDataset", "path": "./workspace/targetdata_1/img/02958343", "use_labels": false, "max_size": 1234, "xflip": false, "resolution": 1024, "data_camera_mode": "shapenet_car", "add_camera_cond": true, "camera_path": "./workspace/targetdata_1/camera", "split": "train", "random_seed": 0 }, "resume_pretrain": null, "D_reg_interval": 16, "num_gpus": 1, "batch_size": 32, "batch_gpu": 4, "metrics": [ "fid50k" ], "total_kimg": 20000, "kimg_per_tick": 1, "image_snapshot_ticks": 50, "network_snapshot_ticks": 200, "random_seed": 0, "ema_kimg": 10.0, "G_reg_interval": 4, 
"run_dir": "./workspace/LOG/00001-stylegan2-02958343-gpus1-batch32-gamma40" }

Output directory: ./workspace/LOG/00001-stylegan2-02958343-gpus1-batch32-gamma40 Number of GPUs: 1 Batch size: 32 images Training duration: 20000 kimg Dataset path: ./workspace/targetdata_1/img/02958343 Dataset size: 1234 images Dataset resolution: 1024 Dataset labels: False Dataset x-flips: False

Creating output directory... Launching processes... Setting up PyTorch plugin "upfirdn2d_plugin"... Done. Setting up PyTorch plugin "bias_act_plugin"... Done. Setting up PyTorch plugin "filtered_lrelu_plugin"... Done. Loading training set... ==> use shapenet dataset ==> ERROR!!!! THIS SHOULD ONLY HAPPEN WHEN USING INFERENCE ==> use image path: ./workspace/targetdata_1/img/02958343, num images: 1234

Num images: 1234 Image shape: [3, 1024, 1024] Label shape: [0]

Constructing networks...
Setting up augmentation...
Distributing across 1 GPUs...
Setting up training phases...
Exporting sample images...
Traceback (most recent call last):
  File "train_3d.py", line 330, in <module>
    main()  # pylint: disable=no-value-for-parameter
  File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "train_3d.py", line 324, in main
    launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
  File "train_3d.py", line 103, in launch_training
    subprocess_fn(rank=0, c=c, temp_dir=temp_dir)
  File "train_3d.py", line 49, in subprocess_fn
    training_loop_3d.training_loop(rank=rank, **c)
  File "/workspace/GET3D/training/training_loop_3d.py", line 222, in training_loop
    grid_size, images, labels, masks = setup_snapshot_image_grid(training_set=training_set, inference=inference_vis)
  File "/workspace/GET3D/training/training_loop_3d.py", line 66, in setup_snapshot_image_grid
    images, labels, masks = zip(*[training_set[i][:3] for i in grid_indices])
  File "/workspace/GET3D/training/training_loop_3d.py", line 66, in <listcomp>
    images, labels, masks = zip(*[training_set[i][:3] for i in grid_indices])
  File "/workspace/GET3D/training/dataset.py", line 292, in __getitem__
    img = ori_img[:, :, :3][..., ::-1]
TypeError: 'NoneType' object is not subscriptable

Deleting the previously cached pkl files still made no difference.
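One way to narrow this down, assuming the images are read with OpenCV (`cv2.imread` returns None instead of raising when a file is missing or unreadable, which would match the 'NoneType' error above), is to list the files that cannot be opened before training starts. The helper below is a stdlib-only sketch: the name `find_unreadable_images` and the extension list are hypothetical, and the check only catches missing, empty, or permission-blocked files, not corrupt image data.

```python
import os

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")  # assumed extensions, adjust as needed


def find_unreadable_images(root):
    """Return paths under `root` that look like images but are empty or cannot be opened."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(IMAGE_EXTENSIONS):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    if not fh.read(1):
                        bad.append(path)  # zero-byte file
            except OSError:
                bad.append(path)  # broken symlink, permission error, ...
    return sorted(bad)
```

Relative paths are resolved against the current working directory, so printing `os.path.abspath(path)` for each path argument shows which directory is actually being read.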

Please point out the mistake in my configuration. Thank you.

Tom0072 commented 1 year ago

I added print statements to debug this. The image count is correct the first time it is printed, but after '==> start visualization' (or some code in between) the count is wrong. Both numbers come from the same function, yet the second time it reports 0.

SteveJunGao commented 1 year ago

This is likely caused by this part of the log:

Evaluating metrics...
====> use validation set
==> use shapenet dataset
==> use shapenet folder number 0
==> use image path: /workspace/targetdata_1/img/02958343, num images: 0

Here the code is trying to compute FID on the validation set, but the validation set contains 0 images, so it cannot compute the statistics for this dataset. You can avoid this error by passing --use_shapenet_split 0, which uses the training set to compute the FID.
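The failure mode can be reproduced in isolation. The sketch below is a hypothetical stand-in for the statistics accumulator in metrics/metric_utils.py (the class name `RunningStats` and its methods are assumptions for illustration, not the repository's code). When nothing is appended, the running sum stays None, and dividing it by the item count raises the same TypeError as in the traceback; a guard with an explicit message makes the empty-split case easier to diagnose.

```python
class RunningStats:
    """Hypothetical stand-in for a feature-statistics accumulator."""

    def __init__(self):
        self.num_items = 0
        self.raw_mean = None  # running sum, allocated lazily on the first append

    def append(self, values):
        # `values` is an iterable of floats (scalar features, for simplicity).
        for v in values:
            if self.raw_mean is None:
                self.raw_mean = 0.0
            self.raw_mean += v
            self.num_items += 1

    def get_mean_unchecked(self):
        # Same shape as the failing expression: None / int if nothing was appended.
        return self.raw_mean / self.num_items

    def get_mean(self):
        # Guarded variant: fail early with a message that names the real cause.
        if self.num_items == 0:
            raise ValueError("dataset split is empty; cannot compute statistics")
        return self.raw_mean / self.num_items
```

With an empty split, `get_mean_unchecked()` raises the TypeError from the log, while `get_mean()` reports that the split has no images.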

Tom0072 commented 1 year ago

@SteveJunGao Thank you very much! It's training now. One more question: if I want to train on another category, such as planes, which camera mode should I use?

==> preparing the cache for fid scores
{'class_name': 'training.dataset.ImageFolderDataset', 'path': 'targetdata_1/img/02958343', 'use_labels': False, 'max_size': None, 'xflip': False, 'resolution': 1024, 'data_camera_mode': 'shapenet_car', 'add_camera_cond': True, 'camera_path': 'targetdata_1/camera', 'split': 'all', 'random_seed': 0}
100%|##########| 29/29 [01:03<00:00, 2.20s/it]
{"results": {"fid50k": 250.82438700588935}, "metric": "fid50k", "total_time": 1161.4733066558838, "total_time_str": "19m 21s", "num_gpus": 1, "snapshot_pkl": "network-snapshot-000000.pkl", "timestamp": 1672811686.5416462}
==> finished evaluate metrics
tick 1 kimg 1.1 time 23m 52s sec/tick 237.9 sec/kimg 232.32 maintenance 1168.2
tick 2 kimg 2.1 time 27m 42s sec/tick 229.6 sec/kimg 224.20 maintenance 0.0
tick 3 kimg 3.1 time 31m 17s sec/tick 214.8 sec/kimg 209.74 maintenance 0.0

SteveJunGao commented 1 year ago

Glad to hear it works!

For training on additional categories, I recommend adding another data_camera_mode inside the code. The most important place is camera sampling: check this function and make sure the distribution used to sample the camera is the same as the one used to render the training data.
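As a rough illustration of what "the same distribution" means, the sketch below samples camera poses for a hypothetical new mode. The function name `sample_camera` and the rotation and elevation ranges are assumptions chosen for illustration; they are not GET3D's values and should be replaced by whatever ranges were used when rendering the training images.

```python
import math
import random


def sample_camera(rng, rotation_range_deg, elevation_range_deg):
    """Sample one (rotation, elevation) pair in radians, uniformly within the given ranges."""
    rotation = math.radians(rng.uniform(*rotation_range_deg))
    elevation = math.radians(rng.uniform(*elevation_range_deg))
    return rotation, elevation


# Hypothetical ranges used by the rendering script; training should sample from the same ones.
RENDER_ROTATION_DEG = (0.0, 360.0)
RENDER_ELEVATION_DEG = (0.0, 30.0)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
cameras = [
    sample_camera(rng, RENDER_ROTATION_DEG, RENDER_ELEVATION_DEG)
    for _ in range(1000)
]
```

Keeping the ranges in one place that both the rendering script and the training code read from is one simple way to stop the two from drifting apart.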

YiX98 commented 1 year ago

Hello @Tom0072! I got the same error as your second one:

  File "/GET3D/training/dataset.py", line 292, in __getitem__
    img = ori_img[:, :, :3][..., ::-1]
TypeError: 'NoneType' object is not subscriptable

Could you please share your solution for this error?