Closed: Tom0072 closed this issue 1 year ago
I added print statements for debugging. They show that the image count is correct the first time it is printed, but after '==> start visualization' (or some code in between) the count is wrong. Both counts come from the same function, but the second time it is 0.
This is likely because of this part:
Evaluating metrics...
====> use validation set
==> use shapenet dataset
==> use shapenet folder number 0
==> use image path: /workspace/targetdata_1/img/02958343, num images: 0
At this point, the model tries to compute FID on the validation set. Because the validation set contains 0 images, it cannot compute the statistics for this dataset. You can avoid this error by passing:
--use_shapenet_split 0
which will use the training set to compute the FID.
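To illustrate why an empty validation split ends in `TypeError: unsupported operand type(s) for /: 'NoneType' and 'int'`, here is a minimal sketch of a running-statistics accumulator. It is a simplified, hypothetical stand-in (the class `RunningMean` and its methods are assumptions for illustration, not the code in `metric_utils.py`): the running sum is only allocated when the first batch arrives, so with zero images it is still `None` when the mean is requested.

```python
class RunningMean:
    """Hypothetical, simplified accumulator for per-feature means."""

    def __init__(self):
        self.num_items = 0
        self.raw_mean = None  # stays None until the first row is appended

    def append(self, batch):
        # batch: iterable of equally sized feature rows (lists of floats)
        for row in batch:
            if self.raw_mean is None:
                self.raw_mean = [0.0] * len(row)
            self.raw_mean = [s + x for s, x in zip(self.raw_mean, row)]
            self.num_items += 1

    def get_mean(self):
        # Guard against the empty-dataset case with a readable error.
        if self.num_items == 0:
            raise ValueError("no images were seen; is the evaluation split empty?")
        return [s / self.num_items for s in self.raw_mean]


# With an empty split nothing is appended, so the unguarded division
# reproduces the error type and message from the traceback.
empty = RunningMean()
try:
    empty.raw_mean / empty.num_items
except TypeError as exc:
    error_message = str(exc)

filled = RunningMean()
filled.append([[1.0, 2.0], [3.0, 4.0]])
mean = filled.get_mean()
```

An explicit check like the one in `get_mean` makes the root cause (zero images in the split) visible instead of a `NoneType` error deep inside the metric code.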
@SteveJunGao Thank you very much! It's training now. One more question: if I want to train on another category, such as planes, which camera mode should I use?
==> preparing the cache for fid scores
{'class_name': 'training.dataset.ImageFolderDataset', 'path': 'targetdata_1/img/02958343', 'use_labels': False, 'max_size': None, 'xflip': False, 'resolution': 1024, 'data_camera_mode': 'shapenet_car', 'add_camera_cond': True, 'camera_path': 'targetdata_1/camera', 'split': 'all', 'random_seed': 0}
100%|##########| 29/29 [01:03<00:00, 2.20s/it]
{"results": {"fid50k": 250.82438700588935}, "metric": "fid50k", "total_time": 1161.4733066558838, "total_time_str": "19m 21s", "num_gpus": 1, "snapshot_pkl": "network-snapshot-000000.pkl", "timestamp": 1672811686.5416462}
==> finished evaluate metrics
tick 1 kimg 1.1 time 23m 52s sec/tick 237.9 sec/kimg 232.32 maintenance 1168.2
tick 2 kimg 2.1 time 27m 42s sec/tick 229.6 sec/kimg 224.20 maintenance 0.0
tick 3 kimg 3.1 time 31m 17s sec/tick 214.8 sec/kimg 209.74 maintenance 0.0
Glad to hear it works!
For training on additional categories, I recommend adding another data_camera_mode inside the code. The most important place is camera sampling: check this function and make sure the distribution used to sample the camera is the same as the one used to render the training data.
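As a rough sketch of what "matching the rendering distribution" could look like, the snippet below samples a camera rotation and elevation uniformly from per-mode ranges. Everything here is an assumption for illustration: the mode name `my_plane`, the `CAMERA_RANGES` table, the angle ranges, and the uniform distribution are hypothetical and are not taken from the GET3D code. Replace them with whatever your rendering script actually used.

```python
import math
import random

# Hypothetical per-mode angle ranges in degrees. These values are
# placeholders; use the ranges from your own rendering script.
CAMERA_RANGES = {
    "my_plane": {"rotation": (0.0, 360.0), "elevation": (0.0, 30.0)},
}


def sample_camera(mode, rng):
    """Sample (rotation, elevation) in radians for the given camera mode.

    Assumes the training images were rendered with angles drawn uniformly
    from the same ranges; if the renderer used another distribution,
    sample from that one instead.
    """
    ranges = CAMERA_RANGES[mode]
    rotation_deg = rng.uniform(*ranges["rotation"])
    elevation_deg = rng.uniform(*ranges["elevation"])
    return math.radians(rotation_deg), math.radians(elevation_deg)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = [sample_camera("my_plane", rng) for _ in range(1000)]
```

If the sampled cameras cover angles that never appear in the training renders (or miss angles that do), the discriminator sees generated views from a different distribution than the real ones, which is why the two need to agree.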
Hello @Tom0072! I got the same error as your second one:
File "/GET3D/training/dataset.py", line 292, in __getitem__ img = ori_img[:, :, :3][..., ::-1] TypeError: 'NoneType' object is not subscriptable
Could you please share your solution for this error?
Here is my command:
`root@10e8a2093f54:/workspace/GET3D# python train_3d.py --outdir=/workspace/LOG --data=/workspace/targetdata_1/img/02958343 --camera_path=/workspace/targetdata_1/camera --gpus=1 --batch=32 --gamma=40 --data_camera_mode shapenet_car --dmtet_scale 1.0 --use_shapenet_split 1 --one_3d_generator 1 --fp32 0
==> start
==> use shapenet dataset
==> use shapenet folder number 78
==> use image path: /workspace/targetdata_1/img/02958343, num images: 1854
==> launch training
Training options: { "G_kwargs": { "class_name": "training.networks_get3d.GeneratorDMTETMesh", "z_dim": 512, "w_dim": 512, "mapping_kwargs": { "num_layers": 8 }, "one_3d_generator": true, "n_implicit_layer": 1, "deformation_multiplier": 1.0, "use_style_mixing": true, "dmtet_scale": 1.0, "feat_channel": 16, "mlp_latent_channel": 32, "tri_plane_resolution": 256, "n_views": 1, "render_type": "neural_render", "use_tri_plane": true, "tet_res": 90, "geometry_type": "conv3d", "data_camera_mode": "shapenet_car", "channel_base": 32768, "channel_max": 512, "fused_modconv_default": "inference_only" }, "D_kwargs": { "class_name": "training.networks_get3d.Discriminator", "block_kwargs": { "freeze_layers": 0 }, "mapping_kwargs": {}, "epilogue_kwargs": { "mbstd_group_size": 4 }, "data_camera_mode": "shapenet_car", "add_camera_cond": true, "channel_base": 32768, "channel_max": 512, "architecture": "skip" }, "G_opt_kwargs": { "class_name": "torch.optim.Adam", "betas": [ 0, 0.99 ], "eps": 1e-08, "lr": 0.002 }, "D_opt_kwargs": { "class_name": "torch.optim.Adam", "betas": [ 0, 0.99 ], "eps": 1e-08, "lr": 0.002 }, "loss_kwargs": { "class_name": "training.loss.StyleGAN2Loss", "gamma_mask": 40.0, "r1_gamma": 40.0, "style_mixing_prob": 0.9, "pl_weight": 0.0 }, "data_loader_kwargs": { "pin_memory": true, "prefetch_factor": 2, "num_workers": 3 }, "inference_vis": false, "training_set_kwargs": { "class_name": "training.dataset.ImageFolderDataset", "path": "/workspace/targetdata_1/img/02958343", "use_labels": false, "max_size": 1854, "xflip": false, "resolution": 1024, "data_camera_mode": "shapenet_car", "add_camera_cond": true, "camera_path": "/workspace/targetdata_1/camera", "split": "train", "random_seed": 0 }, "resume_pretrain": null, "D_reg_interval": 16, "num_gpus": 1, "batch_size": 32, "batch_gpu": 4, "metrics": [ "fid50k" ], "total_kimg": 20000, "kimg_per_tick": 1, "image_snapshot_ticks": 50, "network_snapshot_ticks": 200, "random_seed": 0, "ema_kimg": 10.0, "G_reg_interval": 4, 
"run_dir": "/workspace/LOG/00016-stylegan2-02958343-gpus1-batch32-gamma40" }
Output directory: /workspace/LOG/00016-stylegan2-02958343-gpus1-batch32-gamma40
Number of GPUs: 1
Batch size: 32 images
Training duration: 20000 kimg
Dataset path: /workspace/targetdata_1/img/02958343
Dataset size: 1854 images
Dataset resolution: 1024
Dataset labels: False
Dataset x-flips: False
Creating output directory...
Launching processes...
Setting up PyTorch plugin "upfirdn2d_plugin"... Done.
Setting up PyTorch plugin "bias_act_plugin"... Done.
Setting up PyTorch plugin "filtered_lrelu_plugin"... Done.
Loading training set...
==> use shapenet dataset
==> use shapenet folder number 78
==> use image path: /workspace/targetdata_1/img/02958343, num images: 1854
Num images: 1854
Image shape: [3, 1024, 1024]
Label shape: [0]
Constructing networks...
Setting up augmentation...
Distributing across 1 GPUs...
Setting up training phases...
Exporting sample images...
Initializing logs...
Training for 20000 kimg...
tick 0 kimg 0.0 time 27s sec/tick 14.2 sec/kimg 444.28 maintenance 12.4
==> start visualization
/workspace/GET3D/training/networks_get3d.py:430: UserWarning: torch.range is deprecated and will be removed in a future release because its behavior is inconsistent with Python's range builtin. Instead, use torch.arange, which produces values in [start, end).
camera_theta = torch.range(0, n_camera - 1, device=self.device).unsqueeze(dim=-1) / n_camera * math.pi * 2.0
==> saved visualization
Evaluating metrics...
====> use validation set
==> use shapenet dataset
==> use shapenet folder number 0
==> use image path: /workspace/targetdata_1/img/02958343, num images: 0
==> preparing the cache for fid scores
{'class_name': 'training.dataset.ImageFolderDataset', 'path': '/workspace/targetdata_1/img/02958343', 'use_labels': False, 'max_size': None, 'xflip': False, 'resolution': 1024, 'data_camera_mode': 'shapenet_car', 'add_camera_cond': True, 'camera_path': '/workspace/targetdata_1/camera', 'split': 'val', 'random_seed': 0}
0it [00:01, ?it/s]
Traceback (most recent call last):
File "train_3d.py", line 330, in <module>
main() # pylint: disable=no-value-for-parameter
File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "train_3d.py", line 324, in main
launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
File "train_3d.py", line 103, in launch_training
subprocess_fn(rank=0, c=c, temp_dir=temp_dir)
File "train_3d.py", line 49, in subprocess_fn
training_loop_3d.training_loop(rank=rank, **c)
File "/workspace/GET3D/training/training_loop_3d.py", line 407, in training_loop
result_dict = metric_main.calc_metric(
File "/workspace/GET3D/metrics/metric_main.py", line 52, in calc_metric
results = _metric_dict[metric](opts)
File "/workspace/GET3D/metrics/metric_main.py", line 145, in fid50k
fid = frechet_inception_distance.compute_fid(opts, max_real=50000, num_gen=50000)
File "/workspace/GET3D/metrics/frechet_inception_distance.py", line 30, in compute_fid
mu_real, sigma_real = metric_utils.compute_feature_stats_for_dataset(
File "/workspace/GET3D/metrics/metric_utils.py", line 165, in get_mean_cov
mean = self.raw_mean / self.num_items
TypeError: unsupported operand type(s) for /: 'NoneType' and 'int'
`
Configuration: RTX 3090, Ubuntu 22.04, installed using Docker.
1. I pulled GET3D into the workspace and downloaded the pretrained pkl file.
2. I use a targetdata_1 folder to store the img and camera files (cars), but it does not contain all of them.
3. In the GET3D directory, I ran the command: python train_3d.py --outdir=/workspace/LOG --data=/workspace/targetdata_1/img/02958343 --camera_path=/workspace/targetdata_1/camera --gpus=1 --batch=32 --gamma=40 --data_camera_mode shapenet_car --dmtet_scale 1.0 --use_shapenet_split 1 --one_3d_generator 1 --fp32 0
This produces the error shown above. I changed "--camera_path=" to "--camera_path " and got the same error. Deleting cache/gan-metrics/pkl did not help either.
I also tried prefixing every path with "./", and the error changed.
root@10e8a2093f54:/workspace/GET3D# python train_3d.py --outdir=./workspace/LOG --data=./workspace/targetdata_1/img/02958343 --camera_path=./workspace/targetdata_1/camera --gpus=1 --batch=32 --gamma=40 --data_camera_mode shapenet_car --dmtet_scale 1.0 --use_shapenet_split 1 --one_3d_generator 1 --fp32 0
==> start
==> use shapenet dataset
==> ERROR!!!! THIS SHOULD ONLY HAPPEN WHEN USING INFERENCE
==> use image path: ./workspace/targetdata_1/img/02958343, num images: 1234
==> launch training
Training options: { "G_kwargs": { "class_name": "training.networks_get3d.GeneratorDMTETMesh", "z_dim": 512, "w_dim": 512, "mapping_kwargs": { "num_layers": 8 }, "one_3d_generator": true, "n_implicit_layer": 1, "deformation_multiplier": 1.0, "use_style_mixing": true, "dmtet_scale": 1.0, "feat_channel": 16, "mlp_latent_channel": 32, "tri_plane_resolution": 256, "n_views": 1, "render_type": "neural_render", "use_tri_plane": true, "tet_res": 90, "geometry_type": "conv3d", "data_camera_mode": "shapenet_car", "channel_base": 32768, "channel_max": 512, "fused_modconv_default": "inference_only" }, "D_kwargs": { "class_name": "training.networks_get3d.Discriminator", "block_kwargs": { "freeze_layers": 0 }, "mapping_kwargs": {}, "epilogue_kwargs": { "mbstd_group_size": 4 }, "data_camera_mode": "shapenet_car", "add_camera_cond": true, "channel_base": 32768, "channel_max": 512, "architecture": "skip" }, "G_opt_kwargs": { "class_name": "torch.optim.Adam", "betas": [ 0, 0.99 ], "eps": 1e-08, "lr": 0.002 }, "D_opt_kwargs": { "class_name": "torch.optim.Adam", "betas": [ 0, 0.99 ], "eps": 1e-08, "lr": 0.002 }, "loss_kwargs": { "class_name": "training.loss.StyleGAN2Loss", "gamma_mask": 40.0, "r1_gamma": 40.0, "style_mixing_prob": 0.9, "pl_weight": 0.0 }, "data_loader_kwargs": { "pin_memory": true, "prefetch_factor": 2, "num_workers": 3 }, "inference_vis": false, "training_set_kwargs": { "class_name": "training.dataset.ImageFolderDataset", "path": "./workspace/targetdata_1/img/02958343", "use_labels": false, "max_size": 1234, "xflip": false, "resolution": 1024, "data_camera_mode": "shapenet_car", "add_camera_cond": true, "camera_path": "./workspace/targetdata_1/camera", "split": "train", "random_seed": 0 }, "resume_pretrain": null, "D_reg_interval": 16, "num_gpus": 1, "batch_size": 32, "batch_gpu": 4, "metrics": [ "fid50k" ], "total_kimg": 20000, "kimg_per_tick": 1, "image_snapshot_ticks": 50, "network_snapshot_ticks": 200, "random_seed": 0, "ema_kimg": 10.0, "G_reg_interval": 4, 
"run_dir": "./workspace/LOG/00001-stylegan2-02958343-gpus1-batch32-gamma40" }
Output directory: ./workspace/LOG/00001-stylegan2-02958343-gpus1-batch32-gamma40
Number of GPUs: 1
Batch size: 32 images
Training duration: 20000 kimg
Dataset path: ./workspace/targetdata_1/img/02958343
Dataset size: 1234 images
Dataset resolution: 1024
Dataset labels: False
Dataset x-flips: False
Creating output directory...
Launching processes...
Setting up PyTorch plugin "upfirdn2d_plugin"... Done.
Setting up PyTorch plugin "bias_act_plugin"... Done.
Setting up PyTorch plugin "filtered_lrelu_plugin"... Done.
Loading training set...
==> use shapenet dataset
==> ERROR!!!! THIS SHOULD ONLY HAPPEN WHEN USING INFERENCE
==> use image path: ./workspace/targetdata_1/img/02958343, num images: 1234
Num images: 1234
Image shape: [3, 1024, 1024]
Label shape: [0]
Constructing networks...
Setting up augmentation...
Distributing across 1 GPUs...
Setting up training phases...
Exporting sample images...
Traceback (most recent call last):
File "train_3d.py", line 330, in <module>
main() # pylint: disable=no-value-for-parameter
File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "train_3d.py", line 324, in main
launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
File "train_3d.py", line 103, in launch_training
subprocess_fn(rank=0, c=c, temp_dir=temp_dir)
File "train_3d.py", line 49, in subprocess_fn
training_loop_3d.training_loop(rank=rank, **c)
File "/workspace/GET3D/training/training_loop_3d.py", line 222, in training_loop
grid_size, images, labels, masks = setup_snapshot_image_grid(training_set=training_set, inference=inference_vis)
File "/workspace/GET3D/training/training_loop_3d.py", line 66, in setup_snapshot_image_grid
images, labels, masks = zip(*[training_set[i][:3] for i in grid_indices])
File "/workspace/GET3D/training/training_loop_3d.py", line 66, in <listcomp>
images, labels, masks = zip(*[training_set[i][:3] for i in grid_indices])
File "/workspace/GET3D/training/dataset.py", line 292, in __getitem__
img = ori_img[:, :, :3][..., ::-1]
TypeError: 'NoneType' object is not subscriptable
Deleting the old pkl files still did not help.
Please point out the mistake in my configuration. Thank you.
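Two things in the report above can be checked before launching training. First, a relative path such as `./workspace/targetdata_1/...` is resolved against the current working directory (`/workspace/GET3D` in the session above), so it may point somewhere other than the absolute path. Second, `'NoneType' object is not subscriptable` on `ori_img` is consistent with an image that could not be loaded, although the loader itself is not shown in this thread, so that is an assumption. The sketch below is a hypothetical pre-flight helper (the name `check_dataset`, the extension list, and the one-subfolder-per-object layout are all assumptions, not part of GET3D) that resolves the path and reports subfolders without images and zero-byte image files.

```python
from pathlib import Path

# Assumed image extensions; adjust to match your rendered data.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def check_dataset(data_dir):
    """Resolve data_dir and report object folders that look incomplete.

    Assumes a layout of one subfolder per object, each holding image files.
    Returns (resolved_root, folders_without_images, zero_byte_image_files).
    """
    root = Path(data_dir).resolve()  # relative paths resolve against the cwd
    if not root.is_dir():
        raise FileNotFoundError(f"dataset path does not exist: {root}")

    empty_folders = []
    zero_byte_files = []
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        images = [
            f for f in folder.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        ]
        if not images:
            empty_folders.append(folder.name)
        zero_byte_files.extend(str(f) for f in images if f.stat().st_size == 0)
    return root, empty_folders, zero_byte_files


# Example call (the path is the one from the command above):
# root, empty, broken = check_dataset("/workspace/targetdata_1/img/02958343")
# print(root, len(empty), len(broken))
```

Printing the resolved root makes it obvious when a `./`-prefixed path lands in the wrong directory, and the two lists point at objects that were only partially copied.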