robotic-vision-lab / Learning-Implicitly-From-Spatial-Transformers-Network

Implicit deep neural network for single-view 3D reconstruction.
Apache License 2.0

Result #2

Open 2577624123 opened 7 months ago

2577624123 commented 7 months ago

Hello. Nice job! I wonder if there is a parameter problem here; to be clear, I used the parameters in the code provided so far. The results I got on the ShapeNet dataset were not as good as those in the paper; they were quite different. For a chair, for example, the CD and IoU differ considerably from the 9.20 and 52.70 reported in the paper. In addition, I also wonder why the accuracy is so low. Here are the parameters I used:

```python
from argparse import ArgumentParser


def get_args():
    parser = ArgumentParser(description='Image_to_3D')
    parser.add_argument('--cuda', type=bool, default=True)
    parser.add_argument('--gpu', type=int, default=0)
    parser.add_argument('--plot_every_batch', type=int, default=10)
    parser.add_argument('--save_every_epoch', type=int, default=25)
    parser.add_argument('--save_after_epoch', type=int, default=1)
    parser.add_argument('--test_every_epoch', type=int, default=25)
    parser.add_argument('--load_pretrain', type=bool, default=True)
    parser.add_argument('--skip_train', action='store_true')

    parser.add_argument('--viewnum', type=int, default=36)
    parser.add_argument('--img_res', type=int, default=224)
    parser.add_argument('--mcube_znum', type=int, default=128)
    parser.add_argument('--test_pointnum', type=int, default=65536)
    parser.add_argument('--chunk_s', type=int, default=0)
    parser.add_argument('--chunk_l', type=int, default=217)

    parser.add_argument('--chunk_id', type=int, default=0)
    parser.add_argument('--chunk_num', type=int, default=4)

    # Required. Model & dataset identifiers.
    parser.add_argument('--model', type=str,
                        help='Full path of the model')
    parser.add_argument('--dataset', type=str,
                        help='Full path of the dataset')

    # Data augmentation
    parser.add_argument('--random_h_flip', action='store_true')
    parser.add_argument('--color_jitter', action='store_true')
    parser.add_argument('--normalize', action='store_true')

    # Model components
    parser.add_argument('--point_decoder', action='store_true')
    parser.add_argument('--warm_start', action='store_true')

    parser.add_argument('--lr', type=float, default=0.0001)
    parser.add_argument('--beta1', type=float, default=0.9)
    parser.add_argument('--cam_batch_size', type=int, default=16)
    parser.add_argument('--cam_lr', type=float, default=0.00005)
    parser.add_argument('--train_batch_size', type=int, default=12)
    parser.add_argument('--test_batch_size', type=int, default=1)
    parser.add_argument('--epochs', type=int, default=300)
    parser.add_argument('--sampling_mode', type=str, default='weighted')
    parser.add_argument('--exp_name', '-e', type=str, default='d2im+tGCN')
    parser.add_argument('--eval_pred', action='store_true')
    parser.add_argument('--supervise_proj', action='store_true')
    parser.add_argument('--coarse_point_density', type=int, default=10000)
    parser.add_argument('--sample_point_density', type=int, default=32768)
    parser.add_argument('--sdf_max_dist', type=float, default=1.0)
    parser.add_argument('--sdf_scale', type=float, default=1.0)

    parser.add_argument('--weight_decay', type=float, default=1e-5)
    parser.add_argument('--sigmas', type=float, nargs='+',
                        default=[0.003, 0.01, 0.07])
    parser.add_argument('--sample_distribution', type=float, nargs='+',
                        default=[0.5, 0.49, 0.01])

    parser.add_argument('--point_feat', type=int,
                        default=[128, 128, 256, 256, 256, 128, 128, 3], nargs='+',
                        help='Features for point decoder.')
    parser.add_argument('--point_degree', type=int,
                        default=[2, 2, 2, 2, 2, 2, 64], nargs='+',
                        help='Upsample degrees for point decoder.')
    parser.add_argument('--im_enc_layers', type=int,
                        default=[1, 1, 1, 1, 16, 32, 64, 128, 128], nargs='+',
                        help='Layer dimensions for voxnet encoder.')

    parser.add_argument('--n_decoder_pos', type=int, default=2)
    parser.add_argument('--bb_min', type=float, default=-0.5,
                        help='Bounding box minimum.')
    parser.add_argument('--bb_max', type=float, default=0.5,
                        help='Bounding box maximum.')
    parser.add_argument('--vox_res', type=int, default=128,
                        help='Bounding box resolution.')

    parser.add_argument(
        '--data_dir', default='/media/ippc-zq/T7 Shield/Datasets/shapenet/')
    parser.add_argument(
        '--mesh_dir', default='/media/ippc-zq/T7 Shield/Datasets/shapenet/mesh/')
    parser.add_argument(
        '--h5_dir', default='/media/ippc-zq/T7 Shield/Datasets/shapenet/sampled_points/')
    parser.add_argument(
        '--cam_dir', default='/media/ippc-zq/T7 Shield/Datasets/shapenet/images/')
    parser.add_argument(
        '--image_dir', default='/media/ippc-zq/T7 Shield/Datasets/shapenet/images/')
    parser.add_argument('--catlist', type=str,
                        default=['03001627', '02691156', '04090263',
                                 '04379243', '02958343'],
                        nargs='+',
                        help='Category list.')
    # parser.add_argument('--model_dir', default='/work/06035/sami/maverick2/results/d2im/')
    parser.add_argument(
        '--output_dir', default='/media/ippc-zq/T7 Shield/LIST/')
    # parser.add_argument('--log', default='log.txt')
    parser.add_argument('--test_cam_id', type=int,
                        default=2,
                        help='Cam id to test with.')
    parser.add_argument('--test_gpu_id', type=int,
                        default=0,
                        help='GPU id to test with.')
    parser.add_argument('--test_checkpoint', default='best_model_test.pt.tar')
    parser.add_argument('--testlist_file',
                        default='./data/DISN_split/testlist_all.lst')

    args = parser.parse_args()

    # Some selected chairs with details.
    with open(args.testlist_file, 'r') as f:
        lines = f.readlines()
    # print(lines)

    testlist = []
    for l in lines[:30]:
        fn = l.strip()
        if not fn == '':
            fn = fn.split(' ')
            if fn[0] in args.catlist:
                testlist.extend(
                    [{'cat_id': fn[0], 'shape_id': fn[1], 'cam_id': fn[2]}])

    args.testlist = testlist
    # args.catlist = ['03001627']
    # args.catlist = ['03001627', '02691156', '02828884', '02933112', '03211117',
    #                 '03636649', '03691459', '04090263', '04256520', '04379243',
    #                 '04530566', '02958343', '04401088']

    args.checkpoint_dir = args.output_dir + args.exp_name + '/checkpoints/'
    args.results_dir = args.output_dir + args.exp_name + '/'
    args.log = args.output_dir + args.exp_name + '/log.txt'

    return args


if __name__ == '__main__':
    args = get_args()
    print(len(args.testlist))
```
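One parameter pitfall worth ruling out in this configuration: argparse's `type=bool` (used above for `--cuda` and `--load_pretrain`) does not actually parse booleans from the command line; any non-empty string, including `'False'`, converts to `True`. A minimal, self-contained demonstration:

```python
from argparse import ArgumentParser

# bool('False') == True, so a type=bool flag cannot be disabled from the
# command line by passing the string 'False'.
parser = ArgumentParser()
parser.add_argument('--cuda', type=bool, default=True)

print(parser.parse_args(['--cuda', 'False']).cuda)  # True (non-empty string)
print(parser.parse_args(['--cuda', '']).cuda)       # False (empty string only)
```

If these two flags are only ever left at their defaults this is harmless, but it is easy to believe an option was disabled when it was not.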

Everything else follows your code. I want to know whether some parameters need to be changed here, or whether some other part of the code needs changes, for the ShapeNet dataset.

I really need your help!

Looking forward to your reply! Some results: (result screenshots attached)

XiaolinHe8 commented 7 months ago

Hello. I have similar problems when using the code you provided to train on ShapeNet and ShapeNetRendering. We found that sdf_loss hardly dropped during training, and we noticed that the preprocessed SDF values are relatively small; are some additional training tricks needed? In addition, can you provide a pre-trained model for easier testing? Thank you!

samiarshad commented 6 months ago

Hello @2577624123 ,

Thanks a lot for your interest in our work. Could you please provide more information about your training, testing, and evaluation procedure?

Did you use all the classes and images for training and testing? Did you train the coarse prediction module separately or together? How many points did you use to evaluate the reconstructions?

Best
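For anyone replicating the "train the coarse module separately, then freeze it" setup asked about here, a minimal PyTorch sketch of the second stage might look like the following; `coarse_module` and `sdf_decoder` are placeholder stand-ins for illustration, not the repository's actual classes:

```python
import torch
import torch.nn as nn

# Placeholder networks standing in for the real coarse point predictor
# and SDF decoder (illustration only).
coarse_module = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, 3))
sdf_decoder = nn.Sequential(nn.Linear(131, 128), nn.ReLU(), nn.Linear(128, 1))

# Stage two: freeze the coarse module trained in stage one.
for param in coarse_module.parameters():
    param.requires_grad = False
coarse_module.eval()  # also fixes dropout/batch-norm behavior

# Give the optimizer only the parameters that remain trainable
# (lr and weight_decay mirror the defaults in the config above).
optimizer = torch.optim.Adam(
    [p for p in sdf_decoder.parameters() if p.requires_grad],
    lr=1e-4, weight_decay=1e-5)
```

Forgetting the `eval()` call on the frozen module, or handing its parameters to the optimizer anyway, are common ways a two-stage pipeline quietly underperforms.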

samiarshad commented 6 months ago

Hi @XiaolinHe8,

Thanks a lot for your interest in our work. Yes, SDF values are very small. We scaled the values by a factor of 10.0 during training.

Best
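To make the scaling concrete, here is a minimal sketch of training against scaled SDF targets, assuming the factor corresponds to the `--sdf_scale` flag in the configuration above; the repository's actual loss may differ (for instance, by clamping distances with `--sdf_max_dist`):

```python
import torch
import torch.nn.functional as F

SDF_SCALE = 10.0  # the factor mentioned above; raw SDF values are tiny

# Dummy batch standing in for network predictions and preprocessed
# ground-truth SDF samples (batch of 12, 32768 points each).
sdf_pred = torch.randn(12, 32768, requires_grad=True)
sdf_gt = 0.01 * torch.randn(12, 32768)

# Regress against scaled targets; at inference time, divide predictions
# by SDF_SCALE to recover distances in the original units.
loss = F.l1_loss(sdf_pred, SDF_SCALE * sdf_gt)
loss.backward()
```

Without such scaling, near-zero targets can leave the loss almost flat, which matches the "sdf_loss hardly dropped" symptom described above.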

SWWdz commented 5 months ago

How do you set up the environment for this project? Is there any related documentation?

A-cloud-bit commented 4 months ago

How do you set up the environment for this project? Is there any related documentation?

2577624123 commented 2 months ago

> Hello @2577624123,
>
> Thanks a lot for your interest in our work. Could you please provide more information about your training, testing, and evaluation procedure?
>
> Did you use all the classes and images for training and testing? Did you train the coarse prediction module separately or together? How many points did you use to evaluate the reconstructions?
>
> Best

I am very sorry that I have put this project on hold because I am busy with other projects. Here is my reply:

  1. My training, testing, and evaluation all followed the code you provided, with no parameter changes; however, the evaluation was carried out class by class, and every view of every object was evaluated.
  2. In training, I used all the classes, with 2000 images per class as in your code. For the coarse prediction, I trained the coarse prediction module alone and then, as you said, froze it before moving on to the second phase of training.
  3. The points used in the evaluation are the same as those provided by your code: `--coarse_point_density 4096`, with the defaults `--test_pointnum 65536`, `--coarse_point_density 10000`, and `--sample_point_density 32768` (see the evaluation sketch after this list).
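Since the thread compares against the paper's CD of 9.20 and IoU of 52.70, note that both metrics are sensitive to sampling density and to scaling conventions (CD is commonly reported multiplied by a constant such as 10³, and IoU as a percentage). As referenced in point 3 above, here is a minimal sketch of point-based CD and SDF-based IoU; every name in it (`chamfer_distance`, `iou_from_sdf`, the point arrays) is a placeholder, not the repository's API:

```python
import numpy as np
from scipy.spatial import cKDTree

def chamfer_distance(pred_pts, gt_pts):
    # Symmetric Chamfer Distance between two (N, 3) surface samplings:
    # mean squared nearest-neighbor distance, taken in both directions.
    d_pred = cKDTree(gt_pts).query(pred_pts)[0]
    d_gt = cKDTree(pred_pts).query(gt_pts)[0]
    return (d_pred ** 2).mean() + (d_gt ** 2).mean()

def iou_from_sdf(pred_sdf, gt_sdf, threshold=0.0):
    # A query point counts as "inside" where its SDF is below the threshold.
    pred_in = pred_sdf < threshold
    gt_in = gt_sdf < threshold
    union = np.logical_or(pred_in, gt_in).sum()
    return np.logical_and(pred_in, gt_in).sum() / max(union, 1)

if __name__ == '__main__':
    rng = np.random.default_rng(0)
    pts_a, pts_b = rng.normal(size=(2, 4096, 3))
    print(chamfer_distance(pts_a, pts_b))
    print(iou_from_sdf(rng.normal(size=65536), rng.normal(size=65536)))
```

If an evaluation reports raw (unscaled) CD while the paper reports scaled values, the numbers will look far apart even for identical reconstructions, so it is worth confirming the convention before concluding that the model underperforms.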

I currently have one question: could you provide your trained coarse prediction model for the ShapeNet data? I suspect that something went wrong in my first-stage training and led to the poor results.

Looking forward to your reply!

Best.