pengzhiliang / Conformer

Official code for Conformer: Local Features Coupling Global Representations for Visual Recognition
Apache License 2.0
531 stars, 87 forks

training question #13

Closed JhihJhe closed 3 years ago

JhihJhe commented 3 years ago

Thanks for your nice work! I ran into a problem when training from scratch on custom data. The error message is shown below:

D:\dl\Conformer-main>python main.py --model Conformer_small_patch16 --data-set IMNET --batch-size 4 --lr 0.001 --num_workers 0 --data-path ./datasets/test/ --output_dir ./output/test/ --epochs 10
Not using distributed mode
Namespace(aa='rand-m9-mstd0.5-inc1', batch_size=4, clip_grad=None, color_jitter=0.4, cooldown_epochs=10, cutmix=1.0, cutmix_minmax=None, data_path='./datasets/test/', data_set='IMNET', decay_epochs=30, decay_rate=0.1, device='cuda', dist_url='env://', distributed=False, drop=0.0, drop_block=None, drop_path=0.1, epochs=10, eval=False, evaluate_freq=1, finetune='', inat_category='name', input_size=224, lr=0.001, lr_noise=None, lr_noise_pct=0.67, lr_noise_std=1.0, min_lr=1e-05, mixup=0.8, mixup_mode='batch', mixup_prob=1.0, mixup_switch_prob=0.5, model='Conformer_small_patch16', model_ema=True, model_ema_decay=0.99996, model_ema_force_cpu=False, momentum=0.9, num_workers=0, opt='adamw', opt_betas=None, opt_eps=1e-08, output_dir='./output/test/', patience_epochs=10, pin_mem=True, recount=1, remode='pixel', repeated_aug=True, reprob=0.25, resplit=False, resume='', sched='cosine', seed=0, smoothing=0.1, start_epoch=0, train_interpolation='bicubic', warmup_epochs=5, warmup_lr=1e-06, weight_decay=0.05, world_size=1)
Creating model: Conformer_small_patch16
number of params: 37673424
Start training
Traceback (most recent call last):
  File "main.py", line 375, in <module>
    main(args)
  File "main.py", line 335, in main
    set_training_mode=args.finetune == ''  # keep in eval mode during finetuning
  File "D:\dl\Conformer-main\engine.py", line 30, in train_one_epoch
    for samples, targets in metric_logger.log_every(data_loader, print_freq, header):
  File "D:\dl\Conformer-main\utils.py", line 157, in log_every
    header, total_time_str, total_time / len(iterable)))
ZeroDivisionError: float division by zero

Any help would be appreciated, thanks!

pengzhiliang commented 3 years ago

It looks like a problem with the dataset. Did you load the dataset correctly? You can use print(len(dataset_train)) to check.
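A minimal sketch of why the loader length matters here, not only the dataset length. The traceback divides by len(iterable), so the error means the data loader reported zero batches. For a map-style PyTorch loader, the batch count is the sample count divided by the batch size, rounded down when drop_last is set and up otherwise. Whether this repository's loader sets drop_last, or uses a sampler that truncates the sample count, is an assumption not confirmed in this thread. floor_to_multiple and its block size of 256 are hypothetical and only illustrate how such a sampler could reduce 18 images to zero.

```python
import math


def num_batches(num_samples: int, batch_size: int, drop_last: bool) -> int:
    """Batches per epoch for a map-style loader with a plain sampler."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


def floor_to_multiple(num_samples: int, block: int) -> int:
    """Hypothetical sampler rule: keep only whole blocks of samples."""
    return num_samples // block * block


def average_iter_time(total_time: float, batches: int) -> float:
    """Guarded form of the `total_time / len(iterable)` division."""
    if batches == 0:
        raise ValueError(
            "data loader is empty: check dataset size, batch size, and sampler"
        )
    return total_time / batches


print(num_batches(18, 4, drop_last=True))  # 4
print(num_batches(2, 4, drop_last=True))   # 0
# Hypothetical sampler that keeps only whole blocks of 256 samples:
print(num_batches(floor_to_multiple(18, 256), 4, drop_last=True))  # 0
```

Printing len(data_loader_train) next to len(dataset_train) would show directly whether the loader, rather than the dataset, is the empty one.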

JhihJhe commented 3 years ago

Thanks for your answer! Here is the result of the check: my test image data has 18 images.

image

The printed result also shows 18 images. My directory structure is the same as yours:

./datasets/test/
  train/
    fail/
      img1.jpg
    pass/
      img2.jpg
  val/
    fail/
      img3.jpg
    pass/
      img4.jpg
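A layout like this can be checked per split and per class with a short script. This is a sketch under stated assumptions: count_images and IMAGE_EXTENSIONS are hypothetical names that are not part of the repository, and the extension list is a guess at what counts as an image.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match the actual data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images(root):
    """Return {split: {class_name: image_count}} for root/<split>/<class>/<image>."""
    counts = {}
    for split_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        counts[split_dir.name] = {
            class_dir.name: sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
            )
            for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir())
        }
    return counts


if __name__ == "__main__":
    root = Path("./datasets/test/")
    if root.is_dir():  # only report when the folder from this thread exists
        print(count_images(root))
```

A split with very few training images is worth noting, since a small count can interact badly with the batch size.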

Thanks a lot!

zhaozhiyi11 commented 3 years ago

Can I use Conformer in an object detection network? For example, centernet.

pengzhiliang commented 3 years ago

@JhihJhe I'm sorry for the late reply. If the dataset is not the problem, I am not sure what the specific cause is. I suggest testing with the ImageNet2012 dataset.

pengzhiliang commented 3 years ago

@zhaozhiyi11 Of course you can use Conformer to replace the backbone of centernet, but I cannot guarantee its performance. If you run an experiment, you are welcome to report the results. If you encounter a problem, I can also help solve it. Thanks!

2252033991 commented 1 year ago

Performance

May I ask how your performance was?