pp00704831 / BANet-TIP-2022


Training #2

Closed by Brightlcz 2 years ago

Brightlcz commented 2 years ago

I would like to reproduce the result of BANet+, but it looks like the pre-training alone takes a week or more. So my questions are: (1) do the other ablation experiments also retrain the model this way? (2) Why is pre-training required before switching to fine-tuning on 512 patches? Is there a big difference in performance?

pp00704831 commented 2 years ago

Hello,

  1. Yes, all of our ablation studies use the same training strategy.
  2. Since we find that performance drops when the training patch size (256x256) differs too much from the testing image size (1280x720), we fine-tune with larger patches (512x512) for the last 1,000 epochs; thus, this step is important.
     "Improving Image Restoration by Revisiting Global Information Aggregation" also reports a similar effect of the patch size. (A rough sketch of this two-stage schedule is given at the end of this reply.)

It also takes us about a week to train BANet+. I will release the weights of BANet+ soon. Thank you for your interest!
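For readers who want the shape of this two-stage schedule in code, here is a minimal runnable sketch: a toy random-crop dataset and a single Conv2d stand in for the GoPro data and BANet+, and the learning rates, epoch counts, and batch size are illustrative placeholders; only the 256-then-512 crop sizes and the idea of a shorter large-patch fine-tuning stage come from the reply above.

import torch
from torch import nn
from torch.utils.data import DataLoader, Dataset


class RandomPairs(Dataset):
    # Toy stand-in for a blur/sharp pair dataset with a configurable crop size.
    def __init__(self, crop_size, length=8):
        self.crop_size = crop_size
        self.length = length

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        blur = torch.rand(3, self.crop_size, self.crop_size)
        sharp = torch.rand(3, self.crop_size, self.crop_size)
        return blur, sharp


def run_stage(model, crop_size, epochs, lr, batch_size=2):
    # One training stage at a fixed crop size.
    loader = DataLoader(RandomPairs(crop_size), batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for blur, sharp in loader:
            optimizer.zero_grad()
            loss = nn.functional.l1_loss(model(blur), sharp)
            loss.backward()
            optimizer.step()


model = nn.Conv2d(3, 3, 3, padding=1)                 # toy model in place of BANet+
run_stage(model, crop_size=256, epochs=1, lr=1e-4)    # stage 1: pre-train on small crops
run_stage(model, crop_size=512, epochs=1, lr=1e-5)    # stage 2: fine-tune on larger crops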

pp00704831 commented 2 years ago

I have updated the testing code for BANet+.

Brightlcz commented 2 years ago

@pp00704831 Thanks for your timely reply. Retraining the model is too time-consuming; is it because training is difficult that you didn't directly train on larger patches (512x512)? By the way, why do you need to warm up for ten iterations? https://github.com/pp00704831/BANet/blob/44d32544c686ae34eb1ed3843996573eadf0f7b8/predict_BANet_Plus_GoPro_test_results.py#L52 And why do you subtract and then add 0.5 here? Once again, my thanks to you!

pp00704831 commented 2 years ago

Hi,

If we directly train with large patches (512x512), it would take a lot of time. We warm up for some iterations because the first runs have some start-up delay, so they should not be counted. We shift the image values from [0, 1] to [-0.5, 0.5] so that they roughly follow a zero-mean distribution.
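As a hedged illustration of both points (not the repository's actual prediction script), the sketch below runs a few warm-up forward passes before timing inference and shifts the input by 0.5 before the model and back afterwards; the Conv2d model is a stand-in for BANet+.

import time
import torch
from torch import nn

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = nn.Conv2d(3, 3, 3, padding=1).to(device).eval()   # stand-in for BANet+
image = torch.rand(1, 3, 720, 1280, device=device)        # input values in [0, 1]

with torch.no_grad():
    # Warm-up: the first few forward passes pay one-off start-up costs
    # (CUDA context, kernel selection), so they are excluded from timing.
    for _ in range(10):
        model(image - 0.5)
    if device == 'cuda':
        torch.cuda.synchronize()

    start = time.time()
    # Shift [0, 1] inputs to [-0.5, 0.5] (roughly zero mean), run the model,
    # then shift the output back to the displayable range.
    output = model(image - 0.5) + 0.5
    if device == 'cuda':
        torch.cuda.synchronize()
    print('inference time: %.3f s' % (time.time() - start))

output = output.clamp(0, 1)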

Brightlcz commented 2 years ago

Hello @pp00704831. When I fine-tune the model, there is a sharp drop in PSNR. Is this normal? (screenshot attached)

pp00704831 commented 2 years ago

That is not normal. We use batch size 8 for training; you could try that.

vandana1302238 commented 1 year ago

Hi team, when I try to run pretrained.py, I get the errors below. Kindly help me sort this out.

W1019 19:59:57.527494 30472 warnings.py:109] C:\repos\venv\lib\site-packages\albumentations\imgaug\transforms.py:222: FutureWarning: IAASharpen is deprecated. Please use Sharpen instead
  warnings.warn("IAASharpen is deprecated. Please use Sharpen instead", FutureWarning)

W1019 19:59:57.527494 30472 warnings.py:109] C:\repos\venv\lib\site-packages\albumentations\imgaug\transforms.py:165: FutureWarning: This augmentation is deprecated. Please use Emboss instead
  warnings.warn("This augmentation is deprecated. Please use Emboss instead", FutureWarning)

I1019 19:59:57.529473 30472 dataset.py:28] Subsampling buckets from 0 to 100, total buckets number is 100
I1019 19:59:57.530474 30472 dataset.py:71] Dataset has been created with 370 samples
I1019 19:59:57.532444 30472 dataset.py:28] Subsampling buckets from 0 to 100, total buckets number is 100
I1019 19:59:57.532444 30472 dataset.py:71] Dataset has been created with 20 samples
Epoch 0, lr 0.0001:   0%|          | 0/2103 [00:03<?, ?it/s]

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\van\AppData\Local\Programs\Python\Python38\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Users\van\AppData\Local\Programs\Python\Python38\lib\multiprocessing\spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

Traceback (most recent call last):
  File "C:/repos/BANet-TIP-2022-main/pretrained.py", line 162, in <module>
    trainer.train()
  File "C:/repos/BANet-TIP-2022-main/pretrained.py", line 48, in train
    self._run_epoch(epoch)
  File "C:/repos/BANet-TIP-2022-main/pretrained.py", line 75, in _run_epoch
    for data in tq:
  File "C:\repos\venv\lib\site-packages\tqdm\std.py", line 1182, in __iter__
    for obj in iterable:
  File "C:\repos\venv\lib\site-packages\torch\utils\data\dataloader.py", line 442, in __iter__
    return self._get_iterator()
  File "C:\repos\venv\lib\site-packages\torch\utils\data\dataloader.py", line 388, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "C:\repos\venv\lib\site-packages\torch\utils\data\dataloader.py", line 1043, in __init__
    w.start()
  File "C:\Users\van\AppData\Local\Programs\Python\Python38\lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "C:\Users\van\AppData\Local\Programs\Python\Python38\lib\multiprocessing\context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\van\AppData\Local\Programs\Python\Python38\lib\multiprocessing\context.py", line 327, in _Popen
    return Popen(process_obj)
  File "C:\Users\van\AppData\Local\Programs\Python\Python38\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\van\AppData\Local\Programs\Python\Python38\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'get_transforms.<locals>.process'

Process finished with exit code 1
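A note on the final AttributeError: on Windows, PyTorch DataLoader workers are started with the spawn method, so everything the dataset object holds, including its transform, must be picklable, and a function defined locally inside get_transforms() ('get_transforms.<locals>.process') cannot be pickled. Below is a minimal sketch of two common workarounds with a toy dataset standing in for the repository's own; the names are illustrative, not taken from the repo.

import torch
from torch.utils.data import DataLoader, Dataset


def identity_transform(image):
    # Module-level function: picklable, unlike a closure defined inside
    # get_transforms(), which the spawn start method cannot serialize.
    return image


class ToyPairs(Dataset):
    # Toy stand-in for the blur/sharp pair dataset used by pretrained.py.
    def __init__(self, transform):
        self.transform = transform

    def __len__(self):
        return 4

    def __getitem__(self, idx):
        blur = torch.rand(3, 256, 256)
        sharp = torch.rand(3, 256, 256)
        return self.transform(blur), sharp


if __name__ == '__main__':  # guard is required on Windows when num_workers > 0
    dataset = ToyPairs(transform=identity_transform)

    # Workaround 1: keep multiprocessing workers, but make sure everything the
    # dataset holds (transforms included) is defined at module level.
    loader = DataLoader(dataset, batch_size=2, num_workers=2)

    # Workaround 2: set num_workers=0 so the dataset is never pickled at all.
    loader = DataLoader(dataset, batch_size=2, num_workers=0)

    for blur, sharp in loader:
        pass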