swz30 / MPRNet

[CVPR 2021] Multi-Stage Progressive Image Restoration. SOTA results for Image deblurring, deraining, and denoising.

training problem #138

Closed yanghan0617 closed 3 months ago

yanghan0617 commented 7 months ago

How can I solve this problem?

(pytorch_yh) I:\yh\code\MPR\Deraining>python train.py
D:\Anaconda\envs\pytorch_yh\lib\site-packages\torch\optim\lr_scheduler.py:131: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  warnings.warn("Detected call of lr_scheduler.step() before optimizer.step(). "
===> Start Epoch 1 End Epoch 101
===> Loading datasets
  1%|███▏ | 5/347 [00:07<08:45, 1.54s/it]
Traceback (most recent call last):
  File "I:\yh\code\MPR\Deraining\train.py", line 111, in <module>
    for i, data in enumerate(tqdm(train_loader), 0):
  File "D:\Anaconda\envs\pytorch_yh\lib\site-packages\tqdm\std.py", line 1182, in __iter__
    for obj in iterable:
  File "D:\Anaconda\envs\pytorch_yh\lib\site-packages\torch\utils\data\dataloader.py", line 652, in __next__
    data = self._next_data()
  File "D:\Anaconda\envs\pytorch_yh\lib\site-packages\torch\utils\data\dataloader.py", line 692, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "D:\Anaconda\envs\pytorch_yh\lib\site-packages\torch\utils\data\_utils\fetch.py", line 52, in fetch
    return self.collate_fn(data)
  File "D:\Anaconda\envs\pytorch_yh\lib\site-packages\torch\utils\data\_utils\collate.py", line 175, in default_collate
    return [default_collate(samples) for samples in transposed]  # Backwards compatibility.
  File "D:\Anaconda\envs\pytorch_yh\lib\site-packages\torch\utils\data\_utils\collate.py", line 175, in <listcomp>
    return [default_collate(samples) for samples in transposed]  # Backwards compatibility.
  File "D:\Anaconda\envs\pytorch_yh\lib\site-packages\torch\utils\data\_utils\collate.py", line 141, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: stack expects each tensor to be equal size, but got [3, 244, 256] at entry 0 and [3, 104, 256] at entry 1
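
The RuntimeError above says that two samples in one batch have different heights (244 and 104) while sharing a width of 256, so they cannot be stacked into a single tensor. A minimal sketch of the size check and the padding arithmetic follows. It assumes a training patch size of 256 (inferred only from the shared width, not confirmed in this thread), and the helper names `pad_needed` and `stackable` are hypothetical.

```python
def pad_needed(height, width, patch_size):
    """Return (pad_h, pad_w): extra rows and columns needed to reach patch_size."""
    return max(patch_size - height, 0), max(patch_size - width, 0)


def stackable(shapes):
    """Stacking needs identical shapes, so at most one distinct shape may occur."""
    return len(set(shapes)) <= 1


# Shapes (channels, height, width) copied from the RuntimeError.
shapes = [(3, 244, 256), (3, 104, 256)]
patch_size = 256  # assumption: inferred from the shared width of 256

print(stackable(shapes))  # False: the two heights differ

padded = []
for channels, height, width in shapes:
    pad_h, pad_w = pad_needed(height, width, patch_size)
    padded.append((channels, height + pad_h, width + pad_w))

print(padded)             # [(3, 256, 256), (3, 256, 256)]
print(stackable(padded))  # True: every sample now has the same shape
```

The second image would need 152 extra rows on a height of 104. If the padding is applied with PyTorch's reflect mode on tensors, that is more than the mode allows, since the pad must be smaller than the dimension being padded; resizing the small images or using a different padding mode may be needed in that case.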

Feecuin commented 5 months ago

Hello, have you encountered this problem before? [image]

Feecuin commented 5 months ago

Hi, were you able to train after that? Does your loss go down? Mine drops to 18 and then stops moving, and the PSNR won't go up either.

Feecuin commented 5 months ago

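[image] So what is the problem in my case? I followed the README and only changed lines 120 and 121, because the error said the value was not a tensor. I then changed them to this: [image] But the PSNR won't go up, and training runs very fast even though my dataset has 2000 images. I suspect the images are not being read correctly?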

Feecuin commented 5 months ago

I changed it the same way you did. At first I couldn't train either; I made the change based on what I saw on GitHub, the same as yours, and then it worked. What machine are you running on, and which dataset are you using?

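I'm running on a 3090, and the dataset is a night-time rain dataset from Kaggle. Could you share the GitHub post you looked at? I've gone through the issues twice and still can't figure out what to change.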

Feecuin commented 5 months ago

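I set save_images to True in the yml file, but after the run finished I didn't see any images saved. Does that mean something is already wrong where I read the images?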

Feecuin commented 5 months ago

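Could I take a look at your data-loading code, and at how your images are laid out on disk? I store mine like this; that should be fine, right? [image]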
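
For a layout question like this one, a small check of the folders can help before training. The sketch below assumes the training directory contains `input` and `target` subfolders holding identically named image files; that layout, the extension list, and the helper names `list_images`, `check_pairs` and `demo` are assumptions for illustration, not details confirmed in this thread.

```python
import os
import tempfile

# Assumption: these are the image extensions present in the dataset.
IMAGE_EXTS = (".png", ".jpg", ".jpeg", ".bmp")


def list_images(folder):
    """Return the sorted image file names found directly inside folder."""
    return sorted(f for f in os.listdir(folder) if f.lower().endswith(IMAGE_EXTS))


def check_pairs(root):
    """Compare root/input with root/target and report unpaired file names."""
    inputs = set(list_images(os.path.join(root, "input")))
    targets = set(list_images(os.path.join(root, "target")))
    return {
        "paired": len(inputs & targets),
        "only_input": sorted(inputs - targets),
        "only_target": sorted(targets - inputs),
    }


def demo():
    """Build a throwaway layout with one missing target file and check it."""
    with tempfile.TemporaryDirectory() as root:
        for sub, names in (("input", ["1.png", "2.png"]), ("target", ["1.png"])):
            os.makedirs(os.path.join(root, sub))
            for name in names:
                # Empty placeholder files are enough for a name-level check.
                open(os.path.join(root, sub, name), "wb").close()
        return check_pairs(root)


print(demo())
```

To check a real dataset, call `check_pairs` on the training directory; a `paired` count far below the number of images on disk would point at the layout or the file names rather than at the model.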

Feecuin commented 5 months ago

The README gives a Google Drive link, but I can't download from it, so I can't see what layout the dataset is stored in.

Feecuin commented 5 months ago

The code runs, so reading the data should be fine.

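I think the images are not being read, because the images saved every five epochs are empty, and with 2000 images an epoch finishes in just a few seconds.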
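
One way to test that suspicion is to compare the total shown by the training progress bar with the number of iterations 2000 images should produce. The sketch below follows how a PyTorch DataLoader counts batches (rounding up unless `drop_last` is set); the batch sizes used are hypothetical, since the thread does not state one, and `iterations_per_epoch` is an illustrative helper name.

```python
import math


def iterations_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of batches one epoch yields for a dataset of num_samples items."""
    if drop_last:
        # An incomplete final batch is discarded.
        return num_samples // batch_size
    # An incomplete final batch still counts as one iteration.
    return math.ceil(num_samples / batch_size)


# Hypothetical batch sizes; substitute the value from the training config.
print(iterations_per_epoch(2000, 16))                 # 125
print(iterations_per_epoch(2000, 3))                  # 667
print(iterations_per_epoch(2000, 3, drop_last=True))  # 666
```

If the progress bar reports far fewer iterations than this, the dataset object is probably seeing only a fraction of the files, and printing `len()` of the dataset right after it is built would confirm that.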

Feecuin commented 5 months ago

What did you say? Would you mind sharing contact details so I can ask you directly?

ak-karimzai commented 4 months ago

@Feecuin Hi, could you please share the details of your training?

Feecuin commented 4 months ago

OK, but I still haven't succeeded in training. [image]