Another size of image - Githubissues

brian411227 commented 2 years ago

Thank you for your research! I am now trying to generate a new "checkpoint_512_celeba-hq.pt" for 512x512 images. However, something still goes wrong in the test.py phase.

The error message is: RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

What can I do now? Do you have the checkpoint_512_celeba-hq.pt file?

imlixinyang commented 2 years ago

Try this simple snippet to check whether PyTorch and CUDA are installed correctly:

import torch
print(torch.__version__)
print(torch.version.cuda)

Did you train your own "checkpoint_512_celeba-hq.pt"? Training at that resolution needs a lot of GPU memory (i.e., at least 4x Tesla V100), as noted in this comment: https://github.com/imlixinyang/HiSD/issues/12#issuecomment-840404898

The released checkpoint accepts 512x512 images as input, since it automatically resizes the image to the corresponding resolution.

brian411227 commented 2 years ago

Sorry, but I can't see the content at this URL: https://github.com/imlixinyang/HiSD/issues/12#issuecomment-840404898 It says "No results matched your search."

The result is torch.__version__ = 1.0.1.post2 and torch.version.cuda = 10.0.130.

Now, I don't know whether only checkpoint_512 can output 512x512 images. Another question: where can I configure the output image size? Thank you for your reply.

imlixinyang commented 2 years ago

Only checkpoint_512, which is trained at 512 resolution, can output 512x512 images directly. If you want to use the 256x256 checkpoint to output 512x512 images, you should upsample the output in the test code. Please make sure the original test code runs successfully first, and then try your own modifications.
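The upsampling step mentioned above could look roughly like the following. This is a minimal sketch rather than code from the HiSD repo: the helper name `upsample_output` is hypothetical, and it assumes the generator output is a float tensor of shape (N, C, H, W), for example a 256x256 result.

```python
import torch
import torch.nn.functional as F


def upsample_output(x_out, size=512):
    """Resize a batch of generated images to size x size.

    Hypothetical helper (not part of HiSD). x_out is assumed to be a
    float tensor of shape (N, C, H, W), e.g. a 256x256 generator output.
    """
    # Bilinear interpolation; align_corners=False matches the PyTorch
    # default and is passed explicitly to avoid a warning.
    return F.interpolate(x_out, size=(size, size), mode='bilinear',
                         align_corners=False)


if __name__ == "__main__":
    fake_output = torch.rand(1, 3, 256, 256)  # stand-in for a generated image
    print(upsample_output(fake_output).shape)
```

Interpolation only enlarges the image; it does not add the detail that a model trained at 512 resolution would produce.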

brian411227 commented 2 years ago

Traceback (most recent call last):
  File "core/train.py", line 64, in <module>
    G_adv, G_sty, G_rec, D_adv = trainer.update(x, y, i, j, j_trg)
  File "/home/HiSD/core/trainer.py", line 139, in update
    x_trg, x_cyc, s, s_trg = self.models((x, y, i, j, j_trg), mode='gen')
  File "/home/anaconda3/envs/HiSD/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/HiSD/core/trainer.py", line 30, in forward
    return self.gen_losses(args)
  File "/home/HiSD/core/trainer.py", line 40, in gen_losses
    e = self.gen.encode(x)
  File "/home/HiSD/core/networks.py", line 126, in encode
    e = self.encoder(x)
  File "/home/anaconda3/envs/HiSD/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/anaconda3/envs/HiSD/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/home/anaconda3/envs/HiSD/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/anaconda3/envs/HiSD/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 320, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

I ran the original test code, which produced the message above. I have no idea what is wrong.

imlixinyang commented 2 years ago

It looks like the cause is the Python environment or its packages. I recommend using a conda environment (e.g., Anaconda) and reinstalling cudatoolkit and PyTorch following the repo's instructions.
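After reinstalling, a quick check along these lines may help confirm whether convolutions run at all on the GPU, independent of HiSD. This is only a sketch: the function name `check_cuda_conv` and the tensor sizes are arbitrary choices, not something from the repo. It falls back to the CPU when CUDA is not available.

```python
import torch


def check_cuda_conv():
    """Report the PyTorch/CUDA/cuDNN setup and run one small convolution."""
    info = {
        "torch": torch.__version__,
        "cuda_build": torch.version.cuda,            # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
        "cudnn": torch.backends.cudnn.version(),     # None if cuDNN is unavailable
    }
    device = torch.device("cuda" if info["cuda_available"] else "cpu")
    conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1).to(device)
    x = torch.rand(1, 3, 64, 64, device=device)
    with torch.no_grad():
        y = conv(x)  # on a GPU this goes through the same conv path as the traceback
    if device.type == "cuda":
        torch.cuda.synchronize()  # surface asynchronous CUDA errors here
    info["conv_output_shape"] = tuple(y.shape)
    return info


if __name__ == "__main__":
    print(check_cuda_conv())
```

If this small convolution already raises a cuDNN error, the problem most likely lies in the installation (driver, CUDA toolkit, or PyTorch build) rather than in the HiSD code.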

imlixinyang / HiSD

Another size of image #28