GlassyWing / nvae

An unofficial toy implementation of NVAE: "A Deep Hierarchical Variational Autoencoder"
Apache License 2.0

Can you write another README document to show how to run this code with my own data? #1

Closed ganyibo closed 3 years ago

ganyibo commented 4 years ago

Can you write another README document to show how to run this code with my own data? I would appreciate it. Thanks!

GlassyWing commented 4 years ago

Can you write another README document to show how to run this code with my own data? I would appreciate it. Thanks!

Hi, you can train the model on your own data with the command:

python train.py --dataset_path <img_directory> --batch_size 128
Lukelluke commented 4 years ago

Can you write another README document to show how to run this code with my own data? I would appreciate it. Thanks!

Hi, you can train the model on your own data with the command:

python train.py --dataset_path <img_directory> --batch_size 128

Hello, when I try to change my data to 512*512, where in your implementation should I make the modification? I tried changing '64' to '512' in train.py, but I get this error: Sizes of tensors must match except in dimension 1. Got 4 and 32 in dimension 2 at /pytorch/aten/src/THC/generic/THCTensorMath.cu:71

Could you please help me out?

GlassyWing commented 4 years ago

The image is downsampled by a factor of 32, so the initial feature-map size should be 512/32 = 16. You need to change map_h and map_w to 16: https://github.com/GlassyWing/nvae/blob/0fd7dd2e587adea0742ae09dd80f06a7e9a1a655/nvae/decoder.py#L139
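As a quick sketch of the relationship (the helper function below is hypothetical, not part of the repo; only the names map_h and map_w come from decoder.py), the initial map size is just the image size divided by the total downsampling factor:

```python
# Hypothetical helper: the encoder downsamples the input by a factor of 32,
# so the decoder's initial feature map (map_h, map_w) must be img_size // 32.
def initial_map_size(img_size: int, downsample_factor: int = 32) -> int:
    if img_size % downsample_factor != 0:
        raise ValueError("image size must be divisible by the downsample factor")
    return img_size // downsample_factor

print(initial_map_size(64))    # 2   -> the default 64*64 setting
print(initial_map_size(512))   # 16  -> map_h = map_w = 16 for 512*512 inputs
```

This also explains the earlier tensor-size error: with the image size changed but map_h/map_w left at their 64*64 defaults, the decoder's feature maps no longer line up with the encoder's.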

Lukelluke commented 4 years ago

The image is downsampled by a factor of 32, so the initial feature-map size should be 512/32 = 16. You need to change map_h and map_w to 16: https://github.com/GlassyWing/nvae/blob/0fd7dd2e587adea0742ae09dd80f06a7e9a1a655/nvae/decoder.py#L139

Thank you very much for the guidance. I'm now running a modified version on 1024*1024 images that replaces the grouped 5x5 convolution in common.py with a depthwise-separable convolution:

```python
# nn.Conv2d(n_group * dim, n_group * dim, kernel_size=5, padding=2, groups=n_group),
nn.Conv2d(n_group * dim, n_group * dim, kernel_size=5, stride=1, padding=2, groups=n_group * dim),
nn.Conv2d(n_group * dim, n_group * dim, kernel_size=1, stride=1, padding=0, groups=1),
```

I will post you the results; I hope it can get as close to the original paper's performance as possible.
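For what it's worth, a quick parameter count shows why the depthwise-separable split is cheaper at high resolution (the n_group and dim values below are made up for illustration, not the repo's actual settings):

```python
# Parameter count of an nn.Conv2d layer (ignoring bias):
#   out_ch * (in_ch // groups) * k * k
def conv_params(in_ch: int, out_ch: int, k: int, groups: int = 1) -> int:
    return out_ch * (in_ch // groups) * k * k

G, d = 4, 32  # hypothetical n_group and dim
grouped_5x5   = conv_params(G * d, G * d, 5, groups=G)      # original layer
depthwise_5x5 = conv_params(G * d, G * d, 5, groups=G * d)  # one 5x5 filter per channel
pointwise_1x1 = conv_params(G * d, G * d, 1)                # 1x1 conv mixes channels
print(grouped_5x5, depthwise_5x5 + pointwise_1x1)  # 102400 vs 19584
```

The depthwise 5x5 handles spatial filtering per channel and the 1x1 conv restores cross-channel mixing, so expressiveness is roughly preserved at a fraction of the parameters.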

P.S. By the way, why do you think this implementation doesn't perform well on 64*64 data? I read through the code and the original paper; the gap may just be in the "BatchNorm" part. There are several tricks mentioned in the paper that I don't clearly understand how to implement, but they don't seem to be the core secret of NVAE's strong performance.

Besides the above, is there any part of the paper that, in your opinion, hasn't been or can't be implemented?

Looking forward to hearing from you!

Sincerely, Luke Huang

Lukelluke commented 4 years ago

Well, I noticed that the official implementation was released two days ago. Good job!


0907 Update:

I tried to run the official NVAE implementation on CelebA 64 but couldn't get it to work at all. For the data-preparation step I downloaded the data manually, mounted it in, and converted it to LMDB. For the training step, even after shrinking the batch size and model-size parameters as much as possible, I kept hitting GPU device errors and out-of-memory errors (single-GPU training in a dual-GPU environment).

Have you managed to get it running?

Any pointers would be much appreciated!