About the image reshaping when preparing my own dataset

alexlee-gk / video_prediction

Stochastic Adversarial Video Prediction

https://alexlee-gk.github.io/video_prediction/

MIT License

303 stars, 65 forks

About the image reshaping when preparing my own dataset #16

Closed: happyday521 closed this issue 5 years ago

happyday521 commented 5 years ago

Hi, Alex!

Thanks for releasing the code! I am having some trouble running it. I want to run your code on my own dataset, so I preprocessed it into TFRecords files and defined a dataset class for it (based on kth_dataset.py). I also need to change the image size to 128x128.

Can you tell me what else I should modify, and where, to prepare my own dataset (128x128, RGB) beyond what I have already done? Do I need to modify the network architecture (such as the layers) to fit my resolution (128x128)? Thanks very much! Looking forward to your reply!

alexlee-gk commented 5 years ago

In the kth_dataset.py script, you can preprocess to 128x128 images by passing the flag --image_size 128. The kth script assumes that the images are grayscale so it only keeps one of the channels, which you might not want to do for your dataset.
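To illustrate the grayscale point, here is a minimal sketch of the difference between keeping one channel and keeping all three when resizing frames to 128x128. It is not the code from kth_dataset.py: the helpers `resize_nearest` and `preprocess` are hypothetical names made up for this example, and the nearest-neighbor resize is only a stand-in for whatever resizing the real script performs.

```python
import numpy as np


def resize_nearest(frame, size):
    """Nearest-neighbor resize of an (H, W, C) frame to (size, size, C).

    Hypothetical helper for illustration only.
    """
    h, w = frame.shape[:2]
    rows = np.arange(size) * h // size  # source row for each output row
    cols = np.arange(size) * w // size  # source column for each output column
    return frame[rows[:, None], cols]


def preprocess(frame, size=128, grayscale=False):
    """Resize a frame; optionally keep only the first channel."""
    out = resize_nearest(frame, size)
    if grayscale:
        # Keeps a single channel, which discards color information.
        out = out[..., :1]
    return out


# Fake 240x320 RGB frame standing in for one video frame.
frame = np.random.randint(0, 256, size=(240, 320, 3), dtype=np.uint8)

rgb = preprocess(frame)                   # keeps all three channels
gray = preprocess(frame, grayscale=True)  # keeps only the first channel

print(rgb.shape, gray.shape)  # (128, 128, 3) (128, 128, 1)
```

For an RGB dataset, the point is to skip the channel-dropping step, so each stored frame has shape (128, 128, 3) rather than (128, 128, 1).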

The model should already work for 128x128 images, but it uses a lot of memory. I haven't tried many architectures at 128x128 resolution, so you might want to adjust the layer specification to use less memory.
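As a rough back-of-the-envelope check on the memory point, the sketch below estimates how convolutional activation memory grows with resolution. The layer layout (three stride-2 layers starting at 32 channels) and the 64x64 baseline are assumptions chosen for illustration, not the repository's actual layer specification, and `encoder_activation_bytes` is a hypothetical helper.

```python
def activation_bytes(height, width, channels, batch=1, bytes_per_value=4):
    """Bytes needed to store one float32 feature map."""
    return batch * height * width * channels * bytes_per_value


def encoder_activation_bytes(image_size, base_channels=32, num_layers=3):
    """Sum activation memory over a hypothetical stride-2 conv encoder.

    Each layer halves the spatial size and doubles the channel count.
    """
    total = 0
    size, channels = image_size, base_channels
    for _ in range(num_layers):
        total += activation_bytes(size, size, channels)
        size //= 2
        channels *= 2
    return total


small = encoder_activation_bytes(64)    # assumed baseline resolution
large = encoder_activation_bytes(128)   # resolution asked about here
slim = encoder_activation_bytes(128, base_channels=16)  # fewer channels

print(large / small)  # 4.0
print(slim / large)   # 0.5
```

Under these assumptions, doubling the side length quadruples activation memory, while halving the channel count in each layer halves it. That is one way a slimmer layer specification could offset part of the cost of the higher resolution.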

happyday521 commented 5 years ago

Got it. Thanks!