Questions for run train.py

JiahuiYu / generative_inpainting

DeepFill v1/v2 with Contextual Attention and Gated Convolution, CVPR 2018, and ICCV 2019 Oral

http://jiahuiyu.com/deepfill/


Questions for run train.py #41

Closed jiyoungAn closed 6 years ago

jiyoungAn commented 6 years ago

Hi, I am a beginner in tensorflow and I'm very interested in your paper. I'll be very grateful if you could answer my question.

1. Running on CPU only. I am trying to run train.py using only the CPU, so I changed NUM_GPUS to 0 in inpaint.yml. However, an error related to reading the GPU still occurs, and the run stops after the line below:
[2018-05-23 19:02:54 @weights_viewer.py:60] Total size of trainable weights: 0G 10M 184K 136B (Assuming32-bit data type.)
Are there any settings that need to be changed, or is it impossible to run using only the CPU?
2. Adding training images to an existing model. I'd like to add some training images (800 images, shape (255, 255, 3)) to your Places2 model. My plan is: 1) download the Places2 model you've built, 2) place the model in the model_logs folder, 3) train. Done this way, will my images be added to the existing model?

Regards, jiyoungAn

JiahuiYu commented 6 years ago

1. Please remove lines here. It is highly recommended to have GPUs to train the model, otherwise progress will be very slow.
2. Yes, your images with randomly sampled masks will be used to fine-tune the existing model. Make sure you have validation images to track the training progress, otherwise the model may easily overfit on your 800 images.
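
One way to set aside validation images before fine-tuning is sketched below. It assumes the training and validation sets are given as plain-text file lists with one image path per line; the function names, the 10% split, and the output file names are illustrative choices, not something specified in this thread.

```python
import random
from pathlib import Path

# Assumption: these are the image types in the custom dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def split_file_list(image_dir, val_fraction=0.1, seed=0):
    """Return (train_paths, val_paths) for the images found under image_dir."""
    paths = sorted(
        str(p) for p in Path(image_dir).rglob("*")
        if p.suffix.lower() in IMAGE_EXTENSIONS
    )
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(paths)
    # Keep at least one validation image whenever any images exist.
    num_val = max(1, int(len(paths) * val_fraction)) if paths else 0
    return paths[num_val:], paths[:num_val]


def write_file_list(paths, out_file):
    """Write one image path per line."""
    Path(out_file).write_text("".join(p + "\n" for p in paths))
```

A possible usage, with hypothetical file names: call `train, val = split_file_list("my_images")`, then `write_file_list(train, "train.flist")` and `write_file_list(val, "validation.flist")`, and point the corresponding entries in inpaint.yml at those files.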

chenzhaiyu commented 6 years ago

Hi, I am able to run train.py using only the CPU after removing the code lines you mentioned.

But when I run test.py, it raises the error "Error reading GPU information, set no GPU."

What other changes should I make to run test.py with your pretrained model, still using only the CPU?

Best Regards.

JiahuiYu commented 6 years ago

For testing, similar lines should be removed in test.py.
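
Independently of removing those lines, a general way to keep CUDA-based libraries off the GPU is to hide all devices through the `CUDA_VISIBLE_DEVICES` environment variable before the framework is imported. The sketch below shows only that general mechanism; the helper name `hide_gpus` is made up for illustration, and whether this alone is sufficient for train.py or test.py is not confirmed in this thread.

```python
import os


def hide_gpus():
    """Hide all CUDA devices from libraries that are imported afterwards."""
    # An empty value means no GPU is visible to the CUDA runtime.
    os.environ["CUDA_VISIBLE_DEVICES"] = ""


hide_gpus()
# Import TensorFlow only after the variable is set, for example:
# import tensorflow as tf
```

The order matters: the variable is read when the CUDA runtime is initialized, so setting it after the framework has already touched the GPU has no effect.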

jiyoungAn commented 6 years ago

thank you!!

KangSH9776 commented 6 years ago

thank you very much!!

I have additional questions.

The first question above mentions "1) downloading the place2 model you've built".

How and where can I download the place2 data?

jiyoungAn commented 6 years ago

You can download it in the "pretrained models" section here. Click the dataset name. :)

KangSH9776 commented 6 years ago

Thank you for your answers.

I deleted the line below from test.py and ran it:
ng.get_gpus(1)

The following error occurred:
2018-05-26 17:21:06.800877: E tensorflow/stream_executor/cuda/cuda_dnn.cc:455] could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2018-05-26 17:21:06.800926: E tensorflow/stream_executor/cuda/cuda_dnn.cc:427] could not destroy cudnn handle: CUDNN_STATUS_BAD_PARAM
2018-05-26 17:21:06.800939: F tensorflow/core/kernels/conv_ops.cc:713] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)
Aborted (core dumped)
What did I miss? ㅠㅠ

Can I ask one more question? I'd like to train on the place2 data myself rather than use the pretrained models. How do I download the place2 data for training? Thank you very much for your time.

jiyoungAn commented 6 years ago

1. I don't have much experience with TensorFlow, but I think writing ng.get_gpus(1) means you will use GPU #1, like os.environ['CUDA_VISIBLE_DEVICES']='1'. Maybe you can check your GPU number again.
2. In inpaint.yml, there is path information for each dataset.

JiahuiYu commented 6 years ago

If you only have one GPU, the index starts from 0: os.environ['CUDA_VISIBLE_DEVICES']='0'
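
A minimal sketch of guarding against an out-of-range device index is shown below. The helper `select_gpu` and its `num_gpus` argument are hypothetical; the number of installed GPUs has to be supplied by the caller (for example, read off `nvidia-smi`), since the sketch does not query the hardware.

```python
import os


def select_gpu(index, num_gpus):
    """Expose a single GPU to the process, checking the index first."""
    # CUDA device indices are zero-based, so valid values are 0..num_gpus-1.
    if not 0 <= index < num_gpus:
        raise ValueError(
            "GPU index %d is out of range for %d GPU(s)" % (index, num_gpus)
        )
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]
```

With one GPU installed, `select_gpu(0, 1)` sets the variable to `'0'`, while `select_gpu(1, 1)` raises a `ValueError` instead of silently pointing the process at a device that does not exist. As with any use of this variable, it should be set before TensorFlow is imported.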

KangSH9776 commented 6 years ago

Hello, I'm trying to train on the 'place2' dataset, but I ran into a problem, so I'm leaving a question.

I downloaded 'High-resolution images' under 'Data of Places365-Standard', as instructed.

The 'flist' file was downloaded from 'CelebA-HQ'.

I get a 'cuda has no input image' error when executing 'python train.py'.

I want to know how to prepare the data when training on 'place2'.

Thanks for reading

JiahuiYu commented 6 years ago

The paths of the files in the file list (flist) file may be wrong. Please check. Consider printing the path inside the data_from_fnames.py file.
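
The paths can also be checked outside the training code. The sketch below assumes the file list is a plain-text file with one image path per line; the function name `find_missing_paths` is illustrative and is not part of the repository.

```python
import os


def find_missing_paths(flist_file):
    """Return (line_number, path) pairs for listed files that do not exist."""
    missing = []
    with open(flist_file) as f:
        for line_number, line in enumerate(f, start=1):
            path = line.strip()
            # Skip blank lines; report every path that is not a regular file.
            if path and not os.path.isfile(path):
                missing.append((line_number, path))
    return missing
```

Running it on the file list named in inpaint.yml and printing the result shows which entries point at files that are not on disk, for example because the list was generated for a different dataset or directory layout.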

KangSH9776 commented 6 years ago

Where is the data_from_fnames.py file?

hexia11 commented 5 years ago

Please remove lines here. It is highly recommended to have GPUs to train the model, otherwise progress will be very slow.

Yes, your images with randomly sampled masks will be used to fine-tune the existing model. Make sure you have validation images to track the training progress, otherwise the model may easily overfit on your 800 images.

It still stops after showing this line:
[2019-04-18 12:32:44 @weights_viewer.py:60] Total size of trainable weights: 0G 10M 184K 136B (Assuming32-bit data type.)
I have already removed the lines you mentioned. How can I fix this?