groot-1313 opened this issue 6 years ago
I ran into problems when I expanded a dataset 10 times: I had to retrain the network completely! My added data was not of the same nature as my initial dataset.
Maybe your dataset is sorted in a particular order? I mean, if you train on A A A A A A A A A ... 10000 times and then on B B B B B B B B B ... 10000 times, it won't work as well as learning A B A B A B A B A B ...
Otherwise, on my own sets I always see the training loss drop within the first few steps (because we start from random weights, it is really easy for training to find something better).
You can try TensorBoard and what is called the embedding projector if you want to visualise your dataset.
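For instance, a minimal sketch of pushing dataset embeddings into the TensorBoard projector could look like this (assuming the TF 1.x contrib API used elsewhere in this repo; the `features` array, the `logs` directory and the variable names are placeholders, not code from the repo):

```python
import numpy as np
import tensorflow as tf
from tensorflow.contrib.tensorboard.plugins import projector

# Placeholder features: one row per image (e.g. downscaled pixels or any
# descriptor you compute for each frame of the video dataset).
features = np.random.rand(1000, 64).astype(np.float32)
embedding_var = tf.Variable(features, name="dataset_embedding")

# Tell the projector plugin which variable holds the embeddings.
config = projector.ProjectorConfig()
embedding = config.embeddings.add()
embedding.tensor_name = embedding_var.name

writer = tf.summary.FileWriter("logs")
projector.visualize_embeddings(writer, config)

# The projector reads the embedding values from a checkpoint in the log directory.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    tf.train.Saver([embedding_var]).save(sess, "logs/embedding.ckpt")
```

Then run `tensorboard --logdir logs` and open the Projector tab.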
My dataset is a video file, so there is a gradual change between scenes. There should be an option to shuffle the dataset while training, no?
For now it is sorted by name:
```python
# if the image names are numbers, sort by the value rather than asciibetically
# having sorted inputs means that the outputs are sorted in test mode
if all(get_name(path).isdigit() for path in input_paths):
    input_paths = sorted(input_paths, key=lambda path: int(get_name(path)))
else:
    input_paths = sorted(input_paths)
```
You can try random.shuffle instead!
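A minimal sketch of that change (replacing the sorting above with a shuffle; `input_paths` is the list from the snippet, and the seed value is just an example):

```python
import random

# Shuffle the file list instead of sorting it, so that consecutive
# video frames are not fed to the network in scene order.
random.seed(0)          # optional: makes the shuffle reproducible
random.shuffle(input_paths)
```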
I am confused, shuffling is already enabled here:
```python
with tf.name_scope("load_images"):
    path_queue = tf.train.string_input_producer(input_paths, shuffle=a.mode == "train")
    reader = tf.WholeFileReader()
    paths, contents = reader.read(path_queue)
    raw_input = decode(contents)
    raw_input = tf.image.convert_image_dtype(raw_input, dtype=tf.float32)
```
since `a.mode == "train"` during training.
I have no ideas left other than a bad distribution of the source data across your dimensions. Do more epochs change anything?
Yes, I am unsure why!
I would also like to know why, please.
The gradient-descent algorithm only corrects the generator's answer with a small step.
With a step that is too large, you may overshoot the good solution.
Within 1 epoch, the answer for each image is corrected once with a small step.
So it is better to learn with more epochs, step by step ;)
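In code terms, a toy sketch of that idea (plain NumPy gradient descent on a simple function, not the repo's training loop; `learning_rate` plays the role of the step size):

```python
import numpy as np

def sgd_step(w, grad, learning_rate=0.1):
    # Move the weights a small step against the gradient.
    # A learning rate that is too large can overshoot the minimum.
    return w - learning_rate * grad

# Toy objective: f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w = 0.0
for epoch in range(100):            # each pass applies one small correction
    grad = 2.0 * (w - 3.0)
    w = sgd_step(w, grad)
print(w)  # converges towards 3.0 only after many small steps
```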
I have completed 2 epochs of training on a dataset which contains 21000 images, but my training loss has not reduced at all.
A small snippet:
The first epoch had similar losses. I know I should train it for a few more epochs, but with each epoch the network is trained on all 21000 images, which I believe should have caused a decrease in the loss. Any input on how to proceed would be much appreciated!