tohiddar closed this issue 2 years ago
There appear to be a few places in the code that use a random function. One of them is the split_data_training_testing function in the data_prep.py module, which shuffles the image keys:
img_keys = list(img_to_cap_vector.keys())
random.shuffle(img_keys)
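If a reproducible split is needed, one option is to shuffle with a dedicated, seeded generator instead of the global `random` state. The following is a minimal sketch, not the repository's code: `shuffled_keys` and its `seed` parameter are hypothetical names, and the toy `img_to_cap_vector` stands in for the real dictionary.

```python
import random

def shuffled_keys(img_to_cap_vector, seed=None):
    """Return the image keys in shuffled order.

    With seed=None the order differs between runs; with a fixed
    integer seed the order is the same on every run.
    """
    # Sort first so the result does not depend on dict insertion order.
    img_keys = sorted(img_to_cap_vector.keys())
    # A private Random instance leaves the global random state untouched.
    random.Random(seed).shuffle(img_keys)
    return img_keys

# Toy stand-in for the real image-to-caption mapping (hypothetical data).
img_to_cap_vector = {f"img_{i}.jpg": [i] for i in range(10)}
first = shuffled_keys(img_to_cap_vector, seed=42)
second = shuffled_keys(img_to_cap_vector, seed=42)
print(first == second)  # True: same seed, same order
```

Passing the same seed for every trial of a parametric study keeps the train/test split fixed, while changing the seed still gives a fresh split when one is wanted.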
The code also picks a random image and compares its "Real Caption" to the predicted caption. This happens in the validation_set_captions function in the eval.py module:
rid = np.random.randint(0, len(img_name_val))
The code also uses a random function to compute the predicted_id values for the caption words:
predicted_id = tf.random.categorical(predictions, 1)[0][0].numpy()
tf.random.categorical draws a sample from the categorical distribution defined by the logits in predictions, so each predicted_id is a sampled class index. The printed values are integers; they most likely index the vocabulary (one integer per word) rather than grammatical categories such as noun or verb, although I have not confirmed this against the tokenizer.
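To make the sampling step concrete, here is a small pure-Python sketch of drawing an index from logits, next to a deterministic greedy choice. It only illustrates the idea and is not TensorFlow's implementation; `sample_categorical`, `greedy`, and the example `logits` are hypothetical. In TensorFlow itself, `tf.random.categorical` accepts an optional `seed` argument, and `tf.argmax` would be the greedy counterpart.

```python
import math
import random

def sample_categorical(logits, rng):
    """Draw one class index with probability softmax(logits)."""
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

def greedy(logits):
    """Deterministic alternative: always take the highest-scoring index."""
    return max(range(len(logits)), key=lambda i: logits[i])

# Hypothetical scores for a four-word vocabulary.
logits = [2.0, 1.0, 0.1, -1.0]

rng_a = random.Random(0)
rng_b = random.Random(0)
draws_a = [sample_categorical(logits, rng_a) for _ in range(5)]
draws_b = [sample_categorical(logits, rng_b) for _ in range(5)]
print(draws_a == draws_b)  # True: identical seeds give identical draws
print(greedy(logits))      # 0: index of the largest logit
```

Sampling gives more varied captions, while the greedy choice gives the same caption for the same image and weights, which may be easier to compare across trials.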
Note: The checkpoint files located in checkpoints/train/ should be removed before each trial so that a checkpointed solution does not affect the next trial.
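Clearing the checkpoints can be automated at the start of each trial. The sketch below assumes the directory can simply be deleted and recreated; `clear_checkpoints` is a hypothetical helper, and the demonstration runs in a temporary directory so that no real checkpoints are touched.

```python
import shutil
import tempfile
from pathlib import Path

def clear_checkpoints(checkpoint_dir):
    """Delete checkpoint_dir and its contents, then recreate it empty."""
    path = Path(checkpoint_dir)
    # ignore_errors=True makes this a no-op when the directory is missing.
    shutil.rmtree(path, ignore_errors=True)
    path.mkdir(parents=True, exist_ok=True)
    return path

# Demonstration in a temporary directory so nothing real is deleted.
with tempfile.TemporaryDirectory() as tmp:
    ckpt_dir = Path(tmp) / "checkpoints" / "train"
    ckpt_dir.mkdir(parents=True)
    (ckpt_dir / "ckpt-1.index").write_text("stale")  # fake checkpoint file
    clear_checkpoints(ckpt_dir)
    remaining = list(ckpt_dir.iterdir())
    print(remaining)  # []: the stale file is gone
```

In the project this would presumably be called as `clear_checkpoints("checkpoints/train/")` before training starts.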
Summary: the first two sources of randomness above relate to and influence the training of the model, so my expectation was that eliminating them would eliminate randomness from the training. However, even with both deactivated, training the model produced a slightly different loss each time. The Adam optimizer itself is deterministic for a given set of gradients and initial weights, so the remaining variation more likely comes from sources that were not controlled here, such as unseeded weight initialization, shuffling in the input pipeline, or nondeterministic GPU operations. Tracking down every such source is not very useful for our purposes. Instead, the focus should be on creating metrics to help us evaluate the quality of the captions produced.
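For anyone who does want to reduce run-to-run variation, a common first step is to seed every generator in one place. This is a hedged sketch rather than a guarantee of bit-identical training: `seed_everything` is a hypothetical helper, NumPy and TensorFlow are seeded only if they can be imported, and GPU kernels may still be nondeterministic (recent TensorFlow versions also offer `tf.config.experimental.enable_op_determinism()` for that case).

```python
import random

def seed_everything(seed):
    """Seed every random number generator that is available.

    Returns the names of the libraries that were seeded.
    """
    seeded = ["random"]
    random.seed(seed)  # covers random.shuffle
    try:
        import numpy as np
        np.random.seed(seed)  # covers np.random.randint
        seeded.append("numpy")
    except ImportError:
        pass
    try:
        import tensorflow as tf
        tf.random.set_seed(seed)  # sets the global TensorFlow seed
        seeded.append("tensorflow")
    except ImportError:
        pass
    return seeded

seed_everything(123)
run_1 = [random.random() for _ in range(3)]
seed_everything(123)
run_2 = [random.random() for _ in range(3)]
print(run_1 == run_2)  # True: the same seed reproduces the same numbers
```

Calling this once at the top of the training script, before the model is built, would cover the shuffle and the random image pick described above.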
Closing this issue. Similar work will be tracked in other ITS tickets where we will try to define and test the evaluation metrics.
Since the code shuffles the image database each time it learns, it makes it hard to do a proper parametric study. Find out how to prevent the code from shuffling images when it picks them from the database.