rohanb2018 opened 2 years ago
Hey,
I just noticed that I haven't pushed the latest changes to this repo. The notebooks are missing a few significant changes.
Just to add a few points:
Thanks so much for the pointers! I will remove the fine-tuning, change the global model final layer, and double-check my train/val split and try to re-train the global model. Will let you know if I have any success with fully training the global model, or if I have any further questions.
Actually, I didn't fully understand the global model final layer change. After the Dense(1,input_shape=(100,100,64)) layer you suggested, wouldn't you still need a Dense(10000) at the end, because the final output of the network has to have 10000 units (the output is a distribution over all of the possible pen locations in the 100x100 grid)?
Just to show the current result from training the model, the training early stops after only 3 epochs (I guess because the validation loss is increasing). Here I reproduced the plot from the notebook (red = train loss, yellow = validation loss, green = train accuracy, blue = val accuracy). Anyway, I'm going to keep playing around with it (maybe increase the patience for the early-stopping).
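For reference, the early-stopping behavior described above (halt once validation loss has not improved for `patience` consecutive epochs, which is what Keras's EarlyStopping callback does) can be sketched in plain Python. The patience value and loss numbers below are illustrative, not the notebook's actual settings:

```python
def early_stop_epoch(val_losses, patience=3):
    """Return the epoch index at which training would stop: the first
    epoch where validation loss has not improved for `patience` epochs."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, wait = loss, 0  # improvement: reset the counter
        else:
            wait += 1
            if wait >= patience:
                return epoch      # patience exhausted: stop here
    return len(val_losses) - 1    # never triggered: train to the end

# With validation loss rising after epoch 1, training halts once
# `patience` consecutive non-improving epochs have passed.
print(early_stop_epoch([1.0, 0.8, 0.9, 1.1, 1.2], patience=3))  # -> 4
```

Increasing `patience` simply lets the counter run longer before stopping, which is why a larger value gives the model more epochs to recover from a temporary bump in validation loss.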
Actually, I didn't fully understand the global model final layer change. After the Dense(1,input_shape=(100,100,64)) layer you suggested, wouldn't you still need a Dense(10000) at the end, because the final output of the network has to have 10000 units (the output is a distribution over all of the possible pen locations in the 100x100 grid)?
The Dense(1,input_shape=(100,100,64)) layer itself will result in 10000 outputs: Keras applies Dense(1) independently at each of the 100x100 spatial positions, producing a (100, 100, 1) tensor, i.e. 10000 values once flattened. No extra Dense(10000) is needed.
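This behavior can be verified without Keras: Dense(1) on a (100, 100, 64) input applies the same 64-to-1 projection at every spatial position. A minimal NumPy sketch (random weights, purely illustrative of the shapes involved):

```python
import numpy as np

# Simulate Dense(1) acting on a (100, 100, 64) feature map: the same
# 64 -> 1 kernel is applied at each of the 100 x 100 positions, so the
# result carries 100 * 100 * 1 = 10000 values per sample.
features = np.random.rand(1, 100, 100, 64)  # one sample
kernel = np.random.rand(64, 1)              # Dense(1) weights
bias = np.zeros(1)                          # Dense(1) bias

out = features @ kernel + bias              # shape (1, 100, 100, 1)
flat = out.reshape(out.shape[0], -1)        # shape (1, 10000)
print(flat.shape)                           # -> (1, 10000)
```

So the single-unit dense layer replaces the Dense(10000) head while using only 64 + 1 = 65 parameters instead of a full 10000-unit projection.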
Just to show the current result from training the model, the training early stops after only 3 epochs (I guess because the validation loss is increasing). Here I reproduced the plot from the notebook (red = train loss, yellow = validation loss, green = train accuracy, blue = val accuracy). Anyway, I'm going to keep playing around with it (maybe increase the patience for the early-stopping).
Can you specify the values of the accuracies and losses?
Unfortunately I don't have the exact accuracies and losses for that particular run anymore, but I think the val accuracy was around 0.0015.
I was actually able to get some considerably improved stats (train loss = 0.9388, train accuracy = 0.9348, val loss = 3.5167, val accuracy = 0.6756) after making a couple of changes, including increasing the number of train/validation samples (to 500000/35000), slightly increasing the early stopping patience to 5, and adding L2 regularization in the final two dense layers (using the original architecture, didn't have a chance to incorporate the Dense fix you suggested).
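The L2 regularization mentioned above adds a penalty proportional to the sum of squared weights to the training loss, discouraging large weights in the final dense layers. A minimal sketch of the penalty term itself; the coefficient 1e-4 is an assumption, not necessarily what the run above used (in Keras this would be passed as `kernel_regularizer=regularizers.l2(...)` on each layer):

```python
import numpy as np

def l2_penalty(weight_matrices, lam=1e-4):
    """L2 regularization term added to the training loss:
    lam * (sum of squared entries over the regularized layers)."""
    return lam * sum(np.sum(w ** 2) for w in weight_matrices)

# Example: penalize the kernels of the final two dense layers.
w1 = np.ones((64, 32))  # 2048 entries, each 1.0
w2 = np.ones((32, 10))  # 320 entries, each 1.0
print(l2_penalty([w1, w2], lam=1e-4))  # approximately 0.2368
```

Gradient descent then minimizes `task_loss + l2_penalty`, which is one plausible reason the train/val gap reported above shrank.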
Updated plot:
I am surprised that you were able to train without modifying the final dense layer, because that is a lot of parameters to train; you sure do have access to some serious hardware 😄 Also, to add: you can still get a lot more training samples from the other 90% of the files.
Hello, I had two general questions about training the global model (using global_model.ipynb).

First, I noticed that the fine-tuning section of the notebook calls the inp_data_generator method, which doesn't seem to be defined in global_model.ipynb. In my code, I ended up switching to the DataGenerator that is actually defined in the notebook. Was there a particular reason for calling inp_data_generator in global_model.ipynb?

Second, I was curious what training hyperparameters were most useful for getting the global model to train successfully. I noticed in global_model.ipynb that the initial training phase runs for 2 epochs, followed by a fine-tuning phase that runs for 5 epochs. However, with these settings my final global model accuracy was only slightly above 0. Specifically, I was curious about the number of epochs as well as the training/validation set sizes that were most useful for successful training.

Happy to provide more details about my model performance if that helps. Thanks!
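Since inp_data_generator isn't defined in the notebook, for anyone else hitting the same gap: a batch generator in the Keras Sequence style looks roughly like the sketch below. This is hypothetical and assumes the data already sits in memory as arrays; the notebook's actual DataGenerator may instead load samples lazily from files.

```python
import numpy as np

class SimpleDataGenerator:
    """Hypothetical minimal batch generator (Keras Sequence style).
    The notebook's real DataGenerator may load data lazily instead."""

    def __init__(self, x, y, batch_size=32):
        self.x, self.y, self.batch_size = x, y, batch_size

    def __len__(self):
        # Number of batches per epoch (last batch may be partial).
        return int(np.ceil(len(self.x) / self.batch_size))

    def __getitem__(self, idx):
        # Return the idx-th batch of (inputs, targets).
        s = slice(idx * self.batch_size, (idx + 1) * self.batch_size)
        return self.x[s], self.y[s]

gen = SimpleDataGenerator(np.zeros((100, 8)), np.zeros(100), batch_size=32)
print(len(gen))         # -> 4 batches (32 + 32 + 32 + 4)
print(gen[3][0].shape)  # -> (4, 8), the final partial batch
```

Swapping something like this in for the undefined inp_data_generator call should make the fine-tuning cells runnable, which matches the switch to the notebook's own DataGenerator described above.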