marco-c opened this issue 5 years ago
Class balance: Y: 53%, D+N: 47%
Network - vgg16
Pretrained - none
Image size: (48,48)
Optimizer - sgd
Epochs - 50
Test Set Accuracy - 84.54%
N+D Prediction Precision: 83.33%
Confusion matrix: [[215 50] [43 300]]
122658933ee9_19_38_2018_12_17.txt
Network - vgg16
Pretrained - none
Image size: (48,48)
Optimizer - rmsprop
Epochs - 50
Test Set Accuracy - 82.57%
N+D Prediction Precision: 84.78%
Confusion matrix: [[195 70] [35 308]]
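For reference, the reported accuracy and N+D precision can be derived directly from the confusion matrix. A minimal numpy sketch, assuming rows are true classes and columns are predicted classes with N+D listed first (that ordering is an assumption; the numbers roughly match the figures above):

```python
import numpy as np

# Assumed convention: rows = true class, columns = predicted class,
# with N+D as the first label and Y as the second.
cm = np.array([[195, 70],
               [35, 308]])

accuracy = np.trace(cm) / cm.sum()        # correctly classified / all samples
nd_precision = cm[0, 0] / cm[:, 0].sum()  # true N+D predicted as N+D / all predicted N+D

print(f"Accuracy: {accuracy:.2%}, N+D precision: {nd_precision:.2%}")
```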
What exactly did you do in the pretrain.py case?
Also, could you try using rmsprop instead of sgd?
> Also, could you try using rmsprop instead of sgd?
Actually, both, to compare exactly their results.
> What exactly did you do in the pretrain.py case?
I set target_size=(48,48) in utils.py/load_image, then ran my notebook with !python3 pretrain.py -n=vgg16 -o=sgd
> Actually, both, to compare exactly their results.
Will do.
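For context, the target_size change simply resizes every screenshot at load time. A rough sketch of what such a load_image helper might look like, assuming it wraps Keras' image utilities (this is not necessarily the actual implementation in utils.py):

```python
from keras.preprocessing.image import load_img, img_to_array

def load_image(path, target_size=(48, 48)):
    # Resize the screenshot to target_size while loading, then convert to a float array.
    # (48, 48) mirrors the experiments above; the real utils.py may use a different default.
    img = load_img(path, target_size=target_size)
    return img_to_array(img)
```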
> I set target_size=(48,48) in utils.py/load_image, then ran my notebook with !python3 pretrain.py -n=vgg16 -o=sgd
OK. So, pretrain.py is meant to only pretrain the network. In theory, we should first pretrain with pretrain.py, then use the pretrained model to run train.py.
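In other words, the intended workflow has two steps: pretrain.py produces weights, and train.py starts from them instead of from scratch. A hedged sketch of the weight hand-off in Keras (the architecture, layer names, and the weights file name here are illustrative assumptions, not the repository's actual code):

```python
from keras.applications import VGG16
from keras.layers import Dense, Flatten
from keras.models import Model

# Rebuild the same backbone used during pretraining (hypothetical head and layer names).
base = VGG16(weights=None, include_top=False, input_shape=(224, 224, 3))
x = Flatten()(base.output)
x = Dense(256, activation='relu', name='fc1')(x)
out = Dense(1, activation='sigmoid', name='prediction')(x)
model = Model(base.input, out)

# Load weights saved by the pretraining step; by_name=True lets layers with matching
# names pick up pretrained weights even if the classification heads differ.
model.load_weights('pretrain_vgg16.h5', by_name=True)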
Ah, I was thinking that this might be the case. Now it makes sense why there was no .txt file, etc. Do you have more info on what is going on in pretrain.py, in particular the "slightly different problem (for which we know the solution)"?
This explains it all: https://github.com/marco-c/autowebcompat/blob/master/pretrain.py#L58
The goal of the classifier in the pretrain case is to detect when two screenshots belong to the same website.
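So the pretraining task is pair classification: given two screenshots, predict whether they come from the same website. A rough sketch of how such pairs could be labeled (not the repository's actual code, which lives at the link above):

```python
import itertools
import random

def make_pairs(screenshots_by_site):
    """Build (image_a, image_b, label) pairs: label 1 if both screenshots
    belong to the same website, 0 otherwise. Assumes at least two websites."""
    pairs = []
    sites = list(screenshots_by_site)
    for site, shots in screenshots_by_site.items():
        # Positive pairs: two screenshots of the same website.
        for a, b in itertools.combinations(shots, 2):
            pairs.append((a, b, 1))
            # Negative pair: match one of them with a screenshot from another website.
            other_site = random.choice([s for s in sites if s != site])
            pairs.append((a, random.choice(screenshots_by_site[other_site]), 0))
    return pairs
```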
> This explains it all: https://github.com/marco-c/autowebcompat/blob/master/pretrain.py#L58
Indeed it does, thanks!
Class balance: Y: 53%, D+N: 47%
Network - vgg16
Pretrained - imagenet
Image size: (224,224)
Optimizer - sgd
Epochs - 50
Test Set Accuracy - 84.9%
N+D Prediction Precision: 85.5%
Confusion matrix: [[254 47] [43 264]]
tensorflow-1-vm_21_49_2018_12_20.txt
Network - vgg16
Pretrained - imagenet
Image size: (224,224)
Optimizer - rmsprop
Epochs - 50
Test Set Accuracy - 80.9%
N+D Prediction Precision: 81.6%
Confusion matrix: [[222 73] [50 263]]
@marco-c I also performed hyper-parameter optimization with random search and Hyperband. The best configuration was found via random search:
Network - vgg16
Pretrained - no
Test Set Accuracy - 80.1%
N+D Prediction Precision: 88.1%
Image size: (48,48)
Optimizer: Adam
Learning rate: 1.7e-5
Decay: 1e-6
Momentum: 0.74
Epsilon: 8.4e-8
fc1 L2 regularization strength: 2.32e-2
fc2 L2 regularization strength: 5.19e-3
fc1 dropout: 3.37e-7
fc2 dropout: 1.71e-7
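For reproducibility, the best configuration above maps fairly directly onto Keras. A hedged sketch (the layer widths, the fc1/fc2 names, and mapping "momentum" to Adam's beta_1, since plain Adam has no momentum parameter, are all assumptions):

```python
from keras import regularizers
from keras.layers import Dense, Dropout
from keras.optimizers import Adam

# Optimizer from the best random-search configuration (Keras 2.x argument names);
# treating the reported momentum as beta_1 is an assumption.
optimizer = Adam(lr=1.7e-5, beta_1=0.74, epsilon=8.4e-8, decay=1e-6)

# Fully connected head with the found L2 strengths and (near-zero) dropout rates.
fc1 = Dense(256, activation='relu', name='fc1',
            kernel_regularizer=regularizers.l2(2.32e-2))
drop1 = Dropout(3.37e-7)
fc2 = Dense(256, activation='relu', name='fc2',
            kernel_regularizer=regularizers.l2(5.19e-3))
drop2 = Dropout(1.71e-7)
```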
Not bad! We should improve the labeling to get better and more precise results.
We need to find some good options for the classifier to reach an acceptable baseline accuracy. We can start with the RMSProp optimizer.
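As a starting point, switching the optimizer is a one-line change at compile time. A minimal sketch, assuming `model` is the Keras classifier built in train.py and the learning rate is just a typical default, not a tuned value:

```python
from keras.optimizers import RMSprop

# Baseline: compile with RMSprop instead of SGD.
model.compile(optimizer=RMSprop(lr=1e-4),
              loss='binary_crossentropy',
              metrics=['accuracy'])
```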